Commit Graph

78 Commits

Author SHA1 Message Date
H. Peter Anvin
dcffe4b9f6 Add extension bytecodes to support operands 4+
The bytecode format assumes max 4 operands pretty strictly, but we
already have one instruction with 5 operands, and it's likely to get
more.  Support them via extension prefixes (similar to REX prefixes).
For bytecodes which use argument bytes we encode the number directly,
however.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-23 23:03:59 -07:00
H. Peter Anvin
fa3833db81 disasm: collapse all the segment register push/pop bytecodes
As far as the disassembler is concerned, the segment register push/pop
bytecodes can be collapsed to a simple expression; the remaining
differences are handled by the filter expressions in insns.pl.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-09 14:15:36 -07:00
H. Peter Anvin
ff6e12da50 Reshuffle and move the bytecodes for segment register push/pop
Reshuffle the bytecodes for segment register push/pop to make more
sense, and move them from \4 to \344, thus freeing up the single-digit
bytecodes \4..\7 for future use.  It doesn't really make sense to use
single-digit bytecodes for this very oddball use.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-08 21:17:32 -07:00
H. Peter Anvin
588df78b0d New opcode for 32->64 bit sign-extended immediate with warning
Add a new opcode for 32->64 bit sign-extended immediate, with warning
on the number not matching.

This unfortunately calls for an audit of all the \4[0123] opcodes, if
they should be replaced by \25[4567].  This only replaces one
instruction (MOV reg64,imm32); other instructions need to be
considered.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-07 10:05:10 -07:00
H. Peter Anvin
c1377e9a98 New opcodes to deal with 8-bit immediate sign extended to opsize
New opcodes to deal with 8-bit immediates which are then sign-extended
to the operand size.  These allow us to warn appropriately.
Not sure I'm using these in all the proper places; need audit of all
uses of the \14..\17 opcodes.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2008-10-06 23:40:31 -07:00
H. Peter Anvin
962e30519c BR 2029829: Accept VIA XCRYPT instructions with or without REP
Accept the VIA XCRYPT instructions either with or without a REP
prefix, as documented.

Add the missing XCRYPTCTR instruction.
2008-08-28 17:47:16 -07:00
H. Peter Anvin
83b2e4f82c BR 2062342: ndisasm: r12 *can* be an index register
R12 can be used as an index register.  Special encodings in the modr/m
byte are done *without* consideration for the REX prefix, but special
encodings in the SIB byte *do* take the REX prefix into consideration,
since it doesn't affect the overall instruction format.
2008-08-20 09:42:47 -07:00
H. Peter Anvin
f7d863b7d1 BR 2028910: fix decoding of VEX prefixes in 16- and 32-bit mode
We would incorrectly set a bunch of VEX-related state for C4 and C5
bytes, even though we had already rejected it as not a VEX prefix due
to the top two bits of the following byte not being 11.
2008-07-30 17:30:12 -07:00
H. Peter Anvin
9435283319 ndisasm: the high bit of is4 bytes is ignored in 32-bit mode
Mask the high bit of is4 bytes in 32-bit mode.  Provide a generic
"regmask" variable that we can use for equivalent purposes as needed.
2008-05-26 12:03:55 -07:00
H. Peter Anvin
bd420c7095 Add tokens vex.ww and vex.wx; vex.wx is the default
Add vex.ww (for VEX.W follows REX.W) and vex.wx (for VEX.W is a don't
care); vex.wx is the default since that seems to match existing usage
better.
2008-05-22 11:24:35 -07:00
H. Peter Anvin
a69ce1d19d insnsn.c: cleaner to *not* separate out conditional instructions
The disassembler code gets cleaner if we do *not* separate out the
conditional instructions; instead, rely on the fact that the
conditionals are always at the end and use FIRST_COND_OPCODE as a
barrier.
2008-05-21 15:09:31 -07:00
H. Peter Anvin
2fb033af18 Disassembler: select table based on VEX prefixes
We can use the new VEX prefixes to select into a large table of new
opcode spaces.  Since the table is (currently) sparse, add logic so we
don't end up producing tons of empty tables for no good reason.

This is also necessary since VEX is likely to reuse opcode bytes that
would appear as prefixes at some point, which would cause conflicts
with the regular tables.
2008-05-21 11:05:39 -07:00
H. Peter Anvin
39d6ac6f79 Fix display for fixed xmm0/ymm0, SSE redundant prefixes
All singleton registers need to be displayable from register flags
alone!

When using the new 0360..0363 codes, make sure we appropriate avoid
displaying the legacy use of the prefixes.
2008-05-21 10:33:19 -07:00
H. Peter Anvin
6b3b7bcd33 VEX prefixes apply to VEX instructions only... 2008-05-20 23:36:36 -07:00
H. Peter Anvin
52dc353868 Handle is4 bytes without meaningful information in the bottom bits
Support is4 bytes without meaningful information in the bottom bits.
This is equivalent to /is4=0 for the assembler, but makes the bottom
bits don't care for the disassembler.
2008-05-20 19:29:04 -07:00
H. Peter Anvin
0ab96a17d5 ndisasm: simple compare for conditional opcodes, no loop
We had a completely unnecessary loop to test for conditional opcodes.
Since we always put the conditional opcodes at the end, we might as
well just remember where that list starts and compare against it.
2008-05-20 17:07:57 -07:00
H. Peter Anvin
a4835d466c Avoid #including .c files; instead compile as separate units
Don't #include .c files, even if they are auto-generated; instead
compile them as separate compilation units and let the linker do its
job.
2008-05-20 14:21:29 -07:00
H. Peter Anvin
dfb918047b Add DY, YWORD, and the SY instruction flag
Add the DY instruction, YWORD keyword, and an SY marker for
instruction sizes.  Add a few more AVX sample instructions.
2008-05-20 11:43:53 -07:00
H. Peter Anvin
fff5a47e65 Same some space by introducing shorthand byte codes for SSE prefixes
Properly done, all SSE instructions which has the 66/F2/F3 opcode
multiplex need two prefixes: one to control the use of OSP and one to
control the use of REP.  However, it's a four-way select: np/66/F2/F3;
so introduce shorthand bytecodes for that purpose.
2008-05-20 09:46:24 -07:00
H. Peter Anvin
aaa088fbf3 Remove special hacks to avoid zero bytecodes
We can now have zero bytecodes with impunity, so remove any special
hacks we had to avoid zeroes in the bytecode.
2008-05-12 11:13:41 -07:00
H. Peter Anvin
d58656f797 Add support for register-number immediates with fixed 4-bit values
Add support for imm8 bytes which has a register value in the top four
bits and an arbitrary fixed value in the bottom four bits.
2008-05-06 20:11:14 -07:00
H. Peter Anvin
7334e3ac23 Initial NDISASM support for AVX instructions/VEX prefixes
Initial NDISASM support for AVX instructions and VEX prefixes.  It
doesn't mean it's correct, but it seems to match my current
understanding.  It can disassemble *some*, but not *all*, of the AVX
test cases (which are known to be at least partially incorrect...)
2008-05-05 18:47:27 -07:00
H. Peter Anvin
d85d250fa2 First cut at AVX machinery.
First cut at AVX machinery support.  The only instruction implemented
is VPERMIL2PS, and it's probably buggy.  I'm checking this in with the
hope that other people can start helping out with (a) testing this,
and (b) adding instructions.

NDISASM support is not there yet.
2008-05-04 17:53:31 -07:00
H. Peter Anvin
08367e2231 disasm: relative operands are signed, not unsigned
Relative operands are signed, not unsigned; record them as such and
then apply proper truncation after offset addition.
2008-01-02 12:20:38 -08:00
Beroset
095e6a2973 regularized spelling of license to match name of LICENSE file 2007-12-29 09:44:23 -05:00
H. Peter Anvin
9e9a24253a disasm: 32-bit index registers were displayed as 64 bits
Fix bug where 32-bit index registers got incorrectly displayed as 64
bits:

00000000  678B040B          mov eax,[ebx+rcx]
00000004  678B044B          mov eax,[ebx+rcx*2]
00000008  678B045B          mov eax,[ebx+rbx*2]
2007-12-26 19:10:20 -08:00
H. Peter Anvin
a30cc07224 BR 1834292: Fix multiple disassembler bugs
- Correct the building on the disassembler decision tree.
- Handle SSE instructions with F2 prefix (\332) correctly.
- Mark instructions which are now used as prefixes with ND.
  (In a future version when we have better CPU version handling,
  we should probably build the decision tree at runtime based on
  the selected CPU feature sets.)
- Sanitize the handling of \144-147 and \154-157 in both the assembler
  and disassembler.  They take an opcode byte as argument; don't
  pretend they don't.
2007-11-18 21:55:26 -08:00
H. Peter Anvin
d1fb15c154 Address data is int64_t; simplify writing an address object
Address data is always int64_t even if the size itself is smaller;
this was broken on bigendian hosts (still need testing!)

Create simple "write sized object" macros.
2007-11-13 09:37:59 -08:00
H. Peter Anvin
a5fb90834a ndisasm: factor out the common operand-extraction code
Factor out the common operand-extraction code in the disassembler, as
previously done in the assembler.
2007-11-12 23:00:31 -08:00
H. Peter Anvin
bb72f7f111 Un-special-case "xchg rax,rax"; disassemble o64
Un-special-case "xchg rax,rax"; allow it to be encoded as 48 90 for
orthogonality's sake.  It's a no-op, to be sure, but so are many other
instructions.

"xchg eax,eax" is still special-cased in 64-bit mode since it is not a
no-op; unadorned opcode 90 is now simply "nop" and nothing else.

Make the disassembler detect unused REX.W and display them as an "o64"
prefix.
2007-11-12 22:56:07 -08:00
H. Peter Anvin
e8cdcdcc37 Better (but not *good!*) handling of 64-bit addressing in ndisasm
More correct -- but not fully correct -- handing of 64-bit addressing
in ndisasm.  In particular, we need to generate "a32" versus "dword"
where applicable.
2007-11-12 21:57:00 -08:00
H. Peter Anvin
2344010d26 Fix disassembly of XCHG
"REX.B 90" in 64-bit mode is "xchg eax,r8d" not "nop"; equivalent
situation for "REX.WB 90" (xchg rax,r8).
2007-11-12 21:02:33 -08:00
H. Peter Anvin
de4b89bb3e 64-bit addressing and prefix handling changes
Revamp the address- and prefix-handling code to make more sense in
64-bit mode.  We are now a lot closer to where we want to be, but
we're not quite there yet.

ndisasm may very well have problems, or give counterintuitive output.
However, checking it in so we can make forward progress.
2007-10-28 22:04:00 -07:00
Charles Crayne
f6816b25bf Fix bugs item #1817677 2007-10-23 19:28:39 -07:00
H. Peter Anvin
7065309739 Formatting: kill off "stealth whitespace"
"Stealth whitespace" makes it harder to read diffs, and just generally
cause unwanted weirdness.  Do a source-wide pass to get rid of it.
2007-10-19 14:42:29 -07:00
Charles Crayne
46b31b0f08 Suppress signedness warnings in disassembler 2007-10-18 21:17:20 -07:00
H. Peter Anvin
6867acc18e Use the compiler-provided booleans if available, otherwise emulate
Both C and C++ have "bool", "true" and "false" in lower case; C
requires <stdbool.h> for this, in C++ it is an inherent type built
into the compiler.  Use those instead of the old macros; emulate with
a simple typedef enum if unavailable.
2007-10-10 14:58:45 -07:00
H. Peter Anvin
fe501957c0 Portability fixes
Concentrate compiler dependencies to compiler.h; make sure compiler.h
is included first in every .c file (since some prototypes may depend
on the presence of feature request macros.)

Actually use the conditional inclusion of various functions (totally
broken in previous releases.)
2007-10-02 21:53:51 -07:00
H. Peter Anvin
c5b9ce0a84 Auto-generate 0x67 prefixes without the need for \30x codes
Auto-generate 0x67 prefixes without the need for \30x codes; the
prefix is automatically added when there is a memory operand with
address size differing from the current address size (and impossible
combinations checked for.)
2007-09-22 21:49:51 -07:00
H. Peter Anvin
19e2010536 Speed up the disassembler by allowing prefixed instruction tables
Modify the disassembler so that we can have separate instruction
tables for prefixed instructions.  As it was, all instructions which
started with 0F were linearly searched, and that is by now more than
half the instruction set.
2007-09-18 15:08:20 -07:00
H. Peter Anvin
7786c364b4 Disassembler support for SSE5 instructions
Support for the SSE5 instruction format in the disassembler.

Also adds some comments to insnsd.c for easier debugging.
2007-09-17 18:45:44 -07:00
H. Peter Anvin
7eb4a38793 Initial support for four arguments per instruction
For SSE5, we will need to support four arguments per instruction.
2007-09-17 15:49:30 -07:00
H. Peter Anvin
cb9b690ae6 Add (untested!) SSSE3, SSE4.1, SSE4.2 instructions
Add the SSSE3, SSE4.1 and SSE4.2 instruction sets.  Change \332 to be
a literal 0xF2 prefix, by analog with \333 for 0xF3 prefix (the
previous \332 flag changed to \335).  This is necessary to get the REX
prefix in the right place for instructions that use it.

We are going to have to go in and change existing instruction patterns
which use these, as well.
2007-09-12 21:58:51 -07:00
H. Peter Anvin
0da6b580eb Support r/m operands for non-integer types
Support r/m operands for non-integer operands types, i.e. mmx or xmm
operands.  This allows mmx and xmm operands to be written more
compactly, speeding up the assembler.
2007-09-12 21:04:39 -07:00
H. Peter Anvin
62cb606f68 Handle instructions which can have both REX.W and OSP 2007-09-11 22:44:03 +00:00
H. Peter Anvin
2ba7ed7122 ndisasm: handle \366 codes, prefer unprefixed instructions
- Implement \366 codes in ndisasm
- Prefer instruction patterns without loose prefixes if possible
- Fix improper initialization of operands in ndisasm
2007-09-11 22:13:17 +00:00
H. Peter Anvin
bfb888c015 Quiet gcc warning about uninitialized variables 2007-09-11 04:26:44 +00:00
H. Peter Anvin
3360d79369 Make the big instruction arrays "const"
Make the big instruction arrays "const", so they end up in readonly
storage.  While we're at it, move their prototypes into insns.h.
2007-09-11 04:16:57 +00:00
H. Peter Anvin
99c4ecd18f Implement REL/ABS modifiers
Implement "REL" and "ABS" modifiers for offsets in 64-bit mode.  This
replaces "rip+XXX" type addressing.  The infrastructure to set the default
mode is there, but there is nothing to throw the switch just yet.
2007-08-28 23:06:00 +00:00
H. Peter Anvin
ce2b397f1e Fix the handling of the \313 code.
\313 indicates a fixed 64-bit address size.  It was incorrectly
documented and incorrectly implemented in the assembler, and was
unimplemented in the disassembler.
2007-05-30 22:21:11 +00:00