According to XED and experimentation, the 66 is ignored.
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
It was there to support the SSE5 DREX encoding,
which as far as I know is dead forever.
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
This patch is getting rid of the following bytecodes
'pushseg','popseg','pushseg2','popseg2' and simplifies
overall code.
[gorcunov@: a few style fixes]
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Thus if someone need to rework this code he won't need
to jump between files trying to figure out where enum
and opcodes lay.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
It doesn't seem worth >200 lines of C and Perl to save ~50 lines in insns.dat.
In order to make this work I had to rename sbyte16/sbyte32 so that
they can take an ordinary size suffix (their size suffix was formerly
treated specially).
This fixes one disassembly bug: 48C7C000000080 disassembles to mov
rax,0x80000000, which reassembles to B800000080, which loads a
different value.
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
This adds "np" to a bunch of SSE-style instructions that should have
it, "norep" (which was implemented but unused) on quasi-SSE instructions
that use F2 and F3 as instruction extensions but 66 for operand size,
"nof3" (newly implemented) on a few instructions, "norexw" on some
instructions that have only 32-bit and 64-bit versions, and one NOLONG.
It also removes some incorrect "np"s, changes some "f3"s to "f3i"s,
and fixes the decoding of the XCHG/NOP/PAUSE mess: F390 is always
PAUSE even when rex.b=1 (at least according to XED).
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Soon we will need to encode 512 bits values
thus there is no space left in our opflags_t
which is 32 bitfield.
Extend it to 64 bits width.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Since we are back to three bytecodes, move them back to the \271-\273
slot to free up the \264 complete quad.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The way our matching system works we have to make NOHLE an instruction
flag rather than an byte code; by the time we run the byte code
interpreter we have already picked an instruction pattern once and for
all.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Get rid of the last vestiges of the explicit byte codes in insns.dat.
The only files that now depend on actual byte code numbers are
insns.pl, assemble.c and disasm.c.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add an LOCK flag to the instruction template, and make the presence of
a LOCK prefix trigger a warning if it is not set. Simplify the LOCK
and HLE logic by hard-coding the knowledge that operand 0 has to be
memory.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The a2/a3 mem_offs MOV opcodes are invalid with XRELEASE; those
instructions instead have to use a modrm form. Therefore give a way
to annotate those instruction patters so the pattern matcher will move
on to the next pattern, rather than selecting them and then issue a
warning.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This implements the mechanism for XACQUIRE/XRELEASE. It does not
include the necessary annotations in insns.dat.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
insn->prefixes might contain not only values from
'enum prefixes' but from 'enum reg_enum' as well so
make it generic 'int' instead.
This calms down the compiler about enum's mess and
eliminates a wrong assumption that we always have
values by particular type in this field.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
- GEN_SIB and GEN_MODRM helpers added
- a number of tabs vs space fixs
- more use of is_class() helper
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
When we don't have an immediate for the i-field in /is4, then use a
normal quad-bytecode encoding for it to save some small amount of
space and re-use existing machinery.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The DREX encoding never hit production silicon, and has been replaced
by VEX/XOP encoding, so remove support for it.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
It is more logical, it cleans up the code and it makes implicit
operand size override prefixes come out in the same order as explicit
ones instead of after all other prefixes.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
calcsize() had the wrong criterion for when C5 prefixes are permitted
(REX.R is permitted, REX.X is forbidden.) assemble() had the right
test already. This caused symbol value errors.
The implicit operand size override code didn't set the operand size
prefix, which confused the size calculation code for the range check.
The BITS 64 operand size calculation is still off, but "fixing" it by
making it 32-bit unless REX.W is set breaks PUSH and maybe others.
Change the .wx (ignore the W field) to .wig, to match the latest
version of the AVX specification. This is not a functional change,
but just makes instruction patterns a little easier to write.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Two fixes:
1. Optimization of [bx+0xFFFF] etc
0xFFFF is an sbyte under 16-bit semantics,
so make sure to check it right.
2. Don't optimize displacements in -O0
Displacements that fit into an sbyte or
can be removed should *not* be optimized in -O0.
Implicit zero displacements are still optimized, e.g.:
[eax] -> 0 bit displacement, [ebp] -> 8 bit displacement.
However explicit displacements are not optimized:
[eax+0] -> 32 bit displacement, [ebp+0] -> 32 bit displacement.
Because #2 breaks compatibility with 0.98,
I introduced a new optimization level: -OL, legacy.
Add OUT_REL1ADR (one-byte relative address) and support for
OUT_ADDRESs with size == 1. Add support for it in
outbin and outdbg. *It still needs to be added to other backends*,
both the OUT_REL*ADR and OUT_ADDRESS codepaths need to be handled.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Check if the offset and the representation are equivalent.
Disallow REL on absolute addresses.
I'm not sure what that would mean and the output formats don't support it.
Warn about ignored displacement size modifiers.
We may throw out j variable (since we break anyway)
and don't assign asize for free (since we don't
use it after).
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Hopefully this should catch all of them... but please keep an eye out
for any other uses of int32_t for the operand flags.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
is_class does not checking flags "strictly". Which means
it may fail if type is specified to REGMEM and you check for
is_class(MEMORY, ...).
Anyway in current patch we check for REGISTER which doesn't
overlap and it is safe to use is_class here.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Do an "exclusive" test for a REGISTER operand when deciding to treat
sizes as wildcards. "Exclusive" meaning don't just accept any class
that could be REGISTER, but something that is strictly a part of the
REGISTER class.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Register with no size are a bit special: we don't honor extrinsic
register sizes in the first place ("oword xmm1" gives a warning,
even), and they should match any xmmrm size. As such, explicitly
handle sizeless register operands as a hard match, instead of relying
on the fuzzy-matching mechanism to handle them.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Consolidate I_none opcode to be used everywhere
instead of mix (-1,I_none).
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Clear an uninitialized variable warning. The case can't actually
happen, but the compiler doesn't know that.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Defer the "operand size missing" error until we know all the other
operands have the correct type. Otherwise we'll end up with false
positives, which result in noise entered into the xsizeflags array,
thus causing fuzzy matching to fail.
It's possible we should defer it even further.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
This allows automatic fuzzy matching of operand sizes. If an operand
size is not specified, but there is exactly one possible size for the
instruction, select that instruction size. This requires a second
pass through the instruction patterns, and so is slightly slower, but
should be a lot easier to get right than the S- flags, and works even
when there is more than one instruction.
The new SX (Size eXact) flag can be used to prevent fuzzy matching
completely.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Move the instruction-matching loop into a common function. This gives
us a single point to adjust the instruction-selection algorithm.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
*To the best of my knowledge*, we now have authorization from everyone
who has significantly contributed to NASM in the past. As such,
change the license to the 2-clause BSD license.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add copyright headers to the *.c/*.h files in the main directory. For
files where I'm sure enough that we have all the approvals, I have
given them the 2-BSD license, the others have been given the "LGPL for
now" license header. Most of them can probably be changed after
auditing.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Fix the disassembly of JRCXZ; in 64-bit mode, we should only accept
JECXZ for disassembly with 32-bit address size override.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add a byte code to explicitly support instructions which only uses the
low 8-bit registers (as if a REX prefix always was present.) This is
usable for instructions which are officially documented as using "the
low byte of a 32-bit register" and so on.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Handle AMD's XOP prefixes; they use basically the same encoding as VEX
prefixes, so treat them simply as a variant of VEX.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Optimize displacements, don't pessimize them. When running in the
optimizer, we always keep track of when a reference is forward. That
doesn't mean it is unknown.
Only be optimistic about the reachability of a symbol with NO_SEG if
we are truly in pass 1, i.e. it could possibly be just a forward
reference. After we have done a single pass, if it is still NO_SEG,
then it is an absolute symbol and need to be treated as such.
WAIT is technically an instruction, but from an assembler standpoint
it behaves as if it had been a prefix. In particular, it has to be
ordered *before* any real hardware prefixes.
We have a number of all-zero buffers in the code. Put a single
all-zero buffer in nasmlib.c. Additionally, add fwritezero()
which can be used to write an arbitrary number of all-zero bytes;
this prevents the situation where the all-zero buffer is simply
too small.
Fix op2 references not yet converted to accessing op2; add an opy
pointer similar to the opx pointer instead of multiple references.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The bytecode format assumes max 4 operands pretty strictly, but we
already have one instruction with 5 operands, and it's likely to get
more. Support them via extension prefixes (similar to REX prefixes).
For bytecodes which use argument bytes we encode the number directly,
however.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When issuing warnings for EA displacements during address generation,
actually look a the proper operand!
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Not all backends can handle being handled an intrasegment OUT_REL*ADR,
and we don't fix them up in common code either (which would be the
logical thing to do -- right now we fix them up in a bunch of
individual places.)
For now, just fix up the one in address generation.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
For OUT_REL*ADR, the "size" argument is actually the offset inside the
instruction; that is in fact why we encode the real size in the
instruction itself. Thus, emit the offsets properly using this
mechanism when generating relative EAs.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Use the case4() macros as we already do in disasm.c. It helps reduce
visual clutter, and more clearly demonstrates that groups of four
belong together. Furthermore, it makes the text compact enough that
we can now use case statements to mask down the EA patterns correctly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reshuffle the bytecodes for segment register push/pop to make more
sense, and move them from \4 to \344, thus freeing up the single-digit
bytecodes \4..\7 for future use. It doesn't really make sense to use
single-digit bytecodes for this very oddball use.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add a new opcode for 32->64 bit sign-extended immediate, with warning
on the number not matching.
This unfortunately calls for an audit of all the \4[0123] opcodes, if
they should be replaced by \25[4567]. This only replaces one
instruction (MOV reg64,imm32); other instructions need to be
considered.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
is_sbyte64() was equivalent to is_sbyte32() plus the warning; however,
the warning is only used in one place (and conflicts with another
warning there), so remove the function.
Furthermore, add back the test for pure immediates in
possible_sbyte(); they had been broken out but never folded back in --
and are essential.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
New opcodes to deal with 8-bit immediates which are then sign-extended
to the operand size. These allow us to warn appropriately.
Not sure I'm using these in all the proper places; need audit of all
uses of the \14..\17 opcodes.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
When there is an immediate in the instruction, a RIP-relative offset
may not be relative to the end of the offset itself, since it is
relative to the end of the *instruction*, not the end of the *offset*.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Issue better warnings for out-of-range values. This is not yet
complete.
In particular, note we may have out-of-range for values that end up
being subject to optimization. That is because the optimization takes
place on the *truncated* value, not the pre-truncated value.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>