Support the +n syntax for multiple contiguous registers, and emit it
in the output from ndisasm as well.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
New instructions which do four full iterations of a data-reduction
operation (FMA, dot product.)
Bug report: https://bugzilla.nasm.us/show_bug.cgi?id=3392492
Reported-by: ff_ff <qqqqqqqqqfffffffff@gmail.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add PTWRITE instruction. It is worth noting that we should
be able to do "ptwrite [eax]" in 32-bit mode, but the instruction
selector doesn't currently handle that well in a way that doesn't make
64-bit mode very confusing.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Add instructions from the Intel Instruction Set Extensions and Future
Features Programming Reference, document 319433-034, May 2018.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Make it possible to generate variants of RET(F) with explicit operand
size specified without having to use o16/o32/o64.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Automatically assign values to the instruction flags; we ended up with
a case where pushing flags into the next dword caused comparison
failures due to other places in the code explicitly comparing
field[3].
This creates necessary defines for this not to happen; it also cleans
up a fair bit of the iflag code.
This resolves BR 3392454.
Reported-by: Thomasz Kantecki <tomasz.kantecki@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We could generate the MRI version (SSE 4.1) instead of the RMI
(SSE 2) version of these instructions if a 64-bit register was given
as the destination.
Reported-by: Vasiliy Olekhov <olekhov@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
We don't need to sort opcodes anymore, since we are using an O(1) hash
and not binary search. Instead, sort them in the order they first
appear in insns.dat; this lets us move all the pseudo-ops to a
contiguous range at the start of the file, for more efficient
handling.
Change the functions that process pseudo-ops accordingly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Compactify the instruction list in the documentation to have fewer
margin violations, and fix some of the headings (;#).
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The fvm: annotation to generate the correct EVEX compressed
displacements had inadvertently gotten dropped from a handful of
instructions in checkin c33d95fde9:
BR 3392370: {z} decorator allowed on MOVDQ* memory operands
Put them back, and verify they work.
Reported-by: Henrik <henrik@gramner.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
insns-iflags.ph is included from another Perl script, so rename it .ph
(Perl header). Add missing dependency to the main Makefile.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The 2-operand form was inherently unsafe. Use the 3-operand form
instead, which guarantees that arbitrary filenames are supported.
This also means we can remove a few instances of sysopen() which was
used for exactly this reason, however, at least in theory sysopen()
isn't portable.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The spec says very clearly the {z} decorator is allowed on memory
operands for the MOVDQ* instructions. Remove special cases from the
code to disallow this case, which had the unfortunate effect of
generating a very uninformative error message.
Reported-by: Agner <agner@agner.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The UD0 and UD1 opcodes are now officially documented, with UD1
officially documented to take a modr/m. Still permit the "UD2B" and
argument-less aliases, but not as preferred.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The constant arrays in get_disp8N() should be static; otherwise the
compiler has to manifest them on the stack for every execution which
makes no sense at all.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Improve consistency by allowing contracted forms for EVEX-encoded
instructions when it's allowed for similar VEX-encoded instructions.
Previously the behavior would change depending on the vector size or
the register number which could be somewhat confusing:
vaddps xmm0, xmm1 ; ok
vaddps ymm0, ymm1 ; ok
vaddps zmm0, zmm1 ; error
vaddps xmm0, xmm16 ; error
Also allow contracted forms for a few additional older AVX instructions
where it makes sense.
Signed-off-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Make the source code easier to understand and keep track of by
organizing it into subdirectories depending on the function.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>