We could generate the MRI version (SSE 4.1) instead of the RMI
(SSE 2) version of these instructions if a 64-bit register was given
as the destination.
Reported-by: Vasiliy Olekhov <olekhov@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Compactify the instruction list in the documentation to have fewer
margin violations, and fix some of the headings (;#).
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The fvm: annotation to generate the correct EVEX compressed
displacements had inadvertently gotten dropped from a handful of
instructions in checkin c33d95fde9:
BR 3392370: {z} decorator allowed on MOVDQ* memory operands
Put them back, and verify they work.
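For reference, a minimal sketch of how an EVEX compressed displacement
(disp8*N) can be chosen, assuming the scaling factor N is already known
from the tuple type (for a full-vector memory operand, the vector length
in bytes: 16, 32 or 64). The helper name try_disp8n() is hypothetical
and this is not the assembler's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not NASM's actual code: try to express `disp`
 * as an EVEX compressed displacement (disp8*N). `n` is the scaling
 * factor selected by the instruction's tuple type.
 */
static bool try_disp8n(int32_t disp, unsigned n, int8_t *disp8)
{
    assert(n != 0 && (n & (n - 1)) == 0);   /* N is a power of two */

    if (disp % (int32_t)n != 0)
        return false;                       /* not a multiple of N */

    int32_t scaled = disp / (int32_t)n;
    if (scaled < INT8_MIN || scaled > INT8_MAX)
        return false;                       /* quotient needs more than 8 bits */

    *disp8 = (int8_t)scaled;                /* encoded byte; CPU multiplies by N */
    return true;
}
```

If the wrong N is assumed (for example because the tuple-type
annotation is missing), the encoded byte no longer decodes to the
intended offset, so a full 32-bit displacement would be the safe
fallback.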
Reported-by: Henrik <henrik@gramner.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
insns-iflags.ph is included from another Perl script, so give it the
.ph (Perl header) extension. Add the missing dependency to the main
Makefile.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The 2-operand form of Perl's open() is inherently unsafe. Use the
3-operand form instead, which guarantees that arbitrary filenames are
supported.
This also means we can remove a few instances of sysopen(), which were
used for exactly this reason; at least in theory, sysopen() isn't
portable.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The spec says very clearly that the {z} decorator is allowed on memory
operands for the MOVDQ* instructions. Remove the special cases in the
code that disallowed it; they also had the unfortunate effect of
generating a very uninformative error message.
Reported-by: Agner <agner@agner.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The UD0 and UD1 opcodes are now officially documented, with UD1
officially documented to take a ModR/M byte. Still permit the "UD2B"
and argument-less aliases, but no longer as the preferred forms.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The constant arrays in get_disp8N() should be static; otherwise the
compiler has to materialize them on the stack on every call, which
makes no sense at all.
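As an illustration of the pattern, a minimal sketch with a made-up
table; the function name vector_bytes() and the table contents are
hypothetical and are not taken from get_disp8N() itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical lookup used only to show the `static const` pattern;
 * the name and the values are illustrative.
 */
static uint8_t vector_bytes(size_t vl_index)
{
    /*
     * `static const` gives the table static storage duration: it is
     * initialised once and can live in read-only data. A plain
     * `const` local array may instead be rebuilt on the stack each
     * time the function is entered.
     */
    static const uint8_t bytes_by_vl[3] = { 16, 32, 64 };

    assert(vl_index < sizeof bytes_by_vl / sizeof bytes_by_vl[0]);
    return bytes_by_vl[vl_index];
}
```

An optimizing compiler may hoist a non-static table on its own, but
declaring it static makes the intent explicit and does not depend on
the optimization level.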
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Improve consistency by allowing contracted forms for EVEX-encoded
instructions when they are allowed for similar VEX-encoded instructions.
Previously the behavior would change depending on the vector size or
the register number, which could be somewhat confusing:
    vaddps xmm0, xmm1   ; ok
    vaddps ymm0, ymm1   ; ok
    vaddps zmm0, zmm1   ; error
    vaddps xmm0, xmm16  ; error
Also allow contracted forms for a few additional older AVX instructions
where it makes sense.
Signed-off-by: Henrik Gramner <henrik@gramner.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Make the source code easier to understand and keep track of by
organizing it into subdirectories depending on the function.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>