The fix for BR 3392278:
aa29b1d93f assemble.c: Don't drop rex prefix from instruction itself
... would cause multiple REX prefixes to be emitted for some
instructions. Create a new flag to indicate that REX has already been
emitted, which can be cleared for each instance of an instruction.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
BR 3392275 complains about xmm0 having to be explicitly included in
the assembly syntax when it is implicit in the encoding. In the
interest of "be liberal in what you accept", accept either form in the
input.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Allow specifying {vex3} or {vex2} (the latter is currently always
redundant, unless we end up with instructions at some point can be
specified with legacy prefixes or VEX) to select a specific encoding
of VEX-encoded instructions.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Since the multi-line macro preprocessor is modified to expand
grouped parameter with braces. The escape character is not needed
any more.
The testcase converter script is also modified not to generate '\'.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Only generate a signed relocation if the displacement size is less
than the address size. This matters when involving address size
overrides.
It is technically impossible to do this one perfectly, because it is
never really knowable if the displacement offset is used as a base or
an index.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reverted the redundant branch instruction patterns for bnd prefix.
And when a relaxed jmp instruction becomes a short (Jb) form,
bnd prefix is not needed because it does not initialize bnd registers.
So in that case, bnd prefix is silently dropped.
BND JMP foo -> drops bnd prefix
BND JMP short foo -> shows an explicit error
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
GAS uses *1 multiplier for explicitly marking an index register in mib operand.
e.g.) [rdx * 1 + 3] is equivalent to [3, rdx] in NASM's split EA format
So only for mib operands, this is encoded same as gas does.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
MPX test asm files are added. These include all three different styles of
mib syntax (NASM, ICC and gas).
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Mostly intended for the "mib" expressions in BNDLDX/BNDSTX.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
test/avx512*.asm files are now tested by using perfomtest.pl
Refer to pefomtest help message for the usage.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Added Prefetch (AVX-512PF) instructions.
These instructions are supported
if CPUID.(EAX=07H, ECX=0):EBX.AVX512PF[bit 26] = 1.
CPUID feature flag for PREFETCHWT1 is TBD
but PREFETCHWT1 is included in this commit.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added Exponential and Reciprocal (AVX-512ER) instructions.
These instructions are supported
if CPUID.(EAX=07H, ECX=0):EBX.AVX512ER[bit 27] = 1.
IF_AVX512 is now shared by all AVX-512* instructions as a bit mask.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added Conflict Detection (AVX-512CD) instructions.
These instructions are supported
if CPUID.(EAX=07H, ECX=0):EBX.AVX512CD[bit 28] = 1.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added three-operand pseudo-ops for VCMPPD, VPCMPD and so on.
Test case is also updated to validate them.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added K* instructions test cases in test/avx512f.asm.
The previous test case from GNU AS were repeating the same instruction twice,
so the repeated half part is removed.
Changed the python script (gas2nasm.py) to include opmask instructions.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Fixed or purged some old comments and added a comment for a previous patch.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
From gas testsuite file, a text file containing raw bytecodes
is useful when verifying the output of NASM.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
This was converted from a gas testsuite.
(gas/testsuite/gas/i386/x86-64-avx512f-intel.d)
A python script that is used for converting is also included.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
There are two instructions (VGATHERQPS, VPGATHERQD) where the only
separation between two forms is the vector length given to the vector
SIB. This means the *matcher* has to be able to distinguish
instructions by vector SIB length and the matcher only operates on the
operands and the instruction flags, not on the bytecode.
Export the vector index-ness into the operand flags and add to the
matcher.
This resolves BR 3392260.
Reported-by: Agner <agner@anger.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
The moffset opcodes A2 and A3 do not support HLE. Unfortunately
checkin
fb3f4e6d HLE: Change NOHLE to be an instruction flag
... inadvertently lost the NOHLE flag for opcode A2.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Try to implement the handling of MOVD as attempted in checkin:
70712c0df6
and reverted in:
d279fbbd80
due to BR3392199. This time make sure to use the SX flag to only
match when a size is explicitly given, and also don't duplicate the 0F
6F/7F opcodes, which are documented as MOVQ by AMD as well as Intel.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The way our matching system works we have to make NOHLE an instruction
flag rather than an byte code; by the time we run the byte code
interpreter we have already picked an instruction pattern once and for
all.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Clean up JMP/CALL patterns so they don't disassemble quite so uglily.
Fix a CALL pattern which would have incorrectly generated a (harmless)
REX.W prefix.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
It is more logical, it cleans up the code and it makes implicit
operand size override prefixes come out in the same order as explicit
ones instead of after all other prefixes.
Suggested-by: H. Peter Anvin <hpa@zytor.com>
The implicit operand size override code didn't set the operand size
prefix, which confused the size calculation code for the range check.
The BITS 64 operand size calculation is still off, but "fixing" it by
making it 32-bit unless REX.W is set breaks PUSH and maybe others.
calcsize() had the wrong criterion for when C5 prefixes are permitted
(REX.R is permitted, REX.X is forbidden.) assemble() had the right
test already. This caused symbol value errors.
The implicit operand size override code didn't set the operand size
prefix, which confused the size calculation code for the range check.
The BITS 64 operand size calculation is still off, but "fixing" it by
making it 32-bit unless REX.W is set breaks PUSH and maybe others.