emit_rex is supposed to write REX prefix into output stream
if needed, but we happen to drop it off on a first write
which breaks REX required instructions if TIMES directive
is used.
For example the code like
| times 4 movq xmm11, xmm11
compiles into
| 0000000000000000 <.text>:
| 0: f3 45 0f 7e db movq %xmm11,%xmm11
| 5: f3 0f 7e db movq %xmm3,%xmm3
| 9: f3 0f 7e db movq %xmm3,%xmm3
| d: f3 0f 7e db movq %xmm3,%xmm3
instead of proper
| 0000000000000000 <.text>:
| 0: f3 45 0f 7e db movq %xmm11,%xmm11
| 5: f3 45 0f 7e db movq %xmm11,%xmm11
| a: f3 45 0f 7e db movq %xmm11,%xmm11
| f: f3 45 0f 7e db movq %xmm11,%xmm11
http://bugzilla.nasm.us/show_bug.cgi?id=3392278
Reported-by: Javier <elpochodelagente@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
[nosplit eax] has been encoded as [eax*1+0] since 0.98.34.
But this seems like unexpected behavior.
So only when a register is multiplied, that will be treated
as an index. ([nosplit eax*1] -> [eax*1+0])
Document is updated accordingly.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
[nosplit eax+eax] was encoded [eax*2] previously but
this seems against the user's intention.
So in this case, nosplit is ignored now and [eax+eax] will be
generated.
Document is also updated accordingly.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
In mib operands, users' intention should be preserved.
e.g.) [eax + eax*1] and [eax*2] must be distinguished and encoded differently.
So a new EA flag EAF_MIB for mib operands is added.
And a new EA hint EAH_SUMMED for the case of [eax+eax*4] being parsed
as [eax*5] is also added.
NOSPLIT specifier does not have an effect in mib, so [nosplit eax + eax*1]
will be encoded as [eax, eax] rather than [eax*2] as in a regular EA.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
bnd and nobnd prifixes can be used for each instruction line to
direct whether bnd registers should be preserved or not.
And those are also added as options for DEFAULT directive.
Once bnd is set with default, DEFAULT BND, all bnd-prefix
available instructions are prefixed with bnd. To override it,
nobnd prefix can be used.
In the other way, DEFAULT NOBND can disable DEFAULT BND and
have nasm encode in the normal way.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
When bnd prefix is dropped as jmp is encoded as jmp short,
nasm shows a warning message, which can be suppressed with a new
command line option, -w-bnd.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Allow specifying {vex3} or {vex2} (the latter is currently always
redundant, unless we end up with instructions at some point can be
specified with legacy prefixes or VEX) to select a specific encoding
of VEX-encoded instructions.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
For checking the availability of {evex} prefix, AVX512 iflag
has been used. But this is a flag for an instruction set
not for an encoding scheme. And there are some AVX512 instructions
encoded with VEX prefix.
So a new instruction flag (IF_EVEX) is added for the instructions
which are actually encoded with EVEX prefix.
This flag is automatically added by insns.pl, so no need to add manually
in insns.dat.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Since only EVEX supports all 32 vector registers encoding for now,
VEX/REX encoded instructions should not take high-16 registers as operands.
This filtering had been done using instruction flag so far, but
using the opflags makes more sense.
[XYZ]MMREG operands used for non-EVEX instructions are automatically
converted to [XYZ]MM_L16 in insns.pl
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Reverted the redundant branch instruction patterns for bnd prefix.
And when a relaxed jmp instruction becomes a short (Jb) form,
bnd prefix is not needed because it does not initialize bnd registers.
So in that case, bnd prefix is silently dropped.
BND JMP foo -> drops bnd prefix
BND JMP short foo -> shows an explicit error
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Consolidated two separate but similar functions in nasm and ndisasm
into a commonly linked source code.
To encode and decode the compressed displacement (disp8*N) for EVEX,
N value should be derived using various conditions.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
The broadcasting decorator {1to##} must describe exactly how many times
the memory element is repeated in order to clearly match the correct
instruction format.
For example,
vaddpd zmm30,zmm29,QWORD [rdx+0x3f8]{1to8} ; good
vaddpd zmm30,zmm29,QWORD [rdx+0x3f8]{1to16} ; fail qword * 16 = 1024b
vaddps zmm30,zmm29,DWORD [rcx]{1to16} ; good
vaddps zmm30,zmm29,DWORD [rcx]{1to8} ; fail dword * 8 = 256b
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Broadcasting operand size is different from the original
operand size because 32b or 64b element is repeated to form a vector.
So when matching a broadcasting operand, opsize should be treated
differently.
The broadcasting element size is specified in the decorator information.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
As BND prefix validity check conflicts with jcc8 prefix,
IF_BND is added for the instruction templates which can have
bnd prefix for preserving the content of bound register.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
GAS uses *1 multiplier for explicitly marking an index register in mib operand.
e.g.) [rdx * 1 + 3] is equivalent to [3, rdx] in NASM's split EA format
So only for mib operands, this is encoded same as gas does.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
BND prefix is used for adding bounds checking protection
across flow control changes such as call, ret, jmp and jcc calls.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Added MPX instructions and corresponding parser and encoder.
ICC style mib - base + disp and index are separate - is supported.
E.g. bndstx [ebx+3], bnd2, edx -> ebx+3 : base+disp, edx : index
As a supplement to NASM style mib - split EA - parser,
omitted base+disp is now treated as 0 displacement.
E.g. bndstx [,edx], bnd2 -> bndstx [0,edx], bnd2
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
REX.RXB bits were set for high-8 registers previously.
Since high-16 zmm registers are newly added, those bits should
be set as one bit of binary number of register value.
Similarly EVEX.R'/V'/X should be set in the same manner.
Authored-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Initialized disp8 to avoid a case that disp8 encoded
instead of the actual offset value.
Added a checking routine for basereg value before using it
as an index of array.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Added Exponential and Reciprocal (AVX-512ER) instructions.
These instructions are supported
if CPUID.(EAX=07H, ECX=0):EBX.AVX512ER[bit 27] = 1.
IF_AVX512 is now shared by all AVX-512* instructions as a bit mask.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
If SAE is set, VL(vector length) is implied to be 512.
EVEX.L'L (=EVEX.RC) is set to 00b by default.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Register value needs to be checked. Previous patch compared with reg_enum.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
It was not so straight forward to find the postion of operand that has
a broadcasting, embedded rounding mode or SAE (Suppress All Exceptions)
decorator out from operands types or bytecode.
Remebering the postion of the operand of interest in the parser reduces
the burden that assembler looks through the operands.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
High-16 registers of XMM and YMM need to be encoded with EVEX not VEX.
Even if all the operand types match with VEX instruction format,
it should use EVEX instead.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Increased the size of data type for instruction flags from 32bits to 64bits.
And a new type (iflags_t) is defined for better maintainability.
Bigger data type is needed because more instruction set types are coming
but there were not enough space for them. Since they are not bit masks,
only one instruction set is allowed for each instruction.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Fixed a bug that derived an incorrect N value for tuple types of
T2, T4, T8.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Since embedded rounding mode is following the last SIMD op,
GPR op should be skipped when finding the last SIMD op.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
When an instruction allows broadcasting, the memory element size is
different from the size of normal memory operation.
This information is provided in a decoflags field, so it should try to
match those properties before it fails.
Signed-off-by: Jin Kyu Song <jin.kyu.song@intel.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
According to XED and experimentation, the 66 is ignored.
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
It was there to support the SSE5 DREX encoding,
which as far as I know is dead forever.
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
This patch is getting rid of the following bytecodes
'pushseg','popseg','pushseg2','popseg2' and simplifies
overall code.
[gorcunov@: a few style fixes]
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Thus if someone need to rework this code he won't need
to jump between files trying to figure out where enum
and opcodes lay.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
It doesn't seem worth >200 lines of C and Perl to save ~50 lines in insns.dat.
In order to make this work I had to rename sbyte16/sbyte32 so that
they can take an ordinary size suffix (their size suffix was formerly
treated specially).
This fixes one disassembly bug: 48C7C000000080 disassembles to mov
rax,0x80000000, which reassembles to B800000080, which loads a
different value.
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
This adds "np" to a bunch of SSE-style instructions that should have
it, "norep" (which was implemented but unused) on quasi-SSE instructions
that use F2 and F3 as instruction extensions but 66 for operand size,
"nof3" (newly implemented) on a few instructions, "norexw" on some
instructions that have only 32-bit and 64-bit versions, and one NOLONG.
It also removes some incorrect "np"s, changes some "f3"s to "f3i"s,
and fixes the decoding of the XCHG/NOP/PAUSE mess: F390 is always
PAUSE even when rex.b=1 (at least according to XED).
Signed-off-by: Ben Rudiak-Gould <benrudiak@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Soon we will need to encode 512 bits values
thus there is no space left in our opflags_t
which is 32 bitfield.
Extend it to 64 bits width.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Since we are back to three bytecodes, move them back to the \271-\273
slot to free up the \264 complete quad.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The way our matching system works we have to make NOHLE an instruction
flag rather than an byte code; by the time we run the byte code
interpreter we have already picked an instruction pattern once and for
all.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>