Two bugs with respect to the FMA instructions:
- the variant increment is supposed to be 0x10, not 0x01.
- the base opcode for scalar VFNMADD is 0x9d, not 0x9c.
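
A rough illustration of the corrected numbering follows. This is a sketch in Python, not the generator NASM actually uses; the helper name fma_opcode and the constants are made up for this example. The packed base 0x9c is taken from the Intel encoding tables. The three operand-order variants (132, 213, 231) sit 0x10 apart, and the scalar form uses the odd opcode of each pair:

```python
# Hypothetical helper; names are illustrative and not taken from NASM.
VARIANTS = ("132", "213", "231")
VARIANT_STEP = 0x10          # variant increment (the bug used 0x01)

# Base (132-form) opcodes for the VFNMADD family.
VFNMADD_PACKED_BASE = 0x9C
VFNMADD_SCALAR_BASE = 0x9D   # the bug used 0x9c here as well

def fma_opcode(base, variant):
    """Opcode byte for the given operand-order variant."""
    return base + VARIANT_STEP * VARIANTS.index(variant)

for v in VARIANTS:
    print(f"VFNMADD{v}SD -> {fma_opcode(VFNMADD_SCALAR_BASE, v):#04x}")
# prints 0x9d, 0xad, 0xbd for the 132, 213 and 231 forms
```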
The Perl script which auto-generated the VFM instructions had
incorrectly conflated the VEX.W and VEX.L bits, with the result that
only half the valid instructions were generated.
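
To see why conflating the two bits loses half the forms, consider the sketch below. It assumes that, for the packed FMA forms, VEX.W selects single versus double precision and VEX.L selects 128-bit versus 256-bit vectors. The function names are hypothetical, and modelling the bug as "W always equals L" is an assumption made for illustration, not a description of the original script:

```python
from itertools import product

def packed_forms(mnemonic):
    """Enumerate packed forms with W and L treated as independent bits."""
    forms = []
    for w, l in product((0, 1), repeat=2):
        suffix = "PD" if w else "PS"   # VEX.W: element size
        width = 256 if l else 128      # VEX.L: vector length
        forms.append((mnemonic + suffix, width, w, l))
    return forms

def conflated_forms(mnemonic):
    """Assumed model of the bug: one bit drives both W and L."""
    return [f for f in packed_forms(mnemonic) if f[2] == f[3]]

print(len(packed_forms("VFMADD132")))     # 4 valid packed forms
print(len(conflated_forms("VFMADD132")))  # 2, i.e. only half of them
```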
Fix the disassembly of the alternate forms of register-register
MOVAPD, MOVDQA, MOVDQU, MOVQ, MOVSD, and MOVUPD.
NASM never generates these forms itself, but the disassembler
previously decoded them incorrectly.
WAIT is technically an instruction, but from an assembler standpoint
it behaves as if it had been a prefix. In particular, it has to be
ordered *before* any real hardware prefixes.
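
The ordering matters because a hardware prefix applies to the next opcode byte; if WAIT (0x9b) came after a prefix, the prefix would attach to WAIT rather than to the instruction it was meant for. A minimal sketch of the intended byte order, with a hypothetical emit() helper that is not NASM's code:

```python
WAIT = 0x9B  # opcode byte of WAIT/FWAIT

def emit(prefixes, opcode_bytes, wait=False):
    """Assemble one instruction; WAIT always precedes the real prefixes."""
    out = bytearray()
    if wait:
        out.append(WAIT)
    out.extend(prefixes)       # segment overrides, operand size, etc.
    out.extend(opcode_bytes)
    return bytes(out)

# FSTSW [cs:bx] in 16-bit code is WAIT followed by FNSTSW [cs:bx]:
# 9B (WAIT), 2E (CS override), DD 3F (FNSTSW [bx]).
print(emit([0x2E], [0xDD, 0x3F], wait=True).hex())
```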
Update the VFMA* instructions to match the AVX spec version 5.
Since these are highly regular, use a small Perl script to generate
the instruction patterns.
The POPCNT instruction should not require sizes on memory operands.
Add the appropriate size flags for that to work.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The CRC32 instructions require F2, but can also take a 66 prefix to
set the operand size. This is not the SSE model of prefix extension.
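
For illustration, here is a sketch of the resulting encodings for register operands. It assumes 32-bit code (so 66 selects a 16-bit source) and no REX prefix; the function name is hypothetical. In the SSE model, 66/F2/F3 are mutually exclusive selectors of an instruction variant, whereas here F2 is fixed and 66 keeps its ordinary operand-size meaning:

```python
def crc32_r32_reg(dst, src, opsize):
    """Encode CRC32 r32, r{8,16,32} using register numbers 0..7."""
    out = []
    if opsize == 16:
        out.append(0x66)                 # operand-size prefix
    out.append(0xF2)                     # required prefix for CRC32
    out += [0x0F, 0x38, 0xF0 if opsize == 8 else 0xF1]
    out.append(0xC0 | (dst << 3) | src)  # ModRM, register-direct
    return bytes(out)

EAX, EBX = 0, 3
print(crc32_r32_reg(EAX, EBX, 32).hex())  # crc32 eax,ebx
print(crc32_r32_reg(EAX, EBX, 16).hex())  # crc32 eax,bx
print(crc32_r32_reg(EAX, EBX, 8).hex())   # crc32 eax,bl
```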
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reshuffle the bytecodes for segment register push/pop to make more
sense, and move them from \4 to \344, thus freeing up the single-digit
bytecodes \4..\7 for future use. It doesn't really make sense to use
single-digit bytecodes for this very oddball use.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Change those \40-class opcodes that need it to \254. IMUL will
need a separate audit; I'm not convinced we really know what all
the IMUL conditions should be.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add a new opcode for a 32-bit immediate sign-extended to 64 bits, with
a warning if the value does not survive the sign extension.
This unfortunately calls for an audit of all the \4[0123] opcodes, to
see whether they should be replaced by \25[4567]. This only replaces one
instruction (MOV reg64,imm32); other instructions need to be
considered.
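
The condition being warned about can be sketched as follows. This is an illustration of the arithmetic only, with a hypothetical function name; it is not the check as implemented in the assembler:

```python
def fits_simm32(value):
    """True if a 64-bit value equals its low 32 bits sign-extended."""
    value &= 0xFFFFFFFFFFFFFFFF
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        extended = low | 0xFFFFFFFF00000000
    else:
        extended = low
    return extended == value

print(fits_simm32(0x7FFFFFFF))          # True
print(fits_simm32(-1))                  # True
print(fits_simm32(0x80000000))          # False: would become 0xFFFFFFFF80000000
print(fits_simm32(0xFFFFFFFF80000000))  # True
```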
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
New opcodes to deal with 8-bit immediates which are then sign-extended
to the operand size. These allow us to warn appropriately.
Not sure I'm using these in all the proper places; need audit of all
uses of the \14..\17 opcodes.
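
As a sketch of what "sign-extended to the operand size" means for the warning (hypothetical helper, not the assembler's code): a value is representable if sign-extending its low byte to the operand size gives the value back.

```python
def fits_simm8(value, opsize_bits):
    """True if value, taken modulo the operand size, equals its low
    byte sign-extended to opsize_bits."""
    mask = (1 << opsize_bits) - 1
    value &= mask
    low = value & 0xFF
    if low & 0x80:
        extended = low | (mask & ~0xFF)
    else:
        extended = low
    return extended == value

print(fits_simm8(0x7F, 16))    # True
print(fits_simm8(0x80, 16))    # False: the byte 0x80 extends to 0xFF80
print(fits_simm8(0xFFFF, 16))  # True: -1 as a 16-bit value
print(fits_simm8(0xFFFF, 32))  # False: the byte 0xFF extends to 0xFFFFFFFF
```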
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Issue better warnings for out-of-range values. This is not yet
complete.
In particular, note we may have out-of-range for values that end up
being subject to optimization. That is because the optimization takes
place on the *truncated* value, not the pre-truncated value.
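
A small sketch of the hazard, with hypothetical names. The range convention used here (accept anything from the most negative signed value up to the largest unsigned value) is an assumption for the example:

```python
def classify_imm(value, opsize_bits=32):
    """Return (use_imm8, out_of_range) for an immediate operand."""
    mask = (1 << opsize_bits) - 1
    # Range check on the value as written in the source.
    out_of_range = not (-(1 << (opsize_bits - 1)) <= value <= mask)
    # The optimization decision looks only at the truncated value.
    truncated = value & mask
    if truncated >> (opsize_bits - 1):
        signed = truncated - (1 << opsize_bits)
    else:
        signed = truncated
    use_imm8 = -128 <= signed <= 127
    return use_imm8, out_of_range

print(classify_imm(1))            # (True, False)
print(classify_imm(0x100000001))  # (True, True): truncates to 1 and is
                                  # optimized, yet was out of range
```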
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The official mnemonic for 32-to-64-bit sign extension is MOVSXD for
some idiotic reason. Add support for it while continuing to recognize
MOVSX for this as an alias.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Handle SLDT with a 64-bit register operand. Don't generate a REX.W
prefix in the assembler, since zero-extending is just fine, but do
support it in the disassembler.
BR 1974170: VCVTPD2PS, VCVTPD2DQ, VCVTTPD2DQ with a memory operand are
ambiguous without a specific operand size, so force one to be added.
Split the instruction pattern due to our current clunky handling of
MMX/XMM/YMM registers together with sizes. Fix in the future, please!