This reverts commit 70712c0df6.
Conflicts:
insns.dat
Our instruction matcher's fuzzy logic fails to handle it at the moment.
Reported-by: KO Myung-Hun <komh@chollian.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
AMD has MOVD for both 32-bit and 64-bit GPRs, so for the sake of
compatibility bring them into insns.dat.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
As spotted by nasm64developer, the memory
operand size is incorrect. Fix it.
Reported-by: nasm64developer
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
The second VPERMD should actually be VPERMPD.
Thanks to nasm64developer for providing the gas test file
that revealed this issue.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
I converted almost all instructions in insns.dat (version
7a6f978698) to the more
readable format that insns.pl has supported for years.
I also made some changes to insns.pl. You can verify that the
new insns.dat and insns.pl produce byte-identical output to
the old insns.dat and insns.pl, so I think that this change
is safe to check in, even though it is a large change to
insns.dat.
The changes to insns.pl are:
* fixed a bug: ib,u was not recognized
* added support for a second immediate argument called "j" for
instructions like ENTER imm,imm
* added a "+r" syntax for \10..\13
[gorcunov: insns files remain the same, great job anonymous!]
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Clean up the formatting of the BMI instruction patterns, and fix:
a) X64,FUTURE is wrong - it needs to be LONG,FUTURE
b) Fix the BLSI, BLSMSK, BLSR instruction patterns
c) Use a bracket pattern for TZCNT
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
As HPA explained
|
| w.r.t. the -QQ- instruction forms... when we did
| the initial AVX implementation we decided that
| using -DQ- (double quadword) for 256-bit instructions
| was a bit messy, so we decided to accept both -DQ-
| (being official) and -QQ-
|
So move VLDQQU back and place it before VLDDQU so that the disassembler
matches it first.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
This form of VPEXTRW is the so-called 'B' form, so the
operand encoding should be fixed.
Reported-by: Jasper Neumann
Patch-by: Jasper Neumann
CC: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
In fact it was written as
MOVAPS xmmreg,xmmreg \360\2\x0F\x28\110 KATMAI,SSE
MOVAPS xmmreg,xmmreg \360\2\x0F\x29\101 KATMAI,SSE
in the first place
MOVUPS xmmreg,xmmreg \360\2\x0F\x10\110 KATMAI,SSE
MOVUPS xmmreg,xmmreg \360\2\x0F\x11\101 KATMAI,SSE
and for example x28 stands for xmmrm128,xmmreg and
x1 for xmmrm128,xmmreg.
TODO: Inspect and fix WILLAMETTE instructions.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Handle immediate-size optimization for "mov r64,imm" -- reduce it to
"mov r32,imm32" or "mov r64,imm32" as appropriate.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
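The immediate-size reduction for "mov r64,imm" described above can be sketched roughly as follows. This is an illustrative Python model, not NASM's code; the function name `choose_mov_form` and the returned strings are hypothetical.

```python
def choose_mov_form(imm: int) -> str:
    """Pick the shortest MOV form that loads imm into a 64-bit register.

    Hypothetical helper for illustration only; it models the choice,
    not the actual encoder.
    """
    imm &= 0xFFFFFFFFFFFFFFFF  # treat the value as a 64-bit bit pattern
    if imm <= 0xFFFFFFFF:
        # Writing a 32-bit register zero-extends into the 64-bit register.
        return "mov r32,imm32"
    if imm >= 0xFFFFFFFF80000000:
        # The upper 33 bits are all ones: a sign-extended imm32 is enough.
        return "mov r64,imm32"
    return "mov r64,imm64"
```

For example, `choose_mov_form(-1)` selects the sign-extended form, while `choose_mov_form(0x123456789)` still needs the full 64-bit immediate.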
Allow implicit operands for VBLENDVP, just as for other instructions,
since the semi-legacy forms now are removed.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Version 7 of the AVX spec specifically forbids (#UD) using the
66 0F 38 14/15 forms of the BLENDV instructions with a VEX prefix;
those encodings are strictly legacy SSE 4.1.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Updates from the AVX version 7 specification: mostly tightening of the
rules for VEX.L and VEX.W, but remove the VPERMIL2 instructions.
Also encode all the full-length forms of the VCMP instructions and
prefer those for the disassembly.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Add FXSAVE64 and FXRSTOR64; drop the np prefix on 0F AE instructions:
none of the rest of the 0F AE instructions have them, and there are no
conflicts.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
FUTURE is a CPU level flag, and cannot be combined with X64 (which is
shorthand for X86_64,LONG). Also, make sure we add LONG annotations
to everything that is 64-bit mode only.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add the RD*SBASE, WR*SBASE and RDRAND instructions from version 7 of
the AVX specification, Intel document 319433-007.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
PUSH imm64 confuses those who try to find this instruction in
processor programming manuals.
It was actually introduced for the sake of "push `size' imm" consistency,
in other words, to allow users to write "PUSH qword imm32" in 64-bit code,
though at the byte level (i.e. in the generated code) it is still a correct
and valid sign-extended "PUSH imm32" instruction.
To get rid of this ambiguity we make an explicit "PUSH imm32"
valid in 64-bit code. This also makes "PUSH dword imm32"
valid in 64-bit code.
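To illustrate why there is no real imm64 form, here is a small Python sketch of the 68 id encoding and the value the CPU pushes in 64-bit mode. The helper names are hypothetical and this is not NASM's encoder.

```python
import struct


def encode_push_imm32(value: int) -> bytes:
    """Encode PUSH imm32 (opcode 68 followed by a 32-bit immediate)."""
    if not -0x80000000 <= value <= 0x7FFFFFFF:
        raise ValueError("immediate does not fit in a sign-extended imm32")
    return b"\x68" + struct.pack("<i", value)


def pushed_qword(encoding: bytes) -> int:
    """Return the 64-bit pattern pushed in 64-bit mode for a 68 id encoding."""
    (imm32,) = struct.unpack("<i", encoding[1:5])
    return imm32 & 0xFFFFFFFFFFFFFFFF  # sign extension to 64 bits
```

Whether the source says "PUSH imm32", "PUSH dword imm32" or "PUSH qword imm32", the bytes in this model are the same; only the sign-extended result is 64 bits wide.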
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
The former changes have been committed to binutils.
From initial message:
|
| 2010-03-22 Quentin Neill <quentin.neill@amd.com>
| Sebastian Pop <sebastian.pop@amd.com>
|
| opcodes/
| * i386-dis.c (OP_LWP_I): Removed.
| (reg_table): Do not use OP_LWP_I, use Iq.
| (OP_LWPCB_E): Remove use of names16.
| (OP_LWP_E): Same.
| * i386-opc.tbl: Removed 16bit LWP insns. 32bit LWP insns
| should not set the Vex.length bit.
| * i386-tbl.h: Regenerated.
|
| gas/
| * testsuite/gas/i386/x86-64-lwp.s: Remove use of 16bit LWP insns.
| * testsuite/gas/i386/lwp.s: Same.
| * testsuite/gas/i386/x86-64-lwp.d: Updated.
| * testsuite/gas/i386/lwp.d: Updated.
|
So there are no 16-bit instructions anymore.
Also, the xop.l field should be set to 0.
Based on patch from nasm64developer
Reported-by: nasm64developer
Signed-off-by: nasm64developer
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
The first argument to MONITOR is an address, so it should be 64 bits
(RAX) in 64-bit mode.
The preferred form is still just plain "monitor".
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
nasm64developer reported that we have no LWP support yet.
Add this feature.
Reported-by: nasm64developer <nasm64developer@users.sf.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
nasm64developer reported a few nits in the XOP
instruction templates: a plain typo in the specification
(http://support.amd.com/us/Processor_TechDocs/43479.pdf)
and opcode errors.
Reported-by: nasm64developer <nasm64developer@users.sf.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
nasm64developer reported that VFNMADDSD and VFNMADDSS
have the "m" and "s" operands swapped in the instruction
templates file.
Reported-by: nasm64developer <nasm64developer@users.sf.net>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
While converting the sizes of memory operands into
explicit form, compatibility with 2.07 was broken
(for a small set of instructions). Let's restore
it. Details below.
This is due to the specifics of our "fuzzy logic" algorithm.
For example, suppose the user wrote an instruction like
VCVTTPD2DQ xmm0,[eax]
where the last operand is a memory reference. But the templates contain
the following two items (written in simplified form)
VCVTTPD2DQ xmmreg,mem128
VCVTTPD2DQ xmmreg,mem256
So it is impossible to find out what _exactly_ the user meant:
a reference to a 128-bit value in memory or to a 256-bit one.
As a solution we've been using the IF_Sx modifier written in
the template, which selects the "by-default" template
and breaks the tie.
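The tie-break can be modelled with a short Python sketch. Everything here is hypothetical: `Template`, `match` and the `default_size` field stand in for the real matcher and the IF_Sx flags, and marking the 128-bit template as the default is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Template:
    mnemonic: str
    mem_bits: int
    default_size: bool = False  # stands in for an IF_Sx-style flag


# Assumption: the 128-bit form is the one flagged as the default.
TEMPLATES = [
    Template("VCVTTPD2DQ", 128, default_size=True),
    Template("VCVTTPD2DQ", 256),
]


def match(mnemonic: str, mem_bits: Optional[int] = None) -> Template:
    """Return the single template matching an unsized or sized memory operand."""
    candidates = [t for t in TEMPLATES if t.mnemonic == mnemonic]
    if mem_bits is not None:
        # An explicit size (e.g. "oword" or "yword") settles it directly.
        candidates = [t for t in candidates if t.mem_bits == mem_bits]
    elif len(candidates) > 1:
        # No size given: only a default-size flag can break the tie.
        candidates = [t for t in candidates if t.default_size]
    if len(candidates) != 1:
        raise ValueError("operation size not specified or no match")
    return candidates[0]
```

With the flag present, `match("VCVTTPD2DQ")` resolves to the 128-bit template; without any flagged template the same call would be rejected as ambiguous.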
Reported-by: Victor van den Elzen <victor.vde@gmail.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Even the non-DREX SSE5 instructions appear to have been either
obsoleted or replaced with XOP varieties. The only exceptions are the
ROUNDxx instructions, which are really SSE4.1 instructions and which
were simply duplicates.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
One more incorrect use of sbyte in IMUL.
Overall, the IMUL patterns seem really messy. *Furthermore*, despite
IMUL normally being thought of as signed, the 2- and 3-operand
versions don't produce a high half and are therefore
signedness-agnostic -- we could even add MUL patterns for those forms.
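The signedness-agnostic claim is plain modular arithmetic: the low half of a product is the same whether the operands are read as signed or unsigned. A small Python check, with hypothetical helper names:

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x: int) -> int:
    """Interpret the low 32 bits of x as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def low_half_signed(a: int, b: int) -> int:
    """Low 32 bits of the product, operands read as signed."""
    return (to_signed32(a) * to_signed32(b)) & MASK32


def low_half_unsigned(a: int, b: int) -> int:
    """Low 32 bits of the product, operands read as unsigned."""
    return ((a & MASK32) * (b & MASK32)) & MASK32
```

Both functions agree for every pair of inputs; only the discarded high half would differ between a signed and an unsigned multiply.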
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Fix a very curious transposition in the instruction patterns for IMUL,
which caused 32-bit IMUL instructions with constants like 0x10001 to
be generated incorrectly.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Convert Intel AVX instructions to explicit size
format. Part 2.
CLMUL is converted as well.
Btw, VPINSR was a bit broken, since the SB constraint
does not apply to all forms; they require 16, 32 and 64
memory sizes too. Fixed.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Convert Intel AVX instructions to explicit size
format. Part 1.
The SAR instruction is touched as well.
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
Fix the disassembly of JRCXZ; in 64-bit mode, we should only accept
JECXZ for disassembly with 32-bit address size override.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Change the relaxed forms to the compact representation. This
*deliberately* does not fix bugs where the relaxed form does not match
the official form; this is strictly a "no change in output" checkin.
All remaining open-coded relaxed forms are very likely bugs, and need
to be individually audited. Furthermore, it is questionable if the
Intel FMA instructions, being destructive, should have relaxed forms
at all.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
1) A number of PMA -> VPM misprints fixed.
2) Spec points to ymmreg in mnemonics even for L=0 instructions. Fixed.
The instructions are still sorted in the order the specification uses.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Introduce base XOP/FMA4/CVT16 instructions (SSE5)
based on the official specification from AMD (rev 3.03).
Some fixes from Peter Johnson and H. Peter Anvin are
included (not updated in the AMD spec yet).
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Two bugs with respect to the FMA instructions:
- the variant increment is supposed to be 0x10, not 0x01.
- the base opcode for scalar VFNMADD is 0x9d, not 0x9c
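A rough Python sketch of the opcode arithmetic after these two fixes. It assumes that "variant" means the 132/213/231 operand orders and uses base opcodes from the current Intel SDM, which may differ from the spec revision this commit targeted; `fma_opcode` is a hypothetical helper, not the generator script.

```python
# (packed, scalar) opcode bytes for the "132" operand order.
# Assumption: values as listed in the current Intel SDM.
BASE = {
    "VFMADD": (0x98, 0x99),
    "VFMSUB": (0x9A, 0x9B),
    "VFNMADD": (0x9C, 0x9D),
    "VFNMSUB": (0x9E, 0x9F),
}
ORDERS = ("132", "213", "231")


def fma_opcode(op: str, order: str, scalar: bool) -> int:
    """Return the opcode byte for one FMA variant."""
    packed, scalar_base = BASE[op]
    base = scalar_base if scalar else packed
    # Each successive operand order adds 0x10, not 0x01.
    return base + 0x10 * ORDERS.index(order)
```

In this model the scalar VFNMADD forms come out as 0x9D, 0xAD and 0xBD, matching the corrected base opcode.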
The Perl script which auto-generated the VFM instructions had
incorrectly conflated the VEX.W and VEX.L bits, with the result that
only half the valid instructions were generated.
Fix the disassembly of the alternate forms of register-register
MOVAPD, MOVDQA, MOVDQU, MOVQ, MOVSD, and MOVUPD.
NASM never generates these, but they would be disassembled
incorrectly.
WAIT is technically an instruction, but from an assembler standpoint
it behaves as if it had been a prefix. In particular, it has to be
ordered *before* any real hardware prefixes.
Update the VFMA* instructions to match the AVX spec version 5.
Since these are highly regular, use a small Perl script to generate
the instruction patterns.
The POPCNT instruction should not require sizes on memory operands.
Add the appropriate size flags for that to work.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The CRC32 instructions require F2, but can also take a 66 prefix to
set the operand size. This is not the SSE model of prefix extension.
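As an illustration, the register-register encodings can be sketched in Python as below. The helper `encode_crc32_reg` is hypothetical; it assumes a 32-bit default operand size, registers 0 to 7, and no REX prefix.

```python
def encode_crc32_reg(dst: int, src: int, src_bits: int) -> bytes:
    """Encode CRC32 r32, r8/r16/r32 with register operands (no REX)."""
    if src_bits not in (8, 16, 32):
        raise ValueError("unsupported source size")
    if not (0 <= dst <= 7 and 0 <= src <= 7):
        raise ValueError("register number out of range")
    # 66 only selects the 16-bit operand size; F2 is always required.
    prefix = b"\x66" if src_bits == 16 else b""
    opcode = 0xF0 if src_bits == 8 else 0xF1
    modrm = 0xC0 | (dst << 3) | src  # mod=11: register-direct
    return prefix + bytes([0xF2, 0x0F, 0x38, opcode, modrm])
```

Here `crc32 eax,cx` carries both prefixes (66 F2 0F 38 F1 C1), whereas in the SSE model 66 and F2 would select different instructions rather than combine.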
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Reshuffle the bytecodes for segment register push/pop to make more
sense, and move them from \4 to \344, thus freeing up the single-digit
bytecodes \4..\7 for future use. It doesn't really make sense to use
single-digit bytecodes for this very oddball use.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Change \40 class opcodes which need to be changed to \254. IMUL will
need a separate audit; I'm not convinced we are really sure what all
the IMUL conditions should be.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Add a new opcode for 32->64 bit sign-extended immediate, with warning
on the number not matching.
This unfortunately calls for an audit of all the \4[0123] opcodes, if
they should be replaced by \25[4567]. This only replaces one
instruction (MOV reg64,imm32); other instructions need to be
considered.
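The check behind such a warning can be sketched in Python as follows. This is a hypothetical model, not the bytecode implementation; `encode_simm32` and the warning text are made up for the example.

```python
def fits_sign_extended_imm32(value: int) -> bool:
    """True if sign-extending the low 32 bits reproduces the 64-bit value."""
    value &= 0xFFFFFFFFFFFFFFFF
    low = value & 0xFFFFFFFF
    extended = low | (0xFFFFFFFF00000000 if low & 0x80000000 else 0)
    return extended == value


def encode_simm32(value: int):
    """Return (imm32 bytes, warnings) for a 32->64 sign-extended immediate."""
    warnings = []
    if not fits_sign_extended_imm32(value):
        warnings.append("signed dword immediate exceeds bounds")
    return (value & 0xFFFFFFFF).to_bytes(4, "little"), warnings
```

Note that 0xFFFFFFFF triggers the warning in this model, because the CPU would sign-extend it to 0xFFFFFFFFFFFFFFFF, while -1 encodes to the same bytes without complaint.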
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
New opcodes to deal with 8-bit immediates which are then sign-extended
to the operand size. These allow us to warn appropriately.
Not sure I'm using these in all the proper places; need audit of all
uses of the \14..\17 opcodes.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Issue better warnings for out-of-range values. This is not yet
complete.
In particular, note we may have out-of-range for values that end up
being subject to optimization. That is because the optimization takes
place on the *truncated* value, not the pre-truncated value.
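A hypothetical Python model of that interaction, for a 32-bit operand with an optional sign-extended imm8 form. The function name and return shape are invented for the example and do not mirror the assembler's internals.

```python
def pick_imm_form_32(value: int):
    """Return (chosen form, in_range) for an immediate of a 32-bit operand."""
    # Range check on the value as written by the user.
    in_range = -(1 << 31) <= value < (1 << 32)
    # The size optimization looks only at the truncated value.
    truncated = value & 0xFFFFFFFF
    signed = truncated - (1 << 32) if truncated & 0x80000000 else truncated
    form = "imm8" if -128 <= signed <= 127 else "imm32"
    return form, in_range
```

For `0x100000001` the truncated value is 1, so the short form is chosen even though the original value was out of range; a warning has to be based on the pre-truncated value to catch this case.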
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
The official mnemonic for 32-to-64-bit sign extension is MOVSXD for
some idiotic reason. Add support for it while continuing to recognize
MOVSX for this as an alias.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Handle SLDT with a 64-bit register operand. Don't generate a REX.W
prefix in the assembler, since zero-extending is just fine, but do
support it in the disassembler.
BR 1974170: VCVTPD2PS, VCVTPD2DQ, VCVTTPD2DQ with a memory operand are
ambiguous without a specific operand size, so force one to be added.
Split the instruction pattern due to our current clunky handling of
MMX/XMM/YMM registers together with sizes. Fix in the future, please!
For the versions of VCMPxx which already embed their condition code,
we do not want an extra immediate argument.
Todo: fix bytecode compiler to complain more about these.