Commit Graph

437 Commits

Author SHA1 Message Date
Jan Beulich
390ddd6f68 x86: shorten certain template names
Now that we can purge templates, let's use this to improve readability a
little by shortening a few of their names, making functionally similar
ones also have identical names in their multiple incarnations.
2022-08-16 09:15:15 +02:00
Jan Beulich
e07ae9a3ef x86: template-ize certain vector conversion insns
Many of the vector conversion insns come with X/Y/Z suffixed forms, for
disambiguation purposes in AT&T syntax. All of these groups follow
certain patterns. Introduce "xy" and "xyz" templates to reduce
redundancy.
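
As a concrete illustration of such a pair, the suffix disambiguates the
memory operand size in AT&T syntax (operands chosen arbitrarily):

    vcvtpd2psx (%rax), %xmm0    # 128-bit (xmmword) memory source
    vcvtpd2psy (%rax), %xmm0    # 256-bit (ymmword) memory source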

To facilitate using a uniform name for both AVX and AVX512, further
introduce a means to purge a previously defined template: A standalone
<name> will be recognized to have this effect.

Note that in the course of the conversion VFPCLASSPH is properly split
to separate AT&T and Intel syntax forms, matching VFPCLASSP{S,D} and
yielding the intended "ambiguous operand size" diagnostic in Intel mode.
2022-08-16 09:14:39 +02:00
Jan Beulich
b9df5afb69 x86: template-ize vector packed byte/word integer insns
Many of the vector integer insns come in byte/word element pairs. Most
of these pairs follow certain encoding patterns. Introduce a "bw"
template to reduce redundancy.

Note that in the course of the conversion
- the AVX VPEXTRW template which is not being touched needs to remain
  ahead of the new "combined" ones, as (a) this should be tried first
  when matching insns against templates and (b) its Load attributes
  require it to be first,
- this adds a benign/meaningless IgnoreSize attribute to the memory form
  of PEXTRB; it didn't seem worth avoiding this.
2022-08-16 09:14:19 +02:00
Jan Beulich
d580ae4673 x86: re-order AVX512 S/G templates
The AVX2 gather ones are nicely grouped - do the same for the various
AVX512 scatter/gather ones. On the moved lines also convert EVex=<n> to
EVex<N>.
2022-08-16 09:13:12 +02:00
Jan Beulich
6473a592b4 x86: template-ize vector packed dword/qword integer insns
Many of the vector integer insns come in dword/qword element pairs. Most
of these pairs follow certain encoding patterns. Introduce a "dq"
template to reduce redundancy.

Note that in the course of the conversion
- a few otherwise untouched templates are moved, so they end up next to
  their siblings,
- drop an unhelpful Cpu64 from the GPR form of VPBROADCASTQ, matching
  what we already have for KMOVQ - the diagnostic is better this way for
  insns with multiple forms (i.e. the same Cpu64 attributes on {,V}MOVQ,
  {,V}PEXTRQ, and {,V}PINSRQ are useful to keep),
- this adds benign/meaningless IgnoreSize attributes to the GPR forms of
  KMOVD and VPBROADCASTD; it didn't seem worth avoiding this.
2022-08-16 09:12:30 +02:00
Jan Beulich
73d214b268 x86: template-ize packed/scalar vector floating point insns
The vast majority of vector FP insns comes in single/double pairs. Many
pairs follow certain encoding patterns. Introduce an "sd" template to
reduce redundancy. Similarly, to further cover similarities between
AVX512F and AVX512-FP16, introduce an "sdh" template.

For element-size Disp8 shift generalize i386-gen's broadcast size
determination, allowing Disp8MemShift to be specified without an operand
in the affected templates. While doing the adjustment also
eliminate an unhelpful (lost information) diagnostic combined with a use
after free in what is now get_element_size().

Note that in the course of the conversion
- the AVX512F form of VMOVUPD has a stray (leftover) Load attribute
  dropped,
- VMOVSH has a benign IgnoreSize added (the attribute is still strictly
  necessary for VMOVSD, and necessary for VMOVSS as long as we permit
  strange combinations like "-march=i286+avx"),
- VFPCLASSPH is properly split to separate AT&T and Intel syntax forms,
  matching VFPCLASSP{S,D}.
2022-08-16 09:11:59 +02:00
Jan Beulich
33b6a20af3 revert "x86: Also pass -P to $(CPP) when processing i386-opc.tbl"
This reverts commit 384f368958, which
broke i386-gen's emitting of diagnostics. As a replacement to address
the original issue of newer gcc no longer splicing lines when dropping
the line continuation backslashes, switch to using + as the line
continuation character, doing the line splicing in i386-gen.
2022-08-16 09:11:18 +02:00
Jan Beulich
298d6e70a8 x86-64: adjust MOVQ to/from SReg attributes
It is unclear to me why the corresponding MOV (no Q suffix) can be
issued without REX.W, but MOVQ has to have that prefix (bit). Add
NoRex64 and in exchange drop Size64.
2022-08-09 09:20:07 +02:00
Jan Beulich
87bf025228 x86: adjust MOVSD attributes
The non-SSE2AVX form of the SIMD variant of the instruction needlessly
has the (still multi-purpose) IgnoreSize attribute. All other similar
SSE2 insns use NoRex64 instead. Make this consistent, noting that the
SSE2AVX form can't have the same change made - there the memory operand
doesn't at the same time permit RegXMM (which the logic uses when deciding
whether a Q suffix is okay outside of 64-bit mode).
2022-08-09 09:19:36 +02:00
Jan Beulich
05a79382ed x86: fold AVX VGATHERDPD / VPGATHERDQ
While the other three variants each differ in attributes and hence can't
be folded, these two pairs actually can be (and were previously
overlooked). This effectively matches their AVX512VL counterparts, which
are also expressed as a single template.
2022-08-09 09:18:56 +02:00
Jan Beulich
3fbe5a0108 x86: allow use of broadcast with X/Y/Z-suffixed AVX512-FP16 insns
While the x/y/z suffix isn't necessary to use in this case, it is still
odd that these forms don't support broadcast (unlike their AVX512F /
AVX512DQ counterparts). The lack thereof can e.g. make macro-ized
programming more difficult.
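
A rough sketch of what this permits, taking VCVTDQ2PH as an illustrative
example (operands chosen arbitrarily):

    vcvtdq2phx (%rax){1to4}, %xmm0    # x-suffixed form, 32-bit broadcast
    vcvtdq2phy (%rax){1to8}, %xmm0    # y-suffixed form likewise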
2022-08-09 09:18:35 +02:00
Jan Beulich
747f6157e4 x86/Intel: split certain AVX512-FP16 VCVT*2PH templates
One more place where pre-existing templates should have been taken as a
basis: In Intel syntax we want to consistently issue an "ambiguous
operand size" error when a size-less memory operand is specified for an
insn where register use alone isn't sufficient for disambiguation.
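
The intended behaviour, sketched with VCVTDQ2PH as an illustrative choice
among the affected insns:

    .intel_syntax noprefix
    vcvtdq2ph xmm0, [rax]                # rejected: ambiguous operand size
    vcvtdq2ph xmm0, xmmword ptr [rax]    # 4 dword elements
    vcvtdq2ph xmm0, ymmword ptr [rax]    # 8 dword elements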
2022-08-09 09:18:04 +02:00
Jan Beulich
0aea480cd8 x86: properly mark i386-only insns
Just like all Size64 insns are marked Cpu64, all Size32 insns ought to
be marked Cpu386.
2022-08-03 09:00:39 +02:00
Jan Beulich
2c735193b8 x86: also use D for MOVBE
First of all rename the meanwhile misleading Opcode_SIMD_FloatD, as it
has also been used for KMOV* and BNDMOV. Then simplify the condition
selecting which form of "reversing" to use - except for the MOV to/from
control/debug/test registers all extended opcode space insns use bit 0
(rather than bit 1) to indicate the direction (from/to memory) of an
operation. With that, D can simply be set on the first of the two
templates, while the other can be dropped.
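
For reference, the pattern being relied upon, shown for MOVBE itself
(operands are illustrative; encodings are the documented ones):

    movbe  (%rax), %ebx    # load form:  0f 38 f0 /r
    movbe  %ebx, (%rax)    # store form: 0f 38 f1 /r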
2022-08-03 08:59:46 +02:00
Jan Beulich
d2dcf3908f x86: XOP shift insns don't really allow B suffix
By mistake it was permitted to be used from the very introduction of XOP
support.
2022-08-02 08:03:17 +02:00
Jan Beulich
e4971956ea x86: SKINIT with operand needs IgnoreSize
Without it in 16-bit mode a pointless operand size prefix would be
emitted.
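
A minimal illustration (the encoding shown assumes the documented 0f 01 de
opcode):

    .code16
    skinit %eax    # should assemble to plain 0f 01 de, with no 66 prefix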
2022-08-01 10:53:14 +02:00
Jan Beulich
e4e1fcce52 x86: drop stray NoRex64 from KeyLocker insns
It's entirely unclear why some of the KeyLocker insns had NoRex64 on
them - there's nothing here which could cause emission of REX.W (except
of course a user-specified "rex.w", which we ought to honor anyway).
2022-07-29 09:27:22 +02:00
Jan Beulich
ea09fe9259 x86: replace wrong attributes on VCVTDQ2PH{X,Y}
A standalone (without SAE) StaticRounding attribute is meaningless, and
indeed all other similar insns have ATTSyntax there instead. I can only
assume this was some strange copy-and-paste mistake.
2022-07-21 12:32:25 +02:00
Jan Beulich
987e8a90fa x86/Intel: correct AVX512F scatter insn element sizes
I clearly screwed up in 6ff00b5e12 ("x86/Intel: correct permitted
operand sizes for AVX512 scatter/gather") giving all AVX512F scatter
insns Dword element size. Update testcases (also their gather parts),
utilizing that there previously were two identical lines each (for no
apparent reason).
2022-07-21 12:32:04 +02:00
Jan Beulich
7e864bf71d x86: correct VMOVSH attributes
Both forms were missing VexW0 (thus allowing Evex.W=1 to be encoded by
suitable means, which would cause #UD). The memory operand form further
was using the wrong Masking value, thus allowing zeroing-masking to be
encoded for the store form (which would again cause #UD).
2022-07-18 11:20:44 +02:00
Jan Beulich
8bd915b770 x86: make D attribute usable for XOP and FMA4 insns
This once again allows reducing redundancy in (and the size of) the opcode
table.

Don't go as far as also making D work on the two 5-operand XOP insns:
This would significantly complicate the code, as there the first
(immediate) operand would need special treatment in several places.

Note that the .s suffix isn't being enabled to have any effect, since it
is deprecated. Similarly, neither {load} nor {store} pseudo prefixes
make sense here, as the respective operands are inputs (loads) only
anyway, regardless of order. Hence there is (as before) no way for the
programmer to request the alternative encoding to be used for register-
only insns.

Note further that it is always the first original template which is
retained (and altered), to make sure the same encoding as before is
used for register-only insns. This has the slightly odd (but pre-
existing) effect of XOP register-only insns having XOP.W clear, but FMA4
ones having VEX.W set.
2022-07-06 15:40:04 +02:00
Jan Beulich
a775efc84d x86: fold Disp32S and Disp32
The only case where 64-bit code uses non-sign-extended (can also be
considered zero-extended) displacements is when an address size override
is in place for a memory operand (i.e. particularly excluding
displacements of direct branches, which - if at all - are controlled by
operand size, and then are still sign-extended, just from 16 bits).
Hence the distinction in templates is unnecessary, allowing code to be
simplified in a number of places. The only place where logic becomes
more complicated is when signed-ness of relocations is determined in
output_disp().
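
An illustrative pair (register and displacement chosen arbitrarily):

    movl 0x12345678(%rax), %ebx    # 64-bit addressing, sign-extended disp32
    movl 0x12345678(%eax), %ebx    # address size override, zero-extended disp32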

The other caveat is that Disp64 cannot be specified anymore in an insn
template at the same time as Disp32. Unlike for non-64-bit mode,
templates don't specify displacements for both possible addressing
modes; the necessary adjustment to the expected ones has already been
done in match_template() anyway (but of course the logic there needs
tweaking now). Hence the single template so far doing so is split.
2022-07-04 08:32:50 +02:00
Jan Beulich
96016a2f00 x86: drop stray NoRex64 from XBEGIN
Presumably this being there was a result of taking CALL as a reference
when adding the RTM insns. But with No_qSuf the attribute has no effect.
2022-06-29 10:16:22 +02:00
Jan Beulich
cf665fee1d x86: re-work AVX512 embedded rounding / SAE
As a preparatory step to allowing proper non-operand forms of specifying
embedded rounding / SAE, convert the internal representation to non-
operand form. While retaining properties (and in a few cases perhaps
providing more meaningful diagnostics), this means doing away with a few
hundred standalone templates, thus - as a nice side effect - reducing
memory consumption / cache occupancy.
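
For reference, the operand-like source forms in question (register choices
are illustrative):

    vaddps {rz-sae}, %zmm1, %zmm2, %zmm3    # embedded rounding (toward zero)
    vcmpps $0, {sae}, %zmm1, %zmm2, %k1     # SAE only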
2022-05-27 08:48:09 +02:00
Jan Beulich
36b124126b x86: VFPCLASSSH is Evex.LLIG
This also was mistakenly flagged as Evex.128.
2022-04-27 11:08:57 +02:00
Jan Beulich
bb80cf5b42 x86: VCMPSH is Evex.LLIG
These were mistakenly flagged as Evex.128. Getting the LLIG status right
for insns allowing for SAE is a prereq for planned further work.
2022-04-19 09:25:25 +02:00
Jan Beulich
177e42f83d x86: drop stray CheckRegSize from VFPCLASSPH
Like VFPCLASSP{S,D} it has only a single operand allowing multiple
sizes, hence there are no pairs of operands to check for consistent
size.
2022-04-19 09:24:53 +02:00
Jan Beulich
c4d0963383 x86: also fold remaining multi-vector-size shift insns
By slightly relaxing the checking in operand_type_register_match() we
can fold the vector shift insns with an XMM source as well. While
strictly speaking an overlap in just one size (see the code comment) is
not enough (both operands could have multiple sizes with just a single
common one), this is good enough for all templates we have, or which
could sensibly / usefully appear (within the scope of the present
operand matching model).

Tightening this a little would be possible, but would require broadcast
related information to be passed into the function.
2022-03-18 10:55:45 +01:00
Jan Beulich
a548407ec2 x86: drop stray CheckRegSize from VEXTRACT{F,I}32X4
They have only a single operand allowing multiple sizes, hence there are
no pairs of operands to check for consistent size.
2022-03-18 10:55:20 +01:00
Jan Beulich
22c3694052 x86: fold certain AVX2 templates into their AVX counterparts
Like for AVX512VL we can make the handling of operand sizes a little
more flexible to allow reducing the number of templates we have.
2022-03-18 10:54:53 +01:00
Jan Beulich
ffb864501e x86: drop NoAVX insn attribute
To avoid issues like the one addressed by 6e3e5c9e41 ("x86: extend SSE
check to PCLMULQDQ, AES, and GFNI insns"), base the check on opcode
attributes and operand types.
2022-01-06 14:19:56 +01:00
Jan Beulich
f0db6fb6d9 x86: drop NoAVX from POPCNT
With the introduction of CpuPOPCNT the NoAVX attribute has become
meaningless for POPCNT.
2022-01-06 14:19:20 +01:00
Jan Beulich
274be12a22 x86: drop some "comm" template parameters
As already indicated in a remark when introducing these templates, the
"commutative" attribute is ignored for legacy encoding templates. Hence
it is possible to shorten a number of templates by specifying C directly
rather than through a template parameter. I think this helps readability
a bit.
2022-01-06 14:18:54 +01:00
Jan Beulich
edb7c8ec7e x86: templatize FMA insn templates
The operand ordering portion of the mnemonics repeats, causing a flurry
of almost identical templates. Abstract this out.
2022-01-06 14:18:23 +01:00
Alan Modra
a2c5833233 Update year range in copyright notice of binutils files
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.

The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bpf opcodes files.
2022-01-02 12:04:28 +10:30
Cui,Lili
0cc7872125 [PATCH 1/2] Enable Intel AVX512_FP16 instructions
Intel AVX512 FP16 instructions use maps 3, 5 and 6. Maps 5 and 6 use 3 bits
in the EVEX.mmm field (0b101, 0b110). Map 5 is for instructions that were FP32
in map 1 (0Fxx). Map 6 is for instructions that were FP32 in map 2 (0F38xx).
There are some exceptions to this rule. Some things in map 1 (0Fxx) with imm8
operands predated our current conventions; those instructions moved to map 3.
FP32 things in map 3 (0F3Axx) found new opcodes in map 3 for FP16 because
map 3 is very sparsely populated. Most of the FP16 instructions share opcodes and
prefix (EVEX.pp) bits with the related FP32 operations.

Intel AVX512 FP16 instructions have new displacement scaling rules; please
refer to the public software developer manual for detailed information.
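
An illustrative pair of FP32/FP16 counterparts (register operands chosen
arbitrarily, opcodes as documented):

    vaddps      %zmm1, %zmm2, %zmm3    # map 1: 0F 58
    vaddph      %zmm1, %zmm2, %zmm3    # map 5: opcode 58
    vfmadd132ps %zmm1, %zmm2, %zmm3    # map 2: 0F 38 98
    vfmadd132ph %zmm1, %zmm2, %zmm3    # map 6: opcode 98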

gas/

2021-08-05  Igor Tsimbalist  <igor.v.tsimbalist@intel.com>
            H.J. Lu  <hongjiu.lu@intel.com>
            Wei Xiao <wei3.xiao@intel.com>
            Lili Cui  <lili.cui@intel.com>

	* config/tc-i386.c (struct Broadcast_Operation): Adjust comment.
	(cpu_arch): Add .avx512_fp16.
	(cpu_noarch): Add noavx512_fp16.
	(pte): Add evexmap5 and evexmap6.
	(build_evex_prefix): Handle EVEXMAP5 and EVEXMAP6.
	(check_VecOperations): Handle {1to32}.
	(check_VecOperands): Handle CheckRegNumb.
	(check_word_reg): Handle Toqword.
	(i386_error): Add invalid_dest_and_src_register_set.
	(match_template): Handle invalid_dest_and_src_register_set.
	* doc/c-i386.texi: Document avx512_fp16, noavx512_fp16.

opcodes/

2021-08-05  Igor Tsimbalist  <igor.v.tsimbalist@intel.com>
            H.J. Lu  <hongjiu.lu@intel.com>
            Wei Xiao <wei3.xiao@intel.com>
            Lili Cui  <lili.cui@intel.com>

	* i386-dis.c (EXwScalarS): New.
	(EXxh): Ditto.
	(EXxhc): Ditto.
	(EXxmmqh): Ditto.
	(EXxmmqdh): Ditto.
	(EXEvexXwb): Ditto.
	(DistinctDest_Fixup): Ditto.
	(enum): Add xh_mode, evex_half_bcst_xmmqh_mode, evex_half_bcst_xmmqdh_mode
	and w_swap_mode.
	(enum): Add PREFIX_EVEX_0F3A08_W_0, PREFIX_EVEX_0F3A0A_W_0,
	PREFIX_EVEX_0F3A26, PREFIX_EVEX_0F3A27, PREFIX_EVEX_0F3A56,
	PREFIX_EVEX_0F3A57, PREFIX_EVEX_0F3A66, PREFIX_EVEX_0F3A67,
	PREFIX_EVEX_0F3AC2, PREFIX_EVEX_MAP5_10, PREFIX_EVEX_MAP5_11,
	PREFIX_EVEX_MAP5_1D, PREFIX_EVEX_MAP5_2A, PREFIX_EVEX_MAP5_2C,
	PREFIX_EVEX_MAP5_2D, PREFIX_EVEX_MAP5_2E, PREFIX_EVEX_MAP5_2F,
	PREFIX_EVEX_MAP5_51, PREFIX_EVEX_MAP5_58, PREFIX_EVEX_MAP5_59,
	PREFIX_EVEX_MAP5_5A_W_0, PREFIX_EVEX_MAP5_5A_W_1,
	PREFIX_EVEX_MAP5_5B_W_0, PREFIX_EVEX_MAP5_5B_W_1,
	PREFIX_EVEX_MAP5_5C, PREFIX_EVEX_MAP5_5D, PREFIX_EVEX_MAP5_5E,
	PREFIX_EVEX_MAP5_5F, PREFIX_EVEX_MAP5_78, PREFIX_EVEX_MAP5_79,
	PREFIX_EVEX_MAP5_7A, PREFIX_EVEX_MAP5_7B, PREFIX_EVEX_MAP5_7C,
	PREFIX_EVEX_MAP5_7D_W_0, PREFIX_EVEX_MAP6_13, PREFIX_EVEX_MAP6_56,
	PREFIX_EVEX_MAP6_57, PREFIX_EVEX_MAP6_D6, PREFIX_EVEX_MAP6_D7
	(enum): Add EVEX_MAP5 and EVEX_MAP6.
	(enum): Add EVEX_W_MAP5_5A, EVEX_W_MAP5_5B,
	EVEX_W_MAP5_78_P_0, EVEX_W_MAP5_78_P_2, EVEX_W_MAP5_79_P_0,
	EVEX_W_MAP5_79_P_2, EVEX_W_MAP5_7A_P_2, EVEX_W_MAP5_7A_P_3,
	EVEX_W_MAP5_7B_P_2, EVEX_W_MAP5_7C_P_0, EVEX_W_MAP5_7C_P_2,
	EVEX_W_MAP5_7D, EVEX_W_MAP6_13_P_0, EVEX_W_MAP6_13_P_2,
	(get_valid_dis386): Properly handle new instructions.
	(intel_operand_size): Handle new modes.
	(OP_E_memory): Ditto.
	(OP_EX): Ditto.
	* i386-dis-evex.h: Updated for AVX512_FP16.
	* i386-dis-evex-mod.h: Updated for AVX512_FP16.
	* i386-dis-evex-prefix.h: Updated for AVX512_FP16.
	* i386-dis-evex-reg.h : Updated for AVX512_FP16.
	* i386-dis-evex-w.h : Updated for AVX512_FP16.
	* i386-gen.c (cpu_flag_init): Add CPU_AVX512_FP16_FLAGS,
	and CPU_ANY_AVX512_FP16_FLAGS. Update CPU_ANY_AVX512F_FLAGS
	and CPU_ANY_AVX512BW_FLAGS.
	(cpu_flags): Add CpuAVX512_FP16.
	(opcode_modifiers): Add DistinctDest.
	* i386-opc.h (enum): (AVX512_FP16): New.
	(i386_opcode_modifier): Add reqdistinctreg.
	(i386_cpu_flags): Add cpuavx512_fp16.
	(EVEXMAP5): Defined as a macro.
	(EVEXMAP6): Ditto.
	* i386-opc.tbl: Add Intel AVX512_FP16 instructions.
	* i386-init.h: Regenerated.
	* i386-tbl.h: Ditto.
2021-08-05 21:03:41 +08:00
H.J. Lu
154b353f68 x86: Add int1 as one byte opcode 0xf1
Also change the x86 disassembler to disassemble 0xf1 as int1, instead of
icebp.
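
For reference (encodings per the opcode map):

    int1      # f1     - previously disassembled as icebp
    int3      # cc
    int $1    # cd 01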

gas/

	PR gas/28088
	* testsuite/gas/i386/opcode.s: Add int1.
	* testsuite/gas/i386/x86-64-opcode.s: Add int1, int3 and int.
	* testsuite/gas/i386/opcode-intel.d: Updated.
	* testsuite/gas/i386/opcode-suffix.d: Likewise.
	* testsuite/gas/i386/opcode.d: Likewise.
	* testsuite/gas/i386/x86-64-opcode.d: Likewise.

opcodes/

	PR gas/28088
	* i386-dis.c (dis386): Replace icebp with int1.
	* i386-opc.tbl: Add int1.
	* i386-tbl.h: Regenerate.
2021-07-14 14:29:02 -07:00
Jan Beulich
fe134c6569 x86: optimize LEA
Over the years I've seen a number of instances where people used

    lea     (%reg1), %reg2

or

    lea     symbol, %reg

despite the same thing being expressible via MOV. Since additionally
LEA often has restrictions on the ports it can be issued to, while MOV
typically gets dealt with simply by register renaming, transform to
MOV when possible (without growing opcode size and without altering
involved relocation types).
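
The substitutions involved, sketched with arbitrary registers and symbols
(the exact conditions under which they apply are the ones described above):

    lea     (%rdx), %rax    # can be emitted as: mov %rdx, %rax
    lea     symbol, %eax    # can be emitted as: mov $symbol, %eax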

Note that for Mach-O the new 64-bit testcases would fail (for
BFD_RELOC_X86_64_32S not having a representation), and hence get skipped
there.
2021-04-26 10:37:30 +02:00
Jan Beulich
bbe1eca622 x86: move some opcode table entries
For a long time there has been no need to keep together all
templates with identical mnemonics. Move the MOVQ and MOVABS ones next
to their MOV counterparts. Move the string forms of CMPSD and MOVSD next
to their CMPS / MOVS counterparts. Re-arrange what so far was the SSE3
section.

This makes it reasonably obvious that MONITOR/MWAIT aren't suitable to
be covered by CpuSSE3, but adjusting this is left for another time.
2021-03-29 12:06:43 +02:00
Jan Beulich
c8cad9d389 x86: VPSADBW's source operands are also commutative
In commit 79dec6b7ba ("x86-64: optimize certain commutative
VEX-encoded insns") I missed the fact that there being subtraction
involved here doesn't matter, as absolute differences get summed up.
2021-03-29 12:06:09 +02:00
Jan Beulich
5cdaf10025 x86: fold SSE2AVX and their base MMX/SSE templates
This way not only the overall (source) table size shrinks by quite a
bit and the risk of related templates going out of sync with one another
gets lowered, but also (dis)similarities between neighboring templates
become easier to spot.

Note that for certain SSE2AVX templates this results in benign attribute
changes:
- LDMXCSR and STMXCSR: NoAVX gets set,
- MOVMSKPS, PMOVMSKB, PEXTR{B,W} (register destination), and PINSR{B,W}
  (register source): IgnoreSize and NoRex64 get set,
- CVT{DQ,PS}2PD, CVTSD2SS, MOVMSKPD, MOVDDUP, PMOV{S,Z}X{BW,WD,DQ}, and
  ROUNDSD: NoRex64 gets set,
- CVTSS2SD, INSERTPS, PEXTRW (memory destination), PINSRW (memory
  source), and PMOV{S,Z}X{BD,WQ,BQ}: IgnoreSize gets set.
Similarly the "normal" (non-SSE2AVX)
- non-64-bit CVTS{I,S}2SD forms get NoRex64 set,
- CMP{EQ,ORD,NEQ,UNORD}{P,S}{S,D} forms get C set,
all again in a benign way.

The remaining differences in the generated table are due to re-ordering
of entries in the course of being folded into templates.
2021-03-29 12:05:25 +02:00
Jan Beulich
73e45eb208 x86: undo Prefix_0X<nn> use in opcode table
The table entries are more natural to read (and slightly shorter) when
the prefixes, like is the case for VEX/XOP/EVEX-encoded entries, are
specified as part of the opcode. This is particularly noticeable for
side-by-side legacy and SSE2AVX entries.

An implication is that we now need to use "unsigned long long" for the
initially parsed opcode in i386-gen. I don't expect this to be an issue.
2021-03-29 12:04:03 +02:00
Jan Beulich
c3344b626d x86-64: don't accept supposedly disabled MOVQ forms
While all of MMX, SSE, and SSE2 are included in "generic64", they can be
individually disabled. There are two MOVQ forms lacking respective
attributes. While the MMX one would get refused anyway (due to MMX
registers not recognized with .nommx), the assembler did happily accept
the SSE2 form. Add respective CPU settings to both, paralleling what the
MOVD counterparts have.
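
A rough illustration of the previously accepted case (assuming the .nosse2
sub-extension spelling for the .arch directive):

    .arch generic64
    .arch .nosse2
    movq %rax, %xmm0    # SSE2 form - now rejected, as intended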
2021-03-26 11:43:19 +01:00
Jan Beulich
c0e54661f7 x86: fix AMD Zen3 insns
For INVLPGB the operand count was wrong (besides %edx there's also %ecx
which is an input to the insn). In this case I see little sense in
retaining the bogus 2-operand template. Plus swapping of the operands
wasn't properly suppressed for Intel syntax.

For PVALIDATE, RMPADJUST, and RMPUPDATE bogus single operand templates
were specified. These get retained, as the address operand is the only
one really needed to express a non-default address size, but only for
compatibility reasons. Proper multi-operand insns get introduced and the
testcases get adjusted / extended accordingly.

While at it also drop the redundant definition of __amd64__ - we already
have x86_64 defined (or not) to distinguish 64-bit and non-64-bit cases.
2021-03-25 08:18:41 +01:00
Jan Beulich
9a182d0461 x86: derive opcode length from opcode value
In the majority of cases we can easily determine the length from the
encoding, irrespective of whether a prefix is specified there as well.
We further don't even need to record the value in the table entries, as
it's easy enough to determine it (without any guesswork, unless an insn
with major opcode 00 appeared that requires a 2nd opcode byte to be
specified explicitly) when installing the chosen template for further
processing.

Should an encoding appear which
- has a major opcode byte of 66, F3, or F2,
- requires a 2nd opcode byte to be specified explicitly,
- doesn't have a mandatory prefix
we'd need to convert all templates presently encoding a mandatory prefix
this way to the Prefix_0X<nn> model to eliminate the respective guessing
i386-gen does.
2021-03-24 08:33:33 +01:00
Jan Beulich
311845694b x86: don't use opcode_length to identify pseudo prefixes
This is in preparation of opcode_length going away as a field in the
templates. Identify pseudo prefixes by a base opcode of zero instead:
No real prefix has an opcode of zero. This at the same time allows
dropping a curious special case from i386-gen.

Since most attributes are identical for all pseudo prefixes, take the
opportunity and also template them.
2021-03-24 08:31:41 +01:00
Jan Beulich
441f6aca39 x86: split opcode prefix and opcode space representation
Commit 8b65b8953a ("x86: Remove the prefix byte from non-VEX/EVEX
base_opcode") used the opcodeprefix field for two distinct purposes. In
preparation of having VEX/XOP/EVEX and non-VEX templates become similar
in the representation of both encoding space and opcode prefixes, split
the field to have a separate one holding an insn's opcode space.
2021-03-23 17:08:39 +01:00
Jan Beulich
742732c7f0 x86: fold some prefix related attributes into a single one
RepPrefixOk, HLEPrefixOk, and NoTrackPrefixOk can't be specified
together, so can share an enum-like field. IsLockable can be inferred
from HLE setting and hence only needs specifying when neither of them
is present.
2021-03-09 08:54:32 +01:00
Jan Beulich
e93a3b27b2 x86-64: make SYSEXIT handling similar to SYSRET's
Despite SYSEXIT being an Intel-only insn in long mode, its behavior
there is similar to SYSRET's: Depending on REX.W execution continues in
either 64-bit or compatibility mode. Hence distinguishing by suffix is
as necessary here as it is there.
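
Sketching the resulting suffix handling in 64-bit code (byte values assume
the documented 0f 35 opcode):

    sysexitl    # 0f 35       - continue in compatibility mode
    sysexitq    # 48 0f 35    - REX.W set, continue in 64-bit mode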
2021-03-09 08:53:38 +01:00
Jan Beulich
75363b6d60 x86: infer operand count of templates
Having this count explicitly in the table is redundant and (even if just
slightly) disturbs clarity. Infer the count from the number of operands
actually found.

Also convert the "no operands" indicator from "{ 0 }" to just "{}", as
that (now) ends up being easier to parse.
2021-03-03 12:57:08 +01:00