V{BROADCAST,EXTRACT,INSERT}{F,I}128 and VROUND{P,S}{S,D} aren't promoted
to support EGPR in APX spec. Don't promote them out of APX spec. This
commit effectively reverted:
ec3babb8c1 x86/APX: V{BROADCAST,EXTRACT,INSERT}{F,I}128 can also be expressed
5a635f1f59 x86/APX: VROUND{P,S}{S,D} encodings require AVX512{F,VL}
eea4357967 x86/APX: VROUND{P,S}{S,D} can generally be encoded
gas/
PR gas/32171
* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: Add
V{BROADCAST,EXTRACT,INSERT}{F,I}128 tests with EGPR.
* testsuite/gas/i386/x86-64-apx-evex-promoted.s: Remove
V{BROADCAST,EXTRACT,INSERT}{F,I}128 and VROUND{P,S}{S,D} tests
with EGPR.
* testsuite/gas/i386/x86-64-apx-egpr-inval.l: Updated.
* testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: Likewise.
* testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Likewise.
* testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Likewise.
* testsuite/gas/i386/x86-64-apx-evex-promoted.d: Likewise.
opcodes/
PR gas/32171
* i386-opc.tbl: Remove V{BROADCAST,EXTRACT,INSERT}{F,I}128 and
VROUND{P,S}{S,D} entries with EGPR.
* i386-tbl.h: Regenerated.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
This leverages commit ("s390: Simplify (dis)assembly of insn operands
with const bits") to relax the operand constraints of the immediate
operand that contains the constant Z- or T-bit of the following extended
mnemonics:
risbgz, risbgnz, risbhgz, risblgz, rnsbgt, rosbgt, rxsbgt
Previously those instructions were the only ones where the assembler
on s390 restricted the specification of the subject I3/I4 operand values
exactly according to their specification to an unsigned 6- or 5-bit
unsigned integer. For any other instructions the assembler allows to
specify any operand value allowed by the instruction format, regardless
of whether the instruction specification is more restrictive.
Allow to specify the subject I3/I4 operand as unsigned 8-bit integer
with the constant operand bits being ORed during assembly.
Relax the instructions subject significant operand bit masks to only
consider the Z/T-bit as significant, so that the instructions get
disassembled as their *z or *t flavor regardless of whether any reserved
bits are set in addition to the Z/T-bit.
Adapt the rnsbg, rosbg, and rxsbg test cases not to inadvertently set
the T-bit in operand I3, as they otherwise get disassembled as their
rnsbgt, rosbgt, and rxsbgt counterpart.
This aligns GNU Assembler to LLVM Assembler.
opcodes/
* s390-opc.c (U6_18, U5_27, U6_26): Remove.
(INSTR_RIE_RRUUU2, INSTR_RIE_RRUUU3, INSTR_RIE_RRUUU4): Define
as INSTR_RIE_RRUUU while retaining insn fmt mask.
(MASK_RIE_RRUUU2, MASK_RIE_RRUUU3, MASK_RIE_RRUUU4): Treat only
Z/T-bit of I3/I4 operand as significant.
gas/testsuite/
* gas/s390/zarch-z10.s (rnsbg, rosbg, rxsbg): Do not set T-bit.
Reported-by: Dominik Steenken <dost@de.ibm.com>
Suggested-by: Ulrich Weigand <ulrich.weigand@de.ibm.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Simplify assembly and disassembly of extended mnemonics with operands
with constant ORed bits:
Their instruction template already contains the respective constant
operand bits, as they are significant to distinguish the extended from
their base mnemonic. Operands are ORed into the instruction template.
Therefore it is not necessary to OR the constant bits into the operand
value during assembly in s390_insert_operand.
Additionally the constant operand bits from the instruction template
can be used to mask them from the operand value during disassembly in
s390_print_insn_with_opcode. For now do so for non-length unsigned
integer operands only.
The separate instruction formats need to be retained, as their masks
differ, which is relevant during disassembly to distinguish the base
and extended mnemonics from each other.
This affects the following extended mnemonics:
- vfaebs, vfaehs, vfaefs
- vfaezb, vfaezh, vfaezf
- vfaezbs, vfaezhs, vfaezfs
- vstrcbs, vstrchs, vstrcfs
- vstrczb, vstrczh, vstrczf
- vstrczbs, vstrczhs, vstrczfs
- wcefb, wcdgb
- wcelfb, wcdlgb
- wcfeb, wcgdb
- wclfeb, wclgdb
- wfisb, wfidb, wfixb
- wledb, wflrd, wflrx
include/
* opcode/s390.h (S390_OPERAND_OR1, S390_OPERAND_OR2,
S390_OPERAND_OR8): Remove.
opcodes/
* s390-opc.c (U4_OR1_24, U4_OR2_24, U4_OR8_28): Remove.
(INSTR_VRR_VVV0U1, INSTR_VRR_VVV0U2, INSTR_VRR_VVV0U3): Define
as INSTR_VRR_VVV0U0 while retaining respective insn fmt mask.
(INSTR_VRR_VV0UU8): Define as INSTR_VRR_VV0UU while retaining
respective insn fmt mask.
(INSTR_VRR_VVVU0VB1, INSTR_VRR_VVVU0VB2, INSTR_VRR_VVVU0VB3):
Define as INSTR_VRR_VVVU0VB while retaining respective insn fmt
mask.
* s390-dis.c (s390_print_insn_with_opcode): Mask constant
operand bits set in insn template of non-length unsigned
integer operands.
gas/
* config/tc-s390.c (s390_insert_operand): Do not OR constant
operand value bits.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
EVEX.B4 is used only for GPR (or addressing of memory) operands. SIMD
registers encoded via ModR/M.rm (when ModR/M.mod == 3) have their top
bit in EVEX.X3. Supposedly (doc version 004) EVEX.B4 is ignored when
unused, hence also don't flag such encodings as invalid.
Along the lines of 2513312930 ("x86/APX: apply NDD-to-legacy
transformation to further CMOVcc forms") these can similarly be
converted to the shorter legacy-encoded CMOVcc.
In the patch, in order to support ymm rounding for AVX10.2, we derive
evex attribute for all cases instead of only for rc_none to encode U bit.
Also changed some bad_opcode return due to the share of U bit with APX_F.
gas/ChangeLog:
* config/tc-i386.c
(cpu_flags_match): Handle AVX10_2.
(build_evex_prefix): Handle U bit. Derive evex attribute
for all cases.
(check_VecOperands): Handle AVX10.2 and ymm roundings.
* doc/c-i386.texi: Document .avx10.2.
* testsuite/gas/i386/i386.exp: Run AVX10.2 tests.
* testsuite/gas/i386/x86-64.exp: Ditto.
* testsuite/gas/i386/avx10_2-rounding-intel.d: New test.
* testsuite/gas/i386/avx10_2-rounding-inval.l: Ditto.
* testsuite/gas/i386/avx10_2-rounding-inval.s: Ditto.
* testsuite/gas/i386/avx10_2-rounding.d: Ditto.
* testsuite/gas/i386/avx10_2-rounding.s: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-rounding-intel.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-rounding.d: Ditto.
* testsuite/gas/i386/x86-64-avx10_2-rounding.s: Ditto.
opcodes/ChangeLog:
* i386-dis.c (struct instr_info): Add U bit.
(get_valid_dis386): Handle U bit.
* i386-gen.c (isa_dependencies): Add AVX10.2.
(cpu_flags): Ditto.
* i386-init.h: Regenerated.
* i386-opc.h (CpuAVX10_2): New.
(i386_cpu_flags): Add cpuavx10_2.
* i386-opc.tbl: Add rounding to old entries which do not
permit rounding previously. Also eliminate the redundant
RegXMM for vcvtps2uqq.
* i386-tbl.h: Regenerated.
The special property really only applies to the "extended" byte regs
having legacy word/dword counterparts.
While touching involved code also drop redundant byte checks from a
conditional in establish_rex(): The other remaining RegRex64 uses only
exist on registers which can't be used as register operands anyway.
Hence RegRex64 as an attribute of a (valid) register operand implies
that it's a byte reg.
This patch supports Zcmp instruction 'cm.mva01s' and 'cm.mvsa01'.
All disassemble instructions use the sreg format.
Co-Authored by: Charlie Keaney <charlie.keaney@embecosm.com>
Co-Authored by: Mary Bennett <mary.bennett@embecosm.com>
Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Co-Authored by: Sinan Lin <sinan.lin@linux.alibaba.com>
Co-Authored by: Simon Cook <simon.cook@embecosm.com>
Co-Authored by: Shihua Liao <shihua@iscas.ac.cn>
Co-Authored by: Yulong Shi <yulong@iscas.ac.cn>
gas/ChangeLog:
PR 32036
* NEWS: Updated.
* config/tc-riscv.c (validate_riscv_insn): New operators.
(riscv_ip): Ditto.
* testsuite/gas/riscv/zcmp-mv.d: New test.
* testsuite/gas/riscv/zcmp-mv.s: New test.
include/ChangeLog:
PR 32036
* opcode/riscv-opc.h (MATCH_CM_MVA01S): New opcode.
(MASK_CM_MVA01S): New mask.
(MATCH_CM_MVSA01): New opcode.
(MASK_CM_MVSA01): New mask.
(DECLARE_INSN): New declarations.
* opcode/riscv.h (OP_MASK_SREG1): New mask.
(OP_SH_SREG1): New operand code.
(OP_MASK_SREG2): New mask.
(OP_SH_SREG2): New operand code.
(X_A0): New reg number.
(X_A1): Ditto.
(X_S7): Ditto.
(RISCV_SREG_0_7): New macro function.
opcodes/ChangeLog:
PR 32036
* riscv-dis.c (riscv_zcmp_get_sregno): New function.
(print_insn_args): New operators.
* riscv-opc.c (match_sreg1_not_eq_sreg2): New match function.
While 919b75f7e2 ("Trailing space in opcodes/ generated files") took
care of permanently zapping trailing whitespace, 5471128011
("opcodes: cris: move desc & opc files from sim/") then didn't enhance
the newly added code accordingly.
According to the description of the state machine, the expectation
appears to be that (leaving aside labels) any insn mnemonic or
directive would be followed by a comma separated list of operands. That
may have been true very long ago, but the latest with the advent of more
elaborate macros this isn't rhe case anymore. Neither macro parameters
in macro definitions nor macro arguments in macro invocations are
required to be separated by commas. Hence whitespace serves a crucial
role there. Plus even without "real" macros issues exist, in e.g.
.irp n, ...
insn\n\(suffix) operand1, operand2
.endr
Whitespace following the closing parenthesis would have been removed
(ahead of even processing the .irp), as the "opcode" was deemed to have
ended earlier already.
Therefore, squash the distinction between "opcode" and operands, i.e.
fold state 10 back into state 3. Also drop most of the distinction
between "symbol chars" and "relatively normal" ones. Not entirely
unexpectedly this results in the need to skip whitespace in a few more
places in arch-specific code (and quite likely more changes are needed
for insn forms not covered by the testsuite).
As a result the D10V special case is no longer necessary.
In config/tc-sparc.c also move a comment to be next to the code being
commented.
In opcodes/cgen-asm.in some further cleanup is done, following the local
var adjustments.
The first operand is a general register, not an fp register;
the third operand is encoded into RS2, not RS3;
the second operand must match the destination operand.
The `zext.h` is zero-extend halfword instruction that belongs to Zbb.
Currently `zext.h` falls back to 2 shifts if Zbb is not enabled. However, the
encoding and operation is a special case of `pack/packw rd, rs1, rs2`, which
belongs to Zbkb. The instructions pack the low halves of rs1 and rs2 into rd.
When rs2 is zero (x0), they behave like zero-extend instruction, and the
encoding are exactly the same as zext.h.
Thus we can map `zext.h` to `pack` or `packw` (rv64) if Zbkb is enabled,
instead of 2 shifts. This reduces one instruction.
This patch does this by making `zext.h` also available for Zbkb.
opcodes/
* riscv-opc.c (riscv_opcodes): Update `zext.h` entries to use
`ZBB_OR_ZBKB` instruction class.
gas/
* testsuite/gas/riscv/zext-to-pack.s: Add test for mapping zext to
pack/packw encoding.
* testsuite/gas/riscv/zext-to-pack-encoding.d: Likewise.
* testsuite/gas/riscv/zext-to-packw-encoding.d: Likewise.
Add the MT ASE instruction operand types and encodings to the microMIPS
opcode table and enable the assembly of these instructions in GAS from
MIPSr2 onwards. Update the binutils and GAS testsuites accordingly.
References:
"MIPS Architecture for Programmers, Volume IV-f: The MIPS MT Module for
the microMIPS32 Architecture", MIPS Technologies, Inc., Document Number:
MD00768, Revision 1.12, July 16, 2013
Co-Authored-By: Maciej W. Rozycki <macro@redhat.com>
This corrects objdump -d -m armv8.1-m.main output for a testcase found
by oss-fuzz, .inst 0xee2fee79, which hits an assertion.
Obviously the switch case constants should be binary, not hex.
Correcting that is enough to cure this assertion, but I don't see any
point in singling out the invalid case 0b10. In fact, it is just plain
wrong to print "undefined instruction: size equals zero undefined
instruction: size equals two".
I also don't see the need for defensive programming here as is done
elsewhere in checking that "value" is in range before indexing
mve_vec_sizename. There is exactly one MVE_VSHLL_T2 entry in
mve_opcodes. It is easy to verify that "value" is only two bits.
A number of instructions in the regular MIPS opcode table are assembly
idioms for the MT thread context move MFTR and MTTR instructions, so
mark them as aliases accordingly. Add suitable test cases, which also
cover the PAUSE assembly idiom.
PAUSE is an assembly idiom for 'sll $0,$0,5', so mark it as an alias in
the regular MIPS opcode table, matching the microMIPS opcode table. A
test case will be supplied separately.
A number of coprocessor move encodings have been randomly sprinkled over
the regular MIPS and microMIPS opcode tables rather than where they'd be
expected following the alphabetic order. Fix the ordering, taking into
account precedence where it has to be observed for correct disassembly.
No functional change.
Make AL a shorthand for INSN2_ALIAS with the regular MIPS and microMIPS
opcode tables, just as with the MIPS16 opcode table, and use it
throughout. No functional change.
The semantics of the regular MIPS "+t" operand code is exactly the same
as that of the "E" operand code, so replace the former with the latter
in the single MFTC0 instruction with implicit 'sel' == 0 encoding where
it's used, matching the encoding with explicit 'sel' as well as other
instructions.
We print MFTR and MTTR instructions' thread context register operand in
disassembly using the ABI name the register number would correspond to
should the targeted register be a general-purpose register.
However in most cases it is wrong, because general-purpose registers are
only referred when the 'u' and 'sel' operands are 1 and 0 respectively.
And even in these cases the MFGPR and MTGPR aliases take precedence over
the corresponding generic instruction encodings, so you won't see the
valid case to normally trigger.
Conversely decoding the thread context register operand numerically is
always valid, so switch to using it. Adjust test coverage accordingly.
The "-x" operand type is used for the reverse encoding of the BOVC and
BNVC instructions, where 'rs' and 'rt' have been supplied as the second
and the first operand respectively rather than the order the instruction
expects.
In this case we require the register associated with the "-x" operand to
have a higher number than the register associated with the preceding "t"
operand, which precludes the use of $0. The case where 'rs' and 'rt'
both refer to the same register is handled by the straight encoding of
the BOVC and BNVC instructions, which come in the opcode table ahead of
the corresponding reverse encoding.
Therefore clear the ZERO_OK flag for the "-x" operand. No need for an
extra test case as the encodings involved are already covered by "r6"
and its associated GAS tests.
Enforce some checks on the newly added subclass flags:
- If a subclass is set of one insn of an iclass, every insn of that
iclass must have non-zero subclass field.
- For all other iclasses, the subclass bits are zero for all insns.
include/
* opcode/aarch64.h (enum aarch64_insn_class): Identify the
maximum iclass enum value.
opcodes/
* aarch64-gen.c (iclass_has_subclasses_p): New array of bool.
(read_table): Enforce checks on subclass flags.
For detecting irg, add a subclass to identify it in the set of
instructions of iclass dp_2src.
opcodes/
* aarch64-tbl.h: Add subclass flag F_DP_TAG_ONLY for irg insn.
Use the two new subclass flags: F_BRANCH_CALL, F_BRANCH_RET, to indicate
call to and return from subroutine respectively.
opcodes/
* aarch64-tbl.h: Use the new F_BRANCH_* flags.
Use the three new subclass flags: F_ARITH_ADD, F_ARITH_SUB,
F_ARITH_MOV, to indicate add, sub and mov ops respectively.
These flags for subclasses will later be used for SCFI purposes to
create appropriate ginsns. At this time, only those iclasses relevant
to SCFI have the new subclass flags specified.
For addg and subg insns, F_SUBCLASS_OTHER is more suitable because these
operations do more than just simple add or sub.
opcodes/
* aarch64-tbl.h: Use the new F_ARITH_* flags.
The existing iclass information tells us the general shape and purpose
of the instructions. In some cases, however, we need to further disect
the iclass on the basis of other finer-grain information. E.g., for the
purpose of SCFI, we need to know whether a given insn with iclass
of ldst_* is a load or a store.
At the moment, specify subclasses for only those iclasses relevant to
SCFI: ldst_imm9, ldst_pos, ldstpair_indexed, ldstpair_off and
ldstnapair_offs.
Some insns are best tagged with F_SUBCLASS_OTHER rather than F_LDST_LOAD
or F_LDST_STORE:
- stg* ops (as they store tag only),
- prfm,
- ldpsw, ldrsw (32-bit loads with signed extended value. Not useful
for restore operations in context of SCFI.)
- Use F_SUBCLASS_OTHER for all QL_LDST_R8 and QL_LDST_R16 operands.
Also use F_SUBLASS_OTHER for strb/ldrb, strh/ldrh opcodes.
These are not full loads and stores and cannot be allowed for
register save / restore for the purpose of SCFI.
opcodes/
* aarch64-tbl.h: Use the new F_LDST_* flags.
This patch adds support for following sme2.1 zero instructions and
the spec is available here [1].
1. ZERO (single-vector).
2. ZERO (double-vector).
3. ZERO (quad-vector).
The VECTOR GROUP symbols VGx2 and VGx4 are optional for the assembler
for most of the sme and sve instructions. But for few of the sme2.1
zero instruction variants VECTOR GROUP symbols VGx2 and VGx4 are mandatory.
To address this a bit "F_VG_REQ" is introduced in this patch, on setting
F_VG_REQ bit in flags of aarch64_opcode forces the assembler to accept
instruction operand only having VECTOR GROUP symbols.
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
This patch adds support for following sme2.1 luti2 and luti4 instructions, spec is
available here [1]
1. LUTI2 (two registers) strided.
2. LUTI2 (four registers) strided.
3. LUTI4 (two registers) strided.
4. LUTI4 (four registers) strided.
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions?lang=en
This patch adds support for followign SVE2p1 instruction, spec is available here [1].
1. PMOV (to vector)
2. PMOV (to predicate)
Both pmov (to vector) and pmov (to predicate) have destination scalable vector
register and source scalable vector register respectively as an operand with no
suffix and optional index. To handle this case we have added 8 new operands in
this patch.
AARCH64_OPND_SVE_Zn0_INDEX, /* Zn[index], bits [9:5]. */
AARCH64_OPND_SVE_Zn1_17_INDEX, /* Zn[index], bits [9:5,17]. */
AARCH64_OPND_SVE_Zn2_18_INDEX, /* Zn[index], bits [9:5,18:17]. */
AARCH64_OPND_SVE_Zn3_22_INDEX, /* Zn[index], bits [9:5,18:17,22]. */
AARCH64_OPND_SVE_Zd0_INDEX, /* Zn[index], bits [4:0]. */
AARCH64_OPND_SVE_Zd1_17_INDEX, /* Zn[index], bits [4:0,17]. */
AARCH64_OPND_SVE_Zd2_18_INDEX, /* Zn[index], bits [4:0,18:17]. */
AARCH64_OPND_SVE_Zd3_22_INDEX, /* Zn[index], bits [4:0,18:17,22]. */
Since the index of the <Zd> operand is optional, the index part is
dropped in disassembly in both the cases of "no index" or "zero index".
As per spec: PMOV <Zd>{[<imm>]}, <Pn>.D
PMOV <Pn>.D, <Zd>{[<imm>]}
Example1:
Assembly: pmov z5[0], p6.d
Disassembly: pmov z5, p6.d
Assembly: pmov z5, p6.d
Disassembly: pmov z5, p6.d
Example2:
Assembly: pmov p4.b, z5[0]
Disassembly: pmov p4.b, z5
Assembly: pmov p4.b, z5
Disassembly: pmov p4.b, z5
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions?lang=en