There are no legacy ldind nor ldabs BPF instructions with BPF_SIZE_DW.
For some reason we were (incorrectly) supporting these. This patch
updates the opcodes so the instructions get removed and modifies the
GAS manual and testsuite accordingly.
See discussion at
https://lore.kernel.org/bpf/110aad7a-f8a3-46ed-9fda-2f8ee54dcb89@linux.dev
Tested in bpf-uknonwn-none target, x86-64-linux-gnu host.
include/ChangeLog:
2024-01-29 Jose E. Marchesi <jose.marchesi@oracle.com>
* opcode/bpf.h (enum bpf_insn_id): Remove BPF_INSN_LDINDDW and
BPF_INSN_LDABSDW instructions.
opcodes/ChangeLog:
2024-01-29 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Remove BPF_INSN_LDINDDW and
BPF_INSN_LDABSDW instructions.
gas/ChangeLog:
2024-01-29 Jose E. Marchesi <jose.marchesi@oracle.com>
* doc/c-bpf.texi (BPF Instructions): There is no indirect 64-bit
load instruction.
(BPF Instructions): There is no absolute 64-bit load instruction.
* testsuite/gas/bpf/mem.s: Update test accordingly.
* testsuite/gas/bpf/mem-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-be.d: Likewise.
* testsuite/gas/bpf/mem-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-pseudoc.s: Likewise.
* testsuite/gas/bpf/mem.d: Likewise.
* testsuite/gas/bpf/mem.s: Likewise.
SHA512 instructions were added to the architecture at the same time as SHA3
instructions, but later than the SHA1 and SHA256 instructions. Furthermore,
implementations must support either both or neither of the SHA512 and SHA3
instruction sets. However, SHA512 instructions were originally (and
incorrectly) added to Binutils under the +sha2 flag.
This patch moves SHA512 instructions under the +sha3 flag, which matches the
architecture constraints and existing GCC and LLVM behaviour.
Re-using the entire VEX decode hierarchy for the respective major opcode
has led to those two also being decoded as-if valid. Follow the earlier
USE_X86_64_EVEX_{PFX,W}_TABLE approach to avoid this happening.
As suggested during review already, all such entries have their first
slot as Bad_Opcode, so by adding two more enumerators we can avoid doing
that decode step altogether.
With identical source and destination it can be covered by the NDD-to-
legacy conversion logic as well, even if in this case the original insn
doesn't use an NDD encoding. The size savings are even better here, for
the replacement (BSWAP) not having a ModR/M byte.
GCC 14 will detect when the size and count arguments of calloc are
swapped.
binutils-gdb/opcodes/tic4x-dis.c: In function ‘tic4x_disassemble’:
binutils-gdb/opcodes/tic4x-dis.c:710:32: error: ‘xcalloc’ sizes specified with ‘sizeof’ in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
710 | optab = xcalloc (sizeof (tic4x_inst_t *), (1 << TIC4X_HASH_SIZE));
| ^~~~~~~~~~~~
binutils-gdb/opcodes/tic4x-dis.c:710:32: note: earlier argument should specify number of elements, later size of each element
binutils-gdb/opcodes/tic4x-dis.c:712:40: error: ‘xcalloc’ sizes specified with ‘sizeof’ in the earlier argument and not in the later argument [-Werror=calloc-transposed-args]
712 | optab_special = xcalloc (sizeof (tic4x_inst_t *), TIC4X_SPESOP_SIZE);
| ^~~~~~~~~~~~
binutils-gdb/opcodes/tic4x-dis.c:712:40: note: earlier argument should specify number of elements, later size of each element
opcodes/ChangeLog:
* /tic4x-dis.c (tic4x_disassemble): Swap size and count xcalloc
arguments.
VRNDSCALE{P,S}{S,D} is the AVX512 generalization of these AVX insns. As
long as the immediate has the top 4 bits clear, they are equivalent to
the earlier VEX-encoded insns, and hence can be used to permit use of
eGPR-s in the memory operand. Since this is the normal way of using
these insns, also alter the resulting diagnostic to complain about the
immediate, not the eGPR use.
When there's a suitably disambiguating register operand, suffixes are
generally omitted (unless in suffix-always mode). All NDD insns have a
suitable register operand, so they shouldn't have suffixes by default.
Along with the relevant unit-tests, this adds the following rcpc3
instructions:
STL1 { <Vt>.D }[<index>], [<Xn|SP>]
LDAP1 { <Vt>.D }[<index>], [<Xn|SP>]
LDAPUR <Bt>, [<Xn|SP>{, #<simm>}]
LDAPUR <Ht>, [<Xn|SP>{, #<simm>}]
LDAPUR <St>, [<Xn|SP>{, #<simm>}]
LDAPUR <Dt>, [<Xn|SP>{, #<simm>}]
LDAPUR <Qt>, [<Xn|SP>{, #<simm>}]
STLUR <Bt>, [<Xn|SP>{, #<simm>}]
STLUR <Ht>, [<Xn|SP>{, #<simm>}]
STLUR <St>, [<Xn|SP>{, #<simm>}]
STLUR <Dt>, [<Xn|SP>{, #<simm>}]
STLUR <Qt>, [<Xn|SP>{, #<simm>}]
with `#<simm>' taking on a signed 8-bit integer value in the range
[-256,255] and `index' the values 0 or 1.
Co-authored-by: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Given the introduction of the new address operand types for rcpc3
instructions, this patch adds the necessary logic to teach
`general_constraint_met_p` how to proper handle these.
The particular choices of address indexing, along with their encoding
for RCPC3 instructions lead to the requirement of a new set of operand
descriptions, along with the relevant inserter/extractor set.
That is, for the integer load/stores, there is only a single valid
indexing offset quantity and offset mode is allowed - The value is
always equivalent to the amount of data read/stored by the
operation and the offset is post-indexed for Load-Acquire RCpc, and
pre-indexed with writeback for Store-Release insns.
This indexing quantity/mode pair is selected by the setting of a
single bit in the instruction. To represent these insns, we add the
following operand types:
- AARCH64_OPND_RCPC3_ADDR_OPT_POSTIND
- AARCH64_OPND_RCPC3_ADDR_OPT_PREIND_WB
In the case of loads and stores involving SIMD/FP registers, the
optional offset is encoded as an 8-bit signed immediate, but neither
post-indexing or pre-indexing with writeback is available. This
created the need for an operand type similar to
AARCH64_OPND_ADDR_OFFSET, with the difference that FLD_index should
not be checked.
We thus introduce the AARCH64_OPND_RCPC3_ADDR_OFFSET operand, a
variant of AARCH64_OPND_ADDR_OFFSET, w/o the FLD_index bitfield.
Beyond the need to encode any registers involved in data transfer and
the address base register for load/stores, it is necessary to specify
the data register addressing mode and whether the address register is
to be pre/post-indexed, whereby loads may be post-indexed and stores
pre-indexed with write-back.
The use of a single bit to specify both the indexing mode and indexing
value requires a novel function be written to accommodate this for
address operand insertion in assembly and another for extraction in
disassembly, along with the definition of two insn fields for use with
these instructions.
This therefore defines the following functions:
- aarch64_ins_rcpc3_addr_opt_offset
- aarch64_ins_rcpc3_addr_offset
- aarch64_ext_rcpc3_addr_opt_offset
- aarch64_ext_rcpc3_addr_offset
It extends the `do_special_{encoding|decoding}' functions and defines
two rcpc3 instruction fields:
- FLD_opc2
- FLD_rcpc3_size
The allowed immediate offsets in integer rcpc3 load store instructions
are not encoded explicitly in the instruction itself, being rather
implicitly equivalent to the amount of data loaded/stored by the
instruction.
This leads to the requirement that this quantity be calculated based on
the number of registers involved in the transfer, either as data
source or destination registers and their respective qualifiers.
This is done via `calc_ldst_datasize (const aarch64_opnd_info *opnds)'
implemented here, using a cumulative sum of qualifier sizes preceding
the address operand in the OPNDS operand list argument.
Indicating the presence of the Armv8.2-a feature adding further
support for the Release Consistency Model, the `+rcpc3' architectural
extension flag is added to the list of possible `-march' options in
Binutils, together with the necessary macro for encoding rcpc3
instructions.
There are some tlbi operations that don't have a corresponding tlbip operation,
but we were incorrectly using the same list for both. Add the missing tlbi
*nxs operations, and use the F_REG_128 flag to filter tlbi operations that
don't have a tlbip analogue. For increased clarity, I have also used a macro
to reduce duplication between the 'nxs' and non-'nxs' variants, and added a
test to verify that no invalid combinations are accepted.
Additionally, fix two missing checks for AARCH64_OPND_SYSREG_TLBIP that were
preventing disassembly of tlbip instructions.
Hi,
This patch add support for SVE2.1 instructions ld1q,
ld2q, ld3q and ld4q, st1q, st2q, st3q and st4q.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
Hi,
This patch add support for SVE2.1 instruction faddqv,
fmaxnmqv, fmaxqv, fminnmqv and fminqv.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
Hi,
This patch add support for SVE2.1 instruction dupq, eorqv and extq.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
Hi,
This patch add support for FEAT_SVE2p1 (SVE2.1 Extension) feature
along with +sve2p1 optional flag to enabe this feature.
Also support for following SVE2p1 instructions is added
addqv, andqv, smaxqv, sminqv, umaxqv, uminqv and uminqv.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
Hi,
This patch add support for FEAT_SME2p1 and "movaz" instructions
along with the optional flag +sme2p1.
Following "movaz" instructions are add:
Move and zero two ZA tile slices to vector registers.
Move and zero four ZA tile slices to vector registers.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
Hi,
This patch add support for SVE2.1 and SME2.1 non-widening BFloat16
(FEAT_B16B16) instructions.
Following instructions predicated, unpredicated and indexed
variants are added in this patch.
bfadd, bfclamp, bfmax bfmaxnm, bfmin,bfminnm,
bfmla,bfmls,bfmul and bfsub.
Regression testing for aarch64-none-elf target and found no regressions.
Ok for binutils-master?
Regards,
Srinath.
The ginsn representation keeps the DWARF register number of the
operands. The API ginsn_dw2_regnum relies on the the relative ordering
of these register entries in the table. Add a comment to make it clear.
opcodes/
* i386-reg.tbl: Add a comment.
Some x86 instructions affect the stack pointer implicitly. Add a new
operand constraint to reflect this. This will be useful for SCFI
implmentation to ensure its correctness.
Mark all push, pop, call, ret, enter, leave, INT, iret instructions.
opcodes/
* i386-gen.c: Update opcode_modifiers.
* i386-opc.h: Add a new constraint.
* i386-opc.tbl: Update the affected instructions.
* i386-tbl.h: Regenerated.
Rex2 is currently an operand constraint. For the upcoming SCFI
implementation in GAS, we need to identify operations which implicitly
update the stack pointer. An operand constraint enumerator for implicit
stack op seems more appropriate than an attribute. However, two opcodes
currently necessitate both Rex2 and an implicit stack op marker; this
prompts revisiting the current representations a bit.
Make Rex2 a standalone attribute, so that later a new operand constraint
may be added for IMPLICIT_STACK_OP.
ChangeLog:
* gas/config/tc-i386.c (is_apx_rex2_encoding): Update the check.
* opcodes/i386-gen.c: Add a new BITFIELD for Rex2.
* opcodes/i386-opc.h (REX2_REQUIRED): Remove.
* opcodes/i386-opc.tbl: Remove Rex2 operand constraint.
* opcodes/i386-tbl.h: Regenerated.
Additionally, change FEAT_XS tlbi variants to be gated on "+xs" instead of
"+d128". This is an incremental improvement; there are still some FEAT_XS tlbi
variants that are gated incorrectly or missing entirely.
Due to the formatted output of objdump, some instructions
that do not require output operands (such as nop/ret) will
have extra spaces added after them.
Determine whether to output operands through the format
of opcodes. When opc->format is an empty string, no extra
spaces are output.
This patch adds support for the new AArch64 system registers that are part of the following extensions:
* FEAT_DEBUGv8p9
* FEAT_PMUv3p9
* FEAT_PMUv3_SS
* FEAT_PMUv3_ICNTR
* FEAT_SEBEP
As already indicated during review, we can't get away without certain
adjustments here: Without these, respective {evex}-prefixed insns are
assembled to APX encodings even when APX_F is turned off.
While there also extend the respective comment in the opcode table, to
explain why this construct is used.
This patch adds support for FEAT_THE doubleword and quadword instructions.
doubleword insturctions are enabled by "+the" flag whereas quadword
instructions are enabled on passing both "+the and +d128" flags.
Support for following sets of instructions is added in this patch.
Read check write compare and swap doubleword:
(rcwcas, rcwcasa, rcwcasal, rcwcasl)
Read check write compare and swap quadword:
(rcwcasp,rcwcaspa, rcwcaspal, rcwcaspl)
Read check write software compare and swap doubleword:
(rcwscas, rcwscasa, rcwscasal, rcwscasl)
Read check write software compare and swap quadword:
(rcwscasp, rcwscaspa, rcwscaspal, rcwscaspl)
Read check write atomic bit clear on doubleword:
(rcwclr, rcwclra, rcwclral, rcwclrl)
Read check write atomic bit clear on quadword:
(rcwclrp, rcwclrpa, rcwclrpal, rcwclrpl)
Read check write software atomic bit clear on doubleword:
(rcwsclr, rcwsclra, rcwsclral, rcwsclrl)
Read check write software atomic bit clear on quadword:
(rcwsclrp,rcwsclrpa, rcwsclrpal,rcwsclrpl)
Read check write atomic bit set on doubleword:
(rcwset,rcwseta, rcwsetal,rcwsetl)
Read check write atomic bit set on quadword:
(rcwsetp,rcwsetpa,rcwsetpal,rcwsetpl)
Read check write software atomic bit set on doubleword:
(rcwsset,rcwsseta,rcwssetal,rcwssetl)
Read check write software atomic bit set on quadword:
(rcwssetp,rcwssetpa,rcwssetpal,rcwssetpl)
Read check write swap doubleword:
(rcwswp,rcwswpa,rcwswpal,rcwswpl)
Read check write swap quadword:
(rcwswpp,rcwswppa, rcwswppal,rcwswppl)
Read check write software swap doubleword:
(rcwsswp,rcwsswpa,rcwsswpal,rcwsswpl)
Read check write software swap quadword:
(rcwsswpp,rcwsswppa,rcwsswppal,rcwsswppl)
With the addition of 128-bit system registers to the Arm architecture
starting with Armv9.4-a, a mechanism for manipulating their contents
is introduced with the `msrr' and `mrrs' instruction pair.
These move values from one such 128-bit system register into a pair of
contiguous general-purpose registers and vice-versa, as for example:
msrr ttlb0_el1, x0, x1
mrrs x0, x1, ttlb0_el1
This patch adds the necessary support for these instructions, adding
checks for system-register width by defining a new operand type in the
form of `AARCH64_OPND_SYSREG128' and the `aarch64_sys_reg_128bit_p'
predicate, responsible for checking whether the requested system
register table entry is marked as implemented in the 128-bit mode via
the F_REG_128 flag.
The 2020 Architecture Extensions to the Arm A-profile architecture
added FEAT_XS, the XS attribute feature, giving cores the ability to
identify devices which can be subject to long response delays. TLB
invalidate (TLBI) operations and barriers can also be annotated with
this attribute[1].
With the introduction of the 128-bit translation tables with the
Armv8.9-a/Armv9.4-a Translation Hardening Extension, a series of new
TLB invalidate operations are introduced which make use of this
extension. These are added to aarch64_sys_regs_tlbi[] for use
with the `tlbip' insn.
[1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-developments-2020
The addition of 128-bit page table descriptors and, with it, the
addition of 128-bit system registers for these means that special
"invalidate translation table entry" instructions are needed to cope
with the new 128-bit model. This is introduced with the `tlbpi'
instruction, implemented here.
Some 128-bit system operations (mrrs, msrr, tlbip, and sysp) take two
qualified operands and one of unqualified type (e.g. system register
name, tlbip operation). This creates the need for adequate qualifiers
to handle this.
This patch therefore introduces the `QL_SRC_X2' and `QL_DST_X2' qualifier
specifiers, which expand to `QLF3(NIL,X,X)' and `QLF3(X,X,NIL)',
respectively.
While CRn and CRm fields in the SYSP instruction are 4-bit wide and
are thus able to accommodate values in the range 0-15, the
specifications for the SYSP instructions limit their ranges to 8-9 for
CRm and 0-7 in the case of CRn.
This led to the need to signal in some way to the operand parser that
a given operand is under special restrictions regarding its use. This
is done via the new `F_OPD_NARROW' flag, indicating a narrowing in the
range of operand values for fields in the instruction tagged with the
flag.
The flag is then used in `parse_operands' when the instruction is
assembled, but needs not be taken into consideration during
disassembly.
Mirroring the use of the `sys' - System Instruction assembly
instruction, this implements its 128-bit counterpart, `sysp'.
This optionally takes two contiguous general-purpose registers
starting at an even number or, when these are omitted, by default
sets both of these to xzr.
Syntax:
sysp #<op1>, <Cn>, <Cm>, #<op2>{, <Xt1>, <Xt2>}
Analysis of the allowed operand values for `sysp' and `tlbip' reveals
a significant departure from the allowed behavior for operand register
pairs (hitherto labeled AARCH64_OPND_PAIRREG) observed for other
insns in this category.
For instructions `casp', `mrrs' and `msrr' the register pair must
always start at an even index and the second register in the pair is
the index + 1. This precludes the use of xzr as the first register,
given it corresponds to register number 31.
This is different in the case of `sysp' and `tlbip', however. These
allow the use of xzr and, where the first operand in the pair is
omitted, this is the default value assigned to it. When this
operand is assigned xzr, it is expected that the second operand will
likewise take on a value of xzr.
These two instructions therefore "break" two rules of register pairs:
* The first of the two registers is odd-numbered.
* The index of the second register is equal to that of the first,
and not n+1.
To allow for this departure from hitherto standard behavior, we
extend the functionality of the assembler by defining an extension of
the AARCH64_OPND_PAIRREG, called AARCH64_OPND_PAIRREG_OR_XZR.
It is used in defining `sysp' and `tlbip' and allows
`operand_general_constraint_met_p' to allow the pair to both take on
the value of xzr.
Given the introduction of the new Armv9.4-a `sysp' insn using the
following syntax:
sysp #<op1>, <Cn>, <Cm>, #<op2>{, <Xt1>, <Xt2>}
and by extension the need to encode 6 assembly operands, extend
Binutils to handle instructions taking 6 operands, up from a previous
maximum of 5.
Indicating the presence of the Armv9.4-a features concerning 128-bit
Page Table Descriptors, 128-bit System Registers and Instructions,
the "+d128" architectural extension flag is added to the list of
possible -march options in Binutils, together with the necessary macro
for encoding d128 instructions.
Since 0x66 is the opcode prefix for adcx, it is wrong to use the 'S'
prefix:
'S' => print 'w', 'l' or 'q' if suffix_always is true
on adcx. Add
'L' => print 'l' or 'q' if suffix_always is true
replace S with L on adcx and adox.
gas/
PR binutils/31219
* testsuite/gas/i386/suffix.d: Updated.
* testsuite/gas/i386/x86-64-suffix.d: Likewise.
* testsuite/gas/i386/suffix.s: Add tests for adcx and adox.
* testsuite/gas/i386/x86-64-suffix.s: Likewise.
opcodes/
PR binutils/31219
* i386-dis.c: Add the 'L' suffix.
(prefix_table): Replace S with L on adcx and adox.
(putop): Handle the 'L' suffix.
There are a number of issues with 734dfd1cc9 ("x86: pack CPU flags in
opcode table"):
- the condition when two array slots need writing wasn't correct (with
enough new Cpu* added an out of bounds array access would validly have
been complained about by the compiler),
- table generation didn't take into account CpuAttrUnused and CpuUnused
being independent, and hence there not always (not) being an "unused"
bitfield member in both structures,
- cpu_flags_from_attr() wasn't ready for use on big-endian hosts,
- there were two style violations.
Since the particularity of "th.vsetvli" was not taken into account in the
initial support patches for XTheadVector, the program operation failed
due to instruction coding errors. According to T-Head SPEC ([1]), the
"vsetvl" in the XTheadVector extension consists of SEW, LMUL and EDIV,
which is quite different from the "V" extension. Therefore, we cannot
simply reuse the processing of vsetvl in V extension.
We have set up tens of thousands of test cases to ensure that no
further encoding issues are there, and and execute all compiled test
files on real HW and make sure they don't trigger SIGILL.
Ref:
[1] https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.3.0/xthead-2023-11-10-2.3.0.pdf
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* config/tc-riscv.c (validate_riscv_insn): Add handling for
th.vsetvli.
(my_getThVsetvliExpression): New function.
(riscv_ip): Likewise.
* testsuite/gas/riscv/x-thead-vector.d: Likewise.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv.h (OP_MASK_XTHEADVLMUL): New macro.
(OP_SH_XTHEADVLMUL): Likewise.
(OP_MASK_XTHEADVSEW): Likewise.
(OP_SH_XTHEADVSEW): Likewise.
(OP_MASK_XTHEADVEDIV): Likewise.
(OP_SH_XTHEADVEDIV): Likewise.
(OP_MASK_XTHEADVTYPE_RES): Likewise.
(OP_SH_XTHEADVTYPE_RES): Likewise.
opcodes/ChangeLog:
* riscv-dis.c (print_insn_args): Likewise.
* riscv-opc.c: Likewise.
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
Suppose we want to use la.got to generate 32 pcrel and
32 abs instruction sequences respectively. According to
the existing conditions, to generate 32 pcrel sequences
use -mabi=ilp32*, and to generate 32 abs use -mabi=ilp32*
and -mla-global-with-abs.
Due to the fact that the conditions for generating 32 abs
also satisfy 32 pcrel, using -mabi=ilp32* and -mla-global-with-abs
will result in only generating instruction sequences of 32 pcrel.
By modifying the conditions for macro expansion and adjusting
the matching order of macro instructions, it is ensured that
the correct sequence of instructions can be generated.
This adds an extra feature: Commas inside double quotes are not an
arg delimiter, and thus can be part of the arg.
* loongarch-coder.c (loongarch_split_args_by_comma): Commas
inside quotes are not arg delimiters.
In order to make it easier to complete the compiler's support for
the XTheadVector extension and to be as compatible as possible
with the programming model of the 'V' extension ([1]), we consider
adding a few pseudo instructions ([2]).
th.vmmv.m vd,vs => th.vmand.mm vd,vs,vs
th.vneg.v vd,vs => th.vrsub.vx vd,vs,x0
th.vncvt.x.x.v vd,vs,vm => th.vnsrl.vx vd,vs,x0,vm
th.vfneg.v vd,vs => th.vfsgnjn.vv vd,vs,vs
th.vfabs.v vd,vs => th.vfsgnjx.vv vd,vs,vs
Ref:
[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-December/641302.html
[2] https://github.com/T-head-Semi/thead-extension-spec/pull/40
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for new
pseudoinstructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
opcodes/ChangeLog:
* riscv-opc.c: Add new pseudoinstructions.
This patch adds non-ND, non-NF forms of EVEX promotion insn.
EVEX extension of legacy instructions:
All promoted legacy instructions are placed in EVEX map 4, which is
currently reserved.
EVEX extension of EVEX instructions:
All existing EVEX instructions are extended by APX using the extended
EVEX prefix, so that they can access all 32 GPRs.
EVEX extension of VEX instructions:
Promoting a VEX instruction into the EVEX space does not change the map
id, the opcode, or the operand encoding of the VEX instruction.
Note: The promoted versions of MOVBE will be extended to include the “MOVBE
reg1, reg2”.
gas/ChangeLog:
2023-12-28 Lingling Kong <lingling.kong@intel.com>
H.J. Lu <hongjiu.lu@intel.com>
Lili Cui <lili.cui@intel.com>
Lin Hu <lin1.hu@intel.com>
* config/tc-i386.c (struct _i386_insn): Add has_egpr.
(need_evex_encoding): Adjusted for apx.
(cpu_flags_match): Ditto.
(install_template): Handled APX combines.
(is_apx_evex_encoding): Test apx evex encoding.
(build_apx_evex_prefix): Enabe APX evex prefix.
(md_assemble): Handle apx with evex encoding.
(process_suffix): Handle apx map4 prefix.
(check_register): Assign i.vec_encoding for APX evex instructions.
* testsuite/gas/i386/x86-64-evex.d: Adjust test cases.
* testsuite/gas/i386/x86-64.exp: Adjust x86-64-inval-movbe.
opcodes/ChangeLog:
* i386-dis-evex-len.h: Handle EVEX_LEN_0F38F2, EVEX_LEN_0F38F3.
* i386-dis-evex-prefix.h: Handle PREFIX_EVEX_0F38F2_L_0,
PREFIX_EVEX_0F38F3_L_0, PREFIX_EVEX_MAP4_D8,
PREFIX_EVEX_MAP4_DA, PREFIX_EVEX_MAP4_DB,
PREFIX_EVEX_MAP4_DC, PREFIX_EVEX_MAP4_DD,
PREFIX_EVEX_MAP4_DE, PREFIX_EVEX_MAP4_DF,
PREFIX_EVEX_MAP4_F0, PREFIX_EVEX_MAP4_F1,
PREFIX_EVEX_MAP4_F2, PREFIX_EVEX_MAP4_F8.
* i386-dis-evex-reg.h: Handle REG_EVEX_0F38F3_L_0_P_0.
* i386-dis-evex.h: Add EVEX_MAP4_ for legacy insn
promote to apx to use gpr32
* opcodes/i386-dis-evex-x86-64.h: Handle Add X86_64_EVEX_0F90,
X86_64_EVEX_0F92, X86_64_EVEX_0F93, X86_64_EVEX_0F38F2,
X86_64_EVEX_0F38F3, X86_64_EVEX_0F38F5, X86_64_EVEX_0F38F6,
X86_64_EVEX_0F38F7, X86_64_EVEX_0F3AF0, X86_64_EVEX_0F91.
* i386-dis.c
(struct instr_info): Deleted bool r.
(PREFIX_NP_OR_DATA): New.
(NO_PREFIX): New.
(putop): Ditto.
(X86_64_EVEX_FROM_VEX_TABLE): Diito.
(get_valid_dis386): Decode insn erex in extend evex prefix.
Handle EVEX_MAP4
(print_insn): Handle PREFIX_DATA_AND_NP_ONLY.
(print_register): Handle apx instructions decode.
(OP_E_memory): Diito.
(OP_G): Diito.
(OP_XMM): Diito.
(DistinctDest_Fixup): Diito.
* i386-gen.c (process_i386_opcode_modifier): Add EVEXMAP4.
* i386-opc.h (SPACE_EVEXMAP4): Add legacy insn
promote to evex.
* i386-opc.tbl: Handle some legacy and vex insns don't
support gpr32. And add some legacy insn (map2 / 3) promote
to evex.
This fixes the buffer overflow added in commit 22b78fad28, and a few
other problems.
* loongarch-coder.c (loongarch_split_args_by_comma): Don't
overflow buffer when args == "". Don't remove unbalanced
quotes. Don't trim last arg if max number of args exceeded.
Suffix the instruction description of conditional branch extended
mnemonics with their condition (e.g. "on A high"). This complements
the optional printing of instruction descriptions as comments in the
disassembly.
Due to the added text the maximum description length is increased from
80 to 128 characters (including the trailing '\0' character).
opcodes/
* s390-mkopc.c: Add suffix to conditional branch extended
mnemonic instruction descriptions.
gas/
* testsuite/gas/s390/zarch-insndesc.s: Add test cases for
printing of suffixed instruction description of conditional
branch extended mnemonics.
* testsuite/gas/s390/zarch-insndesc.d: Likewise.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
Print instruction description as comment in disassembly with s390
architecture specific option "insndesc":
- For objdump it can be enabled with option "-M insndesc"
- In gdb it can be enabled with "set disassembler-options insndesc"
Since comments are not column aligned the output can enhanced for
readability by postprocessing using a filter such as "expand":
... | expand -t 8,16,24,32,40,80
Or when using in combination with objdump option --visualize-jumps:
... | expand | sed -e 's/ *#/\t#/' | expand -t 1,80
Note that the instruction descriptions add about 128 KB to s390-opc.o:
s390-opc.o without instruction descriptions: 216368 bytes
s390-opc.o with instruction descriptions : 348432 bytes
binutils/
* NEWS: Mention new s390-specific disassembler option
"insndesc".
include/
* opcode/s390.h (struct s390_opcode): Add field to hold
instruction description.
opcodes/
* s390-mkopc.c: Copy instruction description from s390-opc.txt
into generated operation code table s390-opc.tab.
* s390-opc.c (s390_opformats): Provide NULL as description in
.insn pseudo-mnemonics opcode table.
* s390-dis.c: Add s390-specific disassembler option "insndesc"
and optionally print the instruction description as comment in
the disassembly when it is specified.
gas/
* testsuite/gas/s390/s390.exp: Add new test disassembly test
case "zarch-insndesc".
* testsuite/gas/s390/zarch-insndesc.s: New test case for s390-
specific disassembler option "insndesc".
* testsuite/gas/s390/zarch-insndesc.d: Likewise.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
Use strncpy() and snprintf() instead of strcpy() and strcat(). Define
and use macros for string lengths, such as mnemonic, instruction
format, and instruction description.
This is a mechanical change, although some buffers have increased in
length by one character. This has been confirmed by verifying that the
generated opcode/s390-opc.tab is unchanged.
opcodes/
* s390-mkopc.c: Use strncpy() and strncat().
Suggested-by: Nick Clifton <nickc@redhat.com>
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
When the s390-mkopc utility detects an error it prints an error message
to strerr and either continues processing or exists with a non-zero
return code. If it continues without detecting any further error the
final return code was zero, potentially hiding the detected error.
Introduce a global variable to hold the final return code and initialize
it to EXIT_SUCCESS. Introduce a helper function print_error() that
prints an error message to stderr and sets the final return code to
EXIT_FAILURE. Use it to print all error messages. Return the final
return code at the end of the processing.
While at it enhance error messages to state more clearly which mnemonic
an error was detected for. Also continue processing for cases where
subsequent mnemonics can be processed.
opcodes/
* s390-mkopc.c: Enhance error handling. Return EXIT_FAILURE
in case of an error, otherwise EXIT_SUCCESS.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
Provide descriptions for instructions introduced with commit ba2b480f10
("IBM Z: Implement instruction set extensions"). This complements commit
69341966de ("IBM zSystems: Add support for z16 as CPU name."). Use
instruction names from IBM z/Architecture Principles of Operation [1] as
instruction description.
[1]: IBM z/Architecture Principles of Operation, SA22-7832-13, IBM z16,
https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf
opcodes/
* s390-opc.txt: Add descriptions for IBM z16 (arch14)
instructions.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
Change the bitwise operations names "and" and "or" to lower case. Change
the register name abbreviations "FPR", "GR", and "VR" to upper case.
opcodes/
* s390-opc.txt: Align letter case of instruction descriptions.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
Suffix the s390-mkopc build utility executable file name with
EXEEXT_FOR_BUILD. Otherwise it cannot be located when building with
EXEEXT_FOR_BUILD. Use pattern used for other architecture build
utilities and compile and link s390-mkopc in two steps.
While at it also specify the dependencies of s390-mkopc.c.
opcodes/
* Makefile.am: Add target to build s390-mkopc.o. Correct
target to build s390-mkopc$(EXEEXT_FOR_BUILD).
* Makefile.in: Regenerate.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
This patch add support for FEAT_ECBHB "Exploitative control using
branch history information" adding the "clrbhb" instruction. AFAIU
the same alias was originally added as "clearbhb" before the
architecture was finalized (Mandatory v8.9-a/v9.4-a; Optional
v8.0-a+/v9.0-a+).
This patch add supports for FEAT_SPECRES2 "Enhanced speculation
restriction instructions" adding the "cosp" instruction.
This is mandatory v8.9-a/v9.4-a and optional v8.0-a+/v9.0-a+. It is
enabled by the +predres2 march flag.
Since AVX10.1/256 will also allow 64 bit mask register, we will
remove the restriction for size of the mask register in AVX10.
gas/ChangeLog:
* config/tc-i386.c (VSZ128, VSZ256, VSZ512): New.
(VEX_check_encoding): Remove opcode_modifier check for vsz.
* testsuite/gas/i386/avx10-vsz.l: Remove testcases for mask
registers since they are not needed.
* testsuite/gas/i386/avx10-vsz.s: Ditto.
opcodes/ChangeLog:
* i386-gen.c: Remove Vsz.
* i386-opc.h: Ditto.
* i386-opc.tbl: Remove kvsz.
* i386-tbl.h: Regenerated.
For tail36, it is necessary to explicitly indicate the temporary register.
Therefore, the compiler and users will know that the tail will use a register.
call36 func
pcalau18i $ra, %call36(func)
jirl $ra, $ra, 0;
tail36 $t0, func
pcalau18i $t0, %call36(func)
jirl $zero, $t0, 0;
Now that ATTSyntax and ATTMnemonic aren't use in combination anymore,
fold them and IntelSyntax into a single, enum-like attribute. Note that
this shrinks i386_opcode_modifier back to 2 32-bit words (albeit that's
not for long, seeing in-flight additions for APX).
As noted in the context of d53e6b98a2 ("x86/Intel: correct disassembly
of fsub*/fdiv*") there's no such thing as Intel syntax without Intel
mnemonics. Enforce this on the assembler side, and disentangle command
line option handling on the disassembler side accordingly.
As a result in the opcode table specifying ATTMnemonic|ATTSyntax becomes
redundant with just ATTMnemonic. Drop the now meaningless ATTSyntax and
remove the then no longer accessible templates.
The description of instructions 'th.fmv.hw.x' and 'th.fmv.x.hw' of the
XTheadFmv extension in T-Head specific is incorrect, and it also has
some impact on the implementation of the binutils, so this patch
corrects this.
For details see:
https://github.com/T-head-Semi/thead-extension-spec/pull/34
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-fmv.d: Correct test.
* testsuite/gas/riscv/x-thead-fmv.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_FMV_HW_X): Correct coding.
(MASK_TH_FMV_HW_X): Likewise.
(MATCH_TH_FMV_X_HW): Likewise.
(MASK_TH_FMV_X_HW): Likewise.
opcodes/ChangeLog:
* riscv-opc.c: Correct operands.
Add support for jump visualization for the s390 architecture in
disassembly:
objdump -d --visualize-jumps ...
Annotate the (conditional) jump and branch relative instructions with
information required for jump visualization:
- jump: Unconditional jump / branch relative.
- condjump: Conditional jump / branch relative.
- jumpsr: Jump / branch relative to subroutine.
Unconditional jump and branch relative instructions are annotated as
jump.
Conditional jump and branch relative instructions, jump / branch
relative on count/index, and compare and jump / branch relative
instructions are annotated as condjump.
Jump and save (jas, jasl) and branch relative and save (bras, brasl)
instructions are annotated as jumpsr (jump to subroutine).
Provide instruction information required for jump visualization during
disassembly.
The instruction type is provided after determining the opcode.
For non-code it is set to dis_noninsn. Otherwise it defaults to
dis_nonbranch. No annotation is done for data reference instructions
(i.e. instruction types dis_dref and dis_dref2). Note that the
instruction type needs to be provided before printing of the
instruction, as it is used in print_address_func() to translate the
argument value into an address if it is assumed to be a PC-relative
offset. Note that this is never the case on s390, as
print_address_func() is only called with addresses and never with
offsets.
The target of the (conditional) jump and branch relative instructions
is provided during print, when the PC relative operand is decoded.
include/
* opcode/s390.h: Define opcode flags to annotate instruction
class information for jump visualization:
S390_INSTR_FLAG_CLASS_BRANCH, S390_INSTR_FLAG_CLASS_RELATIVE,
S390_INSTR_FLAG_CLASS_CONDITIONAL, and
S390_INSTR_FLAG_CLASS_SUBROUTINE.
Define opcode flags mask S390_INSTR_FLAG_CLASS_MASK for above
instruction class information.
Define helpers for common instruction class flag combinations:
S390_INSTR_FLAGS_CLASS_JUMP, S390_INSTR_FLAGS_CLASS_CONDJUMP,
and S390_INSTR_FLAGS_CLASS_JUMPSR.
opcodes/
* s390-mkopc.c: Add opcode flags to annotate information
for jump visualization: jump, condjump, and jumpsr.
* s390-opc.txt: Annotate (conditional) jump and branch relative
instructions with information for jump visualization.
* s390-dis.c (print_insn_s390, s390_print_insn_with_opcode):
Provide instruction information for jump visualization.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
... as MSR index specifier: It is unreasonable to demand that people
write less readable / understandable code, just because the present
documentation mentions only Reg64. Whether to also adjust the
disassembler is a separate question, perhaps indeed more tightly tied
to what the spec says.
riscv_is_mapping_symbol currently accepts any symbol that starts with $x
or $d. This patch makes the check more strict, requiring exactly $x, $d,
or $xrv. It also makes use of this stricter mapping in
riscv_is_valid_mapping_symbol.
ChangeLog:
* bfd/cpu-riscv.c (riscv_elf_is_mapping_symbols): Match only
strings that are exactly $x, $d, or $xrv.
* opcodes/riscv-dis.c (riscv_is_valid_mapping_symbol): Use
riscv_elf_is_mapping_symbols.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
SiFive has define as set of flexible instruction for extending vector
coprocessor, it able to encoding opcode like .insn but with predefined
format.
List of instructions:
sf.vc.x
sf.vc.i
sf.vc.vv
sf.vc.xv
sf.vc.iv
sf.vc.fv
sf.vc.vvv
sf.vc.xvv
sf.vc.ivv
sf.vc.fvv
sf.vc.vvw
sf.vc.xvw
sf.vc.ivw
sf.vc.fvw
sf.vc.v.x
sf.vc.v.i
sf.vc.v.vv
sf.vc.v.xv
sf.vc.v.iv
sf.vc.v.fv
sf.vc.v.vvv
sf.vc.v.xvv
sf.vc.v.ivv
sf.vc.v.fvv
sf.vc.v.vvw
sf.vc.v.xvw
sf.vc.v.ivw
sf.vc.v.fvw
Spec of Xsfvcp
https://www.sifive.com/document-file/sifive-vector-coprocessor-interface-vcix-software
Co-authored-by: Hau Hsu <hau.hsu@sifive.com>
Co-authored-by: Kito Cheng <kito.cheng@sifive.com>
Back then when the support for the RISC-V vector crypto extensions
was merged, the specification was frozen, but not ratified.
A frozen specification is allowed to change within tight bounds
before ratification and this has happend with the vector crypto
extensions.
The following changes were applied:
* A new extension Zvkb was defined, which is a strict subset of Zvbb.
* Zvkn and Zvks include now Zvkb instead of Zvbb.
This patch implements these changes between the frozen and the
ratified specification.
Note, that this technically an incompatible change of Zvkn and Zvks,
but I am not aware of any project that depends on the currently
implemented behaviour of Zvkn and Zvks. So this patch should be fine.
Reported-By: Jerry Shih <jerry.shih@sifive.com>
Reported-By: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Currently objdump gets and updates the map state once per symbol. Updating the
state (partiularly riscv_parse_subset) is expensive and grows quadratically
since we iterate over all symbols. By deferring this until once we've found the
symbol of interest, we can reduce the time to dump a 4k insn file of .norvc and
.rvc insns from ~47 seconds to ~0.13 seconds.
opcodes/ChangeLog:
* riscv-dis.c (riscv_get_map_state): Remove state updating logic
and rename to riscv_is_valid_mapping_symbol.
(riscv_update_map_state): Add state updating logic to seperate function.
(riscv_search_mapping_symbol): Use new riscv_update_map_state.
(riscv_data_length): Ditto.
Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
Have i386-gen produce merely the offsets into i386_optab[]. Besides
allowing to shrink the table even on 32-bit builds, this results in
removing a level of indirection from the frequently accessed
current_templates, in return for adding a level of indirection when
looking up mnemonics (commonly happening just once per insn). Plus for
PIE builds of gas it also reduces the number of relocations by about two
thousand. Finally a somewhat ugly static variable can also be eliminated
from i386_displacement().
Fold M_{S,Z}EXTH, deriving signed-ness from the incoming mnemonic. Fold
riscv_ext()'s calls md_assemblef(), the first of which were entirely
identical, while the other pair differed in just a single character.
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
While for some of the macro insns using x0 is kind of okay, as they
would merely resolve to a sequence of hint insns (and hence not cause
misbehavior at runtime), several of them have the degenerate AUIPC
followed by a load, store, or branch using other than the designated
symbol as address and hence causing runtime issues. Refuse to assemble
those, leveraging that the matching function so far wasn't really used
for macro insns: NULL is now allowed, indicating a match (which imo is
preferable over converting match_never() to match_always()), while
other matching functions now (also) used for macro insns need to avoid
calling match_opcode().
Note that for LA the restriction is slightly too strict: In non-PIC mode
using x0 would be okay-ish as per above (as it's just LLA there). Yet
libopcodes doesn't know what mode gas is presently assembling for, so we
want to err on the safe side.
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Add extended mnemonics specified in the z/Architecture Principles of
Operation [1] and z/Architecture Reference Summary [2], that were
previously missing from the opcode table.
The following added extended mnemonics are synonyms to a base mnemonic
and therefore disassemble into their base mnemonic:
jc, jcth, lfi, llgfi, llghi
The following added extended mnemonics are more specific than their base
mnemonic and therefore disassemble into the added extended mnemonic:
risbhgz, risblgz, rnsbgt, rosbgt, rxsbgt
The following added extended mnemonics are more specific than their base
mnemonic, but disassemble into their base mnemonic due to design
constraints:
notr, notgr
The missing extended mnemonic jl* conditional jump long flavors cannot
be added, as they would clash with the existing non-standard extended
mnemonic j* conditional jump flavors jle and jlh. The missing extended
mnemonic jlc jump long conditional is not added, as the related jl*
flavors cannot be added.
Note that these missing jl* conditional jump long flavors are already
defined as non-standard jg* flavors instead. While the related missing
extended mnemonic jlc could be added as non-standard jgc instead it is
forgone in favor of not adding further non-standard mnemonics.
The missing extended mnemonics sllhh, sllhl, slllh, srlhh, srlhl, and
srllh cannot be implemented using the current design, as they require
computed operands. For that reason the following missing extended
mnemonics are not added as well, as they fall into the same category of
instructions that operate on high and low words of registers. They
should better be added together, not to confuse the user, which of those
instructions are currently implemented or not.
lhhr, lhlr, llhfr, llchhr, llchlr, llclhr, llhhhr, llhhlr, llhlhr,
nhhr, nhlr, nlhr, ohhr, ohlr, olhr, xhhr, xhlr, xlhr
[1] IBM z/Architecture Principles of Operation, SA22-7832-13, IBM z16,
https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf
[2] IBM z/Architecture Reference Summary, SA22-7871-11,
https://www.ibm.com/support/pages/sites/default/files/2022-09/SA22-7871-11.pdf
opcodes/
* s390-opc.c: Define operand formats R_CP16_28, U6_18, and
U5_27. Define instruction formats RIE_RRUUU3, RIE_RRUUU4,
and RRF_R0RR4.
* s390-opc.txt: Add extended mnemonics jc, jcth, lfi, llgfi,
llghi, notgr, notr, risbhgz, risblgz, rnsbgt, rosbgt, and
rxsbgt.
gas/
* config/tc-s390.c: Add support to insert operand for format
R_CP16_28, reusing existing logic for format V_CP16_12.
* testsuite/gas/s390/esa-g5.s: Add test for extended mnemonic
jc.
* testsuite/gas/s390/esa-g5.d: Likewise.
* testsuite/gas/s390/zarch-z900.s: Add test for extended
mnemonic llghi.
* testsuite/gas/s390/zarch-z900.d: Likewise.
* testsuite/gas/s390/zarch-z9-109.s: Add tests for extended
mnemonics lfi and llgfi.
* testsuite/gas/s390/zarch-z9-109.d: Likewise.
* testsuite/gas/s390/zarch-z10.s: Add tests for extended
mnemonics rnsbgt, rosbgt, and rxsbgt.
* testsuite/gas/s390/zarch-z10.d: Likewise.
* testsuite/gas/s390/zarch-z196.s: Add tests for extended
mnemonics jcth, risbhgz, and risblgz.
* testsuite/gas/s390/zarch-z196.d: Likewise.
* testsuite/gas/s390/zarch-arch13.s: Add tests for extended
mnemonics notr and notgr.
* testsuite/gas/s390/zarch-arch13.d: Likewise.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
The IBM z/Architecture Principle of Operation [1] specifies the last
operand(s) of some (extended) mnemonics to be optional. Align the
mnemonic definitions in the opcode table according to specification.
This changes the last operand of the following (extended) mnemonics to
be optional:
risbg, risbgz, risbgn, risbgnz, risbhg, risblg, rnsbg, rosbg, rxsbg
Note that efpc and sfpc actually have only one operand, but had
erroneously been defined to have two. For backwards compatibility the
wrong RR register format must be retained. Since the superfluous second
operand is defined as optional the instruction can still be coded as
specified.
[1]: IBM z/Architecture Principles of Operation, SA22-7832-13, IBM z16,
https://publibfp.dhe.ibm.com/epubs/pdf/a227832d.pdf
opcodes/
* s390-opc.txt: Align optional operand definition to
specification.
testsuite/
* zarch-z10.s: Add test cases for risbg, risbgz, rnsbg, rosbg,
and rxsbg.
* zarch-z10.d: Likewise.
* zarch-z196.s: Add test cases for risbhg and risblg.
* zarch-z196.d: Likewise.
* zarch-zEC12.s: Add test cases for risbgn and risbgnz.
* zarch-zEC12.d: Likewise.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
This is a purely mechanical change. It allows subsequent insertions into
the operands table without having to renumber all operand indices.
The only differences in the resulting ELF object are in the .debug_info
section. This has been confirmed by diffing the following xxd and readelf
output:
xxd s390-opc.o
readelf -aW -x .text -x .data -x .bss -x .rodata -x .debug_info \
-x .symtab -x .strtab -x .shstrtab --debug-dump s390-opc.o
opcodes/
* s390-opc.c: Make operand table indices relative to each other.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Andreas Krebbel <krebbel@linux.ibm.com>
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds permutation instructions for the "XTheadVector"
extension. The 'th' prefix and the "XTheadVector" extension
are documented in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for
permutation instructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VMVXS): New.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds mask instructions for the "XTheadVector"
extension. The 'th' prefix and the "XTheadVector" extension
are documented in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for
mask instructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VMPOPCM): New.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds reductions instructions for the "XTheadVector"
extension. The 'th' prefix and the "XTheadVector" extension
are documented in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for
reductions instructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds floating-point arithmetic instructions for the
"XTheadVector" extension. The 'th' prefix and the
"XTheadVector" extension are documented in a PR for the
RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for
floating-point arithmetic instructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VFSQRTV): New.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds fixed-point arithmetic instructions for the
"XTheadVector" extension. The 'th' prefix and the
"XTheadVector" extension are documented in a PR for the
RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for
fixed-point arithmetic instructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VAADDVV): New.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds integer arithmetic instructions for the
"XTheadVector" extension. The 'th' prefix and the
"XTheadVector" extension are documented in a PR for the
RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for
integer arithmetic instructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VADCVVM): New.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds the sub-extension "XTheadZvamo" for the
"XTheadVector" extension, and it provides AMO instructions
for T-Head VECTOR vendor extension. The 'th' prefix and the
"XTheadVector" extension are documented in a PR for the
RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Add support
for "XTheadZvamo" extension.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* doc/c-riscv.texi:
* testsuite/gas/riscv/x-thead-vector-zvamo.d: New test.
* testsuite/gas/riscv/x-thead-vector-zvamo.s: New test.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VAMOADDWV): New.
* opcode/riscv.h (enum riscv_insn_class): Add insn class.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore it
makes sense to group them into smaller chunks in form of vendor
extensions.
This patch adds provides load/store segment instructions for T-Head VECTOR
vendor extension, which same as the "Zvlsseg" extension in RVI 0.71 vector
extension, but belongs to the "XTheadVector" extension. The 'th' prefix
and the "XTheadVector" extension are documented in a PR for the
RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add test.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VLSEG2BV): New.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions. Therefore
it makes sense to group them into smaller chunks in form of
vendor extensions.
This patch adds load/store instructions for the "XTheadVector"
extension. The 'th' prefix and the "XTheadVector" extension are
documented in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: Add tests for
load/store instructions.
* testsuite/gas/riscv/x-thead-vector.s: Likewise.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_TH_VLBV): New.
opcodes/ChangeLog:
* riscv-opc.c: Likewise.
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds configuration-setting instructions for the "XTheadVector"
extension. The 'th' prefix and the "XTheadVector" extension are documented
in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* testsuite/gas/riscv/x-thead-vector.d: New test.
* testsuite/gas/riscv/x-thead-vector.s: New test.
opcodes/ChangeLog:
* riscv-opc.c: Likewise..
T-Head has a range of vendor-specific instructions.
Therefore it makes sense to group them into smaller chunks
in form of vendor extensions.
This patch adds the CSRs for XTheadVector. Because of the
conflict between encoding and teh 'V' extension, it is implemented
by alias. The 'th' prefix and the "XTheadVector" extension are
documented in a PR for the RISC-V toolchain conventions ([1]).
[1] https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/19
Co-developed-by: Lifang Xia <lifang_xia@linux.alibaba.com>
Co-developed-by: Christoph Müllner <christoph.muellner@vrull.eu>
gas/ChangeLog:
* config/tc-riscv.c (enum riscv_csr_class): Add the class for
the CSRs of the "XTheadVector" extension.
(riscv_csr_address): Likewise.
* testsuite/gas/riscv/x-thead-vector-csr-warn.d: New test.
* testsuite/gas/riscv/x-thead-vector-csr-warn.l: New test.
* testsuite/gas/riscv/x-thead-vector-csr.d: New test.
* testsuite/gas/riscv/x-thead-vector-csr.s: New test.
include/ChangeLog:
* opcode/riscv-opc.h (DECLARE_CSR_ALIAS): Likewise.
opcodes/ChangeLog:
* riscv-dis.c (print_insn_args): Prefix the CSRs disassembly with 'th'.
Recently, -Walloc-size warnings started to kick in. Fix these two
calloc() calls to match the intended usage pattern.
opcodes/ChangeLog:
* arc-dis.c (init_arc_disasm_info): Fix calloc() call.
* ppc-dis.c (powerpc_init_dialect): Ditto.
{disp16} is invalid to use in 64-bit mode, while {disp32} is invalid to
use on pre-386 CPUs. The latter, also affecting other (real) prefixes,
further requires that like for insns we fully check the CPU flags; till
now only Cpu64/CpuNo64 were taken into consideration.
This patch adds the permission model enhancement and memory
attribute index enhancement features and their corresponding
system registers in AArch64 assembler.
Permission Indirection Extension (FEAT_S1PIE, FEAT_S2PIE)
Permission Overlay Extension (FEAT_S1POE, FEAT_S2POE)
Memory Attribute Index Enhancement (FEAT_AIE)
Extension to Translation Control Registers (FEAT_TCR2)
These features are available by default from Armv9.4-A architecture.
This patch also adds support for:
1. FEAT_RASv2 feature and "ERXGSR_EL1" system register.
RASv2 feature is enabled by passing +rasv2 to -march
(eg: -march=armv8-a+rasv2).
2. FEAT_SCTLR2 and following system registers.
SCTLR2_EL1, SCTLR2_EL12, SCTLR2_EL2 and SCTLR2_EL3.
3. FEAT_FGT2 and following system registers.
HDFGRTR2_EL2, HDFGWTR2_EL2, HFGRTR2_EL2, HFGWTR2_EL2
4. FEAT_PFAR and following system registers.
PFAR_EL1, PFAR_EL2 and PFAR_EL12.
FEAT_RASv2, FEAT_SCTLR2, FEAT_FGT2 and FEAT_PFAR features are by default
enabled from Armv9.4-A architecture.
This patch also adds support for two read only system registers
id_aa64mmfr3_el1 and id_aa64mmfr4_el1, which are available from
Armv8-A Architecture.
This patch adds features to the Statistical Profiling Extension,
identified as FEAT_SPEv1p4, FEAT_SPE_FDS, and FEAT_SPE_CRR, which
are enabled by default from Armv9.4-A.
Also adds support for system register "pmsdsfr_el1".
The erroneous omission of a "reg_value == " in the THE system register
encoding check added in [1] led to an error which was not picked up in
GCC but which was flagged in Clang due to its use of
[-Werror,-Wconstant-logical-operand] check. Together with this fix we
add a new test for the THE registers to pick up their illegal use,
adding an extra and important layer of validation.
Furthermore, in separating system register from instruction
implementation (with which only the former was of concern in the cited
patch), additions made to `aarch64-tbl.h' are rolled back so
that these can be added later when adding THE instructions to the
codebase, a more natural place for these changes.
[1] https://sourceware.org/pipermail/binutils/2023-November/130314.html
opcodes/ChangeLog:
* aarch64-opc.c (aarch64_sys_ins_reg_supported_p): Fix typo.
* aarch64-tbl.h (THE): Remove.
(aarch64_feature_set aarch64_feature_the): Likewise.
gas/ChangeLog:
* testsuite/gas/aarch64/illegal-sysreg-8.l: Add tests for THE
system registers.
* testsuite/gas/aarch64/illegal-sysreg-8.s: Likewise.
As we have grown more uses of it, it becomes increasingly more desirable
to replace it by a simpler check. Have i386-gen do at build time what so
far was done at runtime: Deal with templates indicating EVEX-encoding by
other than the EVex attribute, and set that to "dynamic" in such cases.
This then allows simplifying a number of other conditionals as well.
Right now the opcode table has entries with ISA restrictions of the form
FEAT1|FEAT2, the meaning of which depends on context and requires
special treatment in tc-i386.c: Sometimes this means "both features
requires", whereas originally it was intended to solely mean "all of
these features required". Split the field, with the original one
regaining its original meaning. The new field now truly means "any of
these". The combination of both fields is still and &&-type check, i.e.
(all of these) && (any of these). In the opcode table more involved
combinations of features then also need expressing this way: "all"
entities first, follow by "any" entities enclosed in parentheses, e.g.
x64&(AVX|AVX512F). If the "all" part is empty, parentheses may not be
added around the "any" part (unless parsing logic was further relaxed).
Note that this way AVX512VL no longer needs as much special treatment,
and hence templates previously using AVX512F|AVX512VL are switched to
just AVX512VL.
Note further that this requires FMA handling as resulting from
da0784f961 ("x86: fold FMA VEX and EVEX templates") to be slightly
re-done: FMA now becomes more similar to AVX and AVX2.
First of all we want to also accumulate its reverse dependencies, such
that we can use them in cpu_flags_match(). This is in particular in
preparation of APX additions, such that e.g. BMI VEX-encoding templates
can become combined VEX/EVEX ones.
Once we have the reverse dependencies, we can further leverage them to
omit explicit "&x64" from any insn templates dealing with 64-bit-mode-
only ISA extensions. Besides helping readability for several insn
templates we already have, this will also help with what is going to be
added for APX (as all of the new templates would otherwise need to have
"&x64").
Note that rather than leaving a meaningless CPU_64_FLAGS (which is
unused anyway), its emitting is now also suppressed.
Enable the `+lse128' feature modifier which, together with new
internal feature flags, enables LSE128 instructions, which are
represented via the new `_LSE128_INSN' macro.
gas/ChangeLog:
* config/tc-aarch64.c (aarch64_features): Add new "lse128"
entry.
include/ChangeLog:
* include/opcode/aarch64.h (enum aarch64_feature_bit): New
AARCH64_FEATURE_LSE128 feature bit.
(enum aarch64_insn_class): New lse128_atomic instruction class.
opcodes/ChangeLog:
* opcodes/aarch64-tbl.h (aarch64_feature_lse128): New.
(LSE128): Likewise.
(_LSE128_INSN): Likewise.
Given the particular encoding of the LSE128 instructions, create the
necessary shared input+output operand register description and
handling in the code to allow for the encoding of the LSE128 128-bit
atomic operations.
gas/ChangeLog:
* config/tc-aarch64.c (parse_operands):
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_opnd):
opcodes/ChangeLog:
* aarch64-opc.c (fields):
(aarch64_print_operand):
* aarch64-opc.h (enum aarch64_field_kind):
* aarch64-tbl.h (AARCH64_OPERANDS):
In preparation for the implementation of 128-bit system register
support across the toolchain, this patch adds the feature flag
F_REG_128 and adds it to relevant system registers in
`aarch64-sys-regs.def'.
Given the shared nature of this file, this change is made necessary
initially to implement argument validation in the `__arm_rsr128' and
`__armwsr128' ACLE intrinsics in GCC, but will be of subsequent use in
the binutils implementation of the corresponding `mrrs' and `msrr'
instructions.
Regression tested on aarch64-linux-gnu, no regressions.
opcodes/ChangeLog:
* aarch64-opc.h (F_REG_128): New flag.
* aarch64-sys-regs.def (par_el1): Add F_REG_128 flag.
(rcwmask_el1): Likewise.
(rcwsmask_el1): Likewise.
(ttbr0_el1): Likewise.
(ttbr0_el12): Likewise.
(ttbr0_el2): Likewise.
(ttbr1_el1): Likewise.
(ttbr1_el12): Likewise.
(ttbr1_el2): Likewise.
(vttbr_el2): Likewise.
Add Binutils support for system registers associated with the
Translation Hardening Extension (THE).
In doing so, we also add core feature support for THE, enabling its
associated feature flag and implementing the necessary
feature-checking machinery.
Regression tested on aarch64-linux-gnu, no regressions.
gas/ChangeLog:
* config/tc-aarch64.c (aarch64_features): Add "+the" feature modifier.
* doc/c-aarch64.texi (AArch64 Extensions): Update
documentation for `the' option.
* testsuite/gas/aarch64/sysreg-8.s: Add tests for `the'
associated system registers.
* testsuite/gas/aarch64/sysreg-8.d: Likewise.
include/ChangeLog:
* opcode/aarch64.h (enum aarch64_feature_bit): Add
AARCH64_FEATURE_THE.
opcode/ChangeLog:
* aarch64-opc.c (aarch64_sys_ins_reg_supported_p): Add `the'
system register check support.
* aarch64-sys-regs.def: Add `rcwmask_el1' and `rcwsmask_el1'
* aarch64-tbl.h: Define `THE' preprocessor macro.
Spec: https://docs.openhwgroup.org/projects/cv32e40p-user-manual/en/latest/instruction_set_extensions.html
Contributors:
Mary Bennett <mary.bennett@embecosm.com>
Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Pietra Ferreira <pietra.ferreira@embecosm.com>
Charlie Keaney
Jessica Mills
Craig Blackmore <craig.blackmore@embecosm.com>
Simon Cook <simon.cook@embecosm.com>
Jeremy Bennett <jeremy.bennett@embecosm.com>
Helene Chelin <helene.chelin@embecosm.com>
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Added `xcvalu`
instruction class.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* config/tc-riscv.c (validate_riscv_insn): Added the necessary
operands for the extension.
(riscv_ip): Likewise.
* doc/c-riscv.texi: Noted XCValu as an additional ISA extension
for CORE-V.
* testsuite/gas/riscv/cv-alu-boundaries.d: New test.
* testsuite/gas/riscv/cv-alu-boundaries.l: New test.
* testsuite/gas/riscv/cv-alu-boundaries.s: New test.
* testsuite/gas/riscv/cv-alu-fail-march.d: New test.
* testsuite/gas/riscv/cv-alu-fail-march.l: New test.
* testsuite/gas/riscv/cv-alu-fail-march.s: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-01.d: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-01.l: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-01.s: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-02.d: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-02.l: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-02.s: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-03.d: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-03.l: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-03.s: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-04.d: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-04.l: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-04.s: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-05.d: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-05.l: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-05.s: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-06.d: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-06.l: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-06.s: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-07.d: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-07.l: New test.
* testsuite/gas/riscv/cv-alu-fail-operand-07.s: New test.
* testsuite/gas/riscv/cv-alu-insns.d: New test.
* testsuite/gas/riscv/cv-alu-insns.s: New test.
opcodes/ChangeLog:
* riscv-dis.c (print_insn_args): Disassemble xcb operand.
* riscv-opc.c: Defined the MASK and added XCValu instructions.
include/ChangeLog:
* opcode/riscv-opc.h: Added corresponding MATCH and MASK macros
for XCValu.
* opcode/riscv.h: Added corresponding EXTRACT and ENCODE macros
for XCValu.
(enum riscv_insn_class): Added the XCValu instruction class.
Spec: https://docs.openhwgroup.org/projects/cv32e40p-user-manual/en/latest/instruction_set_extensions.html
Contributors:
Mary Bennett <mary.bennett@embecosm.com>
Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Pietra Ferreira <pietra.ferreira@embecosm.com>
Charlie Keaney
Jessica Mills
Craig Blackmore <craig.blackmore@embecosm.com>
Simon Cook <simon.cook@embecosm.com>
Jeremy Bennett <jeremy.bennett@embecosm.com>
Helene Chelin <helene.chelin@embecosm.com>
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Added `xcvmac`
instruction class.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* config/tc-riscv.c (validate_riscv_insn): Added the necessary
operands for the extension.
(riscv_ip): Likewise.
* doc/c-riscv.texi: Noted XCVmac as an additional ISA extension
for CORE-V.
* testsuite/gas/riscv/cv-mac-fail-march.d: New test.
* testsuite/gas/riscv/cv-mac-fail-march.l: New test.
* testsuite/gas/riscv/cv-mac-fail-march.s: New test.
* testsuite/gas/riscv/cv-mac-fail-operand.d: New test.
* testsuite/gas/riscv/cv-mac-fail-operand.l: New test.
* testsuite/gas/riscv/cv-mac-fail-operand.s: New test.
* testsuite/gas/riscv/cv-mac-insns.d: New test.
* testsuite/gas/riscv/cv-mac-insns.s: New test.
opcodes/ChangeLog:
* riscv-dis.c (print_insn_args): Disassemble information with
the EXTRACT macro implemented.
* riscv-opc.c: Defined the MASK and added
XCVmac instructions.
include/ChangeLog:
* opcode/riscv-opc.h: Added corresponding MATCH and MASK macros
for XCVmac.
* opcode/riscv.h: Added corresponding EXTRACT and ENCODE macros
for uimm.
(enum riscv_insn_class): Added the XCVmac instruction class.
Within the groups L{B,BU,H,HU,W,WU,D}, S{B,H,W,D}, FL{H,W,D,Q}, and
FS{H,W,D,Q} the sole difference between the handling is the insn
mnemonic passed to the common handling functions. The intended mnemonic,
however, can easily be retrieved. Furthermore leverags that Sx and FSx
are then handled identically, too, and hence their cases can also be
folded.
This patch adds support for 10 new AArch64 system registers
(gcscre0_el1, gcscr_el1, gcscr_el12, gcscr_el2, gcscr_el3,
gcspr_el0, gcspr_el1 ,gcspr_el12, gcspr_el2 and gcspr_el3),
which are enabled on using Guarded Control Stack (+gcs flag)
feature.
This patch adds support for Guarded control stack data synchronization
instruction (GCSB DSYNC). This instruction is allocated to existing
HINT space and uses the HINT number 19 and to match this an entry is
added to the aarch64_hint_options array.
This patch adds for Guarded Control Stack Extension (GCS) extension. GCS feature is
optional from Armv9.4-A architecture and enabled by passing +gcs option to -march
(eg: -march=armv9.4-a+gcs) or using ".arch_extension gcs" directive in the assembly file.
Also this patch adds support for GCS instructions gcspushx, gcspopcx, gcspopx,
gcsss1, gcsss2, gcspushm, gcspopm, gcsstr and gcssttr.
This patch adds support for Check Feature Status Extension (CHK) which
is mandatory from Armv8.0-A. Also this patch supports "chkfeat" instruction
(hint #40).
Given the shared use of the aarch64-sys-regs.def file across Binutils
and GCC, add instructions for keeping the file synchronized across the
two codebases.
Namely, it should be made clear that all changes are first to be made
in Binutils and the updated file copied across to GCC.
opcodes/ChangeLog
* opcodes/aarch64-sys-regs.def: Update file-description header
comment.
There is currently a bug in the bit masking for the barrel shift
instructions because the bit mask is not including all of the
register bits which must be zero. With this patch, the disassembler
can be sure that the 32-bit value is indeed a barrel shift instruction
and not a data value in memory.
This fix can be verified by assembling and disassembling the following:
.text
.long 0x65005f5f
With this patch, the bug is fixed, and the objdump will know that
0x65005f5f is not a barrel shift instruction.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
This patches adds new bsefi and bsifi instructions.
BSEFI- The instruction shall extract a bit field from a
register and place it right-adjusted in the destination register.
The other bits in the destination register shall be set to zero.
BSIFI- The instruction shall insert a right-adjusted bit field
from a register at another position in the destination register.
The rest of the bits in the destination register shall be unchanged.
Further documentation of these instructions can be found here:
https://docs.xilinx.com/v/u/en-US/ug984-vivado-microblaze-ref
With version 6 of the patch, no new relocation types are added as
this was unnecessary for adding the bsefi and bsifi instructions.
FIXED: Segfault caused by incorrect termination of microblaze_opcodes.
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Ibai Erkiaga <ibai.erkiaga-elorza@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
The elf psabi allows for mapping symbols to be of the form $x<ISA>.<any>
opcodes/
* riscv-dis.c (riscv_get_map_state): allow mapping symbol to
be suffixed by a unique identifier .<any>
This patches adds new bsefi and bsifi instructions.
BSEFI- The instruction shall extract a bit field from a
register and place it right-adjusted in the destination register.
The other bits in the destination register shall be set to zero.
BSIFI- The instruction shall insert a right-adjusted bit field
from a register at another position in the destination register.
The rest of the bits in the destination register shall be unchanged.
Further documentation of these instructions can be found here:
https://docs.xilinx.com/v/u/en-US/ug984-vivado-microblaze-ref
This patch has been tested for years of AMD Xilinx Yocto
releases as part of the following patch set:
https://github.com/Xilinx/meta-xilinx/tree/master/meta-microblaze/recipes-devtools/binutils/binutils
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Ibai Erkiaga <ibai.erkiaga-elorza@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
This patch moves instances of system register definitions, represented
by the SYSREG macro, out of their original place in `aarch64-opc.c'
and into a dedicated .def file, `aarch64-sys-regs.def'.
System register entries in this new file are ordered alphabetically by
name. This choice is made to enable the use of fast search algorithms
such as binary search when validating register names.
The SYSREG macro, defined as SYSREG (name, encoding, flags, features)
is kept as is and used in the def file, but all other SR_* macros
which previously served as indirections to SYSREG are removed.
opcodes/ChangeLog:
* aarch64-opc.c (SR_CORE): Macro definition and uses deleted.
(SR_FEAT): Likewise.
(SR_FEAT2): Likewise.
(SR_V8_1_A): Likewise.
(SR_V8_4_A): Likewise.
(SR_V8A): Likewise.
(SR_V8R): Likewise.
(SR_V8_1A): Likewise.
(SR_V8_2A): Likewise.
(SR_V8_3A): Likewise.
(SR_V8_4A): Likewise.
(SR_V8_6A): Likewise.
(SR_V8_7A): Likewise.
(SR_V8_8A): Likewise.
(SR_GIC): Likewise.
(SR_AMU): Likewise.
(SR_LOR): Likewise.
(SR_PAN): Likewise.
(SR_RAS): Likewise.
(SR_RNG): Likewise.
(SR_SME): Likewise.
(SR_SSBS): Likewise.
(SR_SVE): Likewise.
(SR_ID_PFR2): Likewise.
(SR_PROFILE): Likewise.
(SR_MEMTAG): Likewise.
(SR_SCXTNUM): Likewise.
(SR_EXPAND_ELx): Likewise.
(SR_EXPAND_EL12): Likewise.
* opcodes/aarch64-sys-regs.def: New.
This patch adds a mechanism for system register name alias detection
to register-matching mechanisms.
A new `F_REG_ALIAS' flag is added to the set of register flags and
used to label which entries in aarch64_sys_regs[] correspond to
aliases (and thus which CPENC values are non-unique in this array).
Where this is used is, for example, in `aarch64_print_operand' where,
in the case of system register decoding, the aarch64_sys_regs[] array
is iterated through until a match in CPENC value is made and the first
match accepted. If insufficient care is given in the ordering of
system registers in this array, the alias is encountered before the
"real" register and used incorrectly as the register name in the
disassembled output.
With this flag and the new `aarch64_sys_reg_alias_p' test, search
candidates corresponding to aliases can be conveniently skipped over.
One concrete example of where this is useful is with the
`trcextinselr0' system register. It was initially placed in the
system register list before `trcextinselr', in contrast to a more
natural alphabetical order.
include/ChangeLog:
* opcode/aarch64.h: add `aarch64_sys_reg_alias_p' prototype.
opcodes/ChangeLog:
* aarch64-opc.c (aarch64_sys_reg_alias_p): New.
(aarch64_print_operand): add aarch64_sys_reg_alias_p check.
(aarch64_sys_regs): Add F_REG_ALIAS flag to "trcextinselr"
entry.
* aarch64-opc.h (F_REG_ALIAS): New.
Following the folding of some generic AVX/AVX2 templates with their
AVX512F counterpart ones, do this for FMA ones as well, requiring one
further adjustment to cpu_flags_match().
In anticipation of APX introduce logic to reduce the number of templates
we have now, allowing to limit some the number of ones we then need to
gain.
The fundamental requirements are that
- attributes be compatible, which specifically means VexW needs to be
the same in the templates (which often isn't the case, for VEX
encodings having far more WIG tha, EVEX ones),
- the EVEX form being AVX512F (with or without AVX512VL), not any of its
extensions (the same will then be required for APX - it'll need to be
APX_F).
Note that in check_register() there's now a redundant zmm check. Since
this logic will need revisiting for APX anyway, I'd like to keep it that
way for now. (Similarly a couple of if()-s which could be folded are
kept separate, to reduce code churn when adding APX support.)
Add a macro pcaddi instruction to support "pcaddi rd, symbol".
pcaddi has a 20-bit signed immediate, it can address a +/- 2MB pc relative
address, and the address should be 4-byte aligned.
The AArch64 feature-flag code is currently limited to a maximum
of 64 features. This patch reworks it so that the limit can be
increased more easily. The basic idea is:
(1) Turn the ARM_FEATURE_FOO macros into an enum, with the enum
counting bit positions.
(2) Make the feature-list macros take an array index argument
(currently always 0). The macros then return the
aarch64_feature_set contents for that array index.
An N-element array would then be initialised as:
{ MACRO (0), ..., MACRO (N - 1) }
(3) Provide convenience macros for initialising an
aarch64_feature_set for:
- a single feature
- a list of individual features
- an architecture version
- an architecture version + a list of additional features
(2) and (3) use the preprocessor to generate static initialisers.
The main restriction was that uses of the same preprocessor macro
cannot be nested. So if a macro wants to do something for N individual
arguments, it needs to use a chain of N macros to do it. There then
needs to be a way of deriving N, as a preprocessor token suitable for
pasting.
The easiest way of doing that was to precede each list of features
by the number of features in the list. So an aarch64_feature_set
initialiser for three features A, B and C would be written:
AARCH64_FEATURES (3, A, B, C)
This scheme makes it difficult to keep AARCH64_FEATURE_CRYPTO as a
synonym for SHA2+AES, so the patch expands the former to the latter.
Now that CpuLM is used solely in cpu_arch_flags and cpu_arch[] while
Cpu64 is solely used in insn templates, they no longer need to be
treated different from other "ordinary" flags; the only "unusual" one
left if CpuNo64. Fold both, leaving just Cpu64.
Looking at the VEX and EVEX forms of vcvtneps2bf16 I noticed that
operand purpose isn't properly reflected in Vxy's definition. Rename
"dst" to "src", thus bringing things in line with Exy.
Recognize "/<number>" suffixes on both -march=+avx10.1 and the
corresponding .arch directive, setting an upper bound on the vector size
that insns may use. Such a restriction can be reset by setting a new base
architecture, by using a suffix-less form, by disabling AVX10, or by
enabling any other VEX/EVEX-based vector extension.
While for most insns we can suppress their use with too wide operands
via registers becoming unavailable (or in Intel syntax memory operand
size specifiers not being recognized), mask register insns have to have
their minimum required vector size specified in a new attribute. (Of
course this new attribute could also be used on other insns.)
Note that .insn continues to be permitted to emit EVEX{512,256} (and
VEX256 ones) encodings regardless of vector size restrictions in place.
Of course these can't be expressed using zmm (or ymm) operands then,
but need using the EVEX.512.* forms (broadcast forms may be usable right
now, but this may go away so shouldn't be relied upon). This is why no
assertions should be added to build_{e,}vex_prefix().
Since this is merely a re-branding of certain AVX512* features, there's
little code to be added.
The main aspect here are new testcases. In order to be able to re-use
some of the existing testcases, several of them need their start symbols
adjusted. Note that 256- and 128-bit tests want adding here, as these
need to work right away. Subsequently they'll gain vector length
constraints.
Since it was missing and is wanted here, also add an AVX512VL+VPOPCNTDQ
test.
These probably should have been put in place already anyway, but they're
very much wanted in order to then put AVX10.1 support on top. Note that
to avoid reverse dependencies towards SSE (just like we already do for
AVX and XOP), add_isa_dependencies() needs some further tweaking.
While there also address a related anomaly: Disabling AES but neither
AVX nor VAES (similarly for {,V}PCLMULQDQ) would better keep the 128-bit
VEX-encoded forms available. Note that for this the VAES insns are moved
past the AVX+AES ones, to avoid the property-11 test suddenly failing.
The test really is wrong, but let's not also make things inconsistent:
Without the movement, YMM use would be correctly recorded for the
128-bit forms simply because the first template already matches, as long
as VAES wasn't disabled. Yet it still wouldn't be if only AVX+AES were
enabled. Nor would behavior here then be the same as for VPCLMUL* insns.
gprofng uses insn_type in print_address_func().
But insn_type is always zero on aarch64.
opcodes/ChangeLog:
2023-09-07 Vladimir Mezentsev <vladimir.mezentsev@oracle.com>
* opcodes/aarch64-dis.c (print_insn_aarch64_word): Set insn_type for
branch instructions.
While the patch already committed for pr30793 prevents the asan error,
there is a problem: Now the last element of bundle_words never gets
written. That's very likely wrong, or KVXMAXBUNDLEWORDS is too big.
So this patch rearranges things a little to support writing of all of
bundle_words and does the parallel bit checking only when filling
bundle_words. In the normal case, kvx_reassemble_bundle will see
bundle_words[word_count-1] with the parallel bit clear and all other
words having it set. In the error case where all words in
bundle_words have the parallel bit set, kvx_reassemble_bundle will be
passed a wordcount of KVXMAXBUNDLEWORDS + 1. I've also made
kvx_reassemble_bundle return true for success rather than zero, and
removed the unnecessary check for zero wordcount.
PR 30793
* kvx-dis.c (kvx_reassemble_bundle): Return bool, true on success.
Fail if wordcount is too large. Don't check for wordcount zero.
Don't check kvx_has_parallel_bit.
(print_insn_kvx): Rewrite code reading bundle_words as a for loop.
Don't stop reading at KVXMAXBUNDLEWORDS - 1.
(decode_prologue_epilogue_bundle): Similarly.
The vendor operands should be named starting with `X', and preferably the
second letter (or multiple following letters) is enough to differentiate
them from other vendors.
Therefore, added letter `t' after `X' for t-head operands, to differentiate
from future different vendor's operands.
bfd/
* elfxx-riscv.c (riscv_supported_vendor_x_ext): Removed the vendor
document link since it should already be recorded in the
gas/doc/c-riscv.texi.
gas/
* config/tc-riscv.c (validate_riscv_insn): Added `t' after `X' for
t-head operands. Minor updates for indents and comments.
(riscv_ip): Likewise.
* doc/c-riscv.texi: Minor updates.
opcodes/
* riscv-dis.c (print_insn_args): Added `t' after `X' for t-head
operands. Minor updates for indents and comments.
* riscv-opc.c (riscv_opcode): Likewise.
There's no need to have almost identical code twice. Do away with
M_VMSGEU and instead simply use an unused (for these macros) field to
tell apart both variants.
The name we use internally isn't in line with the SDM, and also isn't in
line with CpuVPCLMULQDQ. Add the missing suffix, but of course leave
alone user facing names.
Commit 916fae9135 ("Add Size64 to movq/vmovq with Reg64 operand" was
right in adding the attribute to MOVQ, but there was no need to add it
to VMOVQ. (See also the AVX512F form, which doesn't have the attribute
either.)
For disassembly to only use spec-mandated aliases, respective non-alias
entries need to come ahead of their alias ones. Since identical
mnemonics need to stay together, whole groups are moved up where
necessary.
This partly reverts 839189bc93 ("RISC-V: re-arrange opcode table for
consistent alias handling"), but then also goes beyond a plain revert.
Reviewed-by: Tsukasa OI <research_trasio@irq.a4lg.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Although XVentanaCondOps instructions are XLEN-agonistic, Ventana's manual
only defines them only for RV64 (because all Ventana's processors implement
RV64).
This commit limits XVentanaCondOps instructions RV64-only to match the
behavior of the manual and LLVM.
Note that this commit alone will not make XVentanaCondOps extension with
RV32 invalid (it just makes XVentanaCondOps on RV32 empty).
opcodes/ChangeLog:
* riscv-opc.c (riscv_opcodes): Restrict "vt.maskc" and "vt.maskcn"
to XLEN=64.
gas/ChangeLog:
* testsuite/gas/riscv/x-ventana-condops-32.d: New failure test.
* testsuite/gas/riscv/x-ventana-condops-32.l: Likewise.
This patch sets GUILE to just plain 'guile'.
In the distant ("devo") past, the top-level build did support building
Guile in-tree. However, I don't think this really works any more.
For one thing, there are no build dependencies on it, so there's no
guarantee it would actually be built before the uses.
This patch also removes the use of "-s" as an option to cgen scheme
scripts. With my latest patch upstream, this is no longer needed.
After the upstream changes, either Guile 2 or Guile 3 will work, with
or without the compiler enabled.
2023-08-24 Tom Tromey <tom@tromey.com>
* cgen.sh: Don't pass "-s" to cgen.
* Makefile.in: Rebuild.
* Makefile.am (GUILE): Simplify.
i386: warning: format ‘%u’ expects argument of type ‘unsigned int’,
but argument 4 has type ‘size_t’ {aka ‘long unsigned int’} [-Wformat=]
ia64: warning: ignoring return value of ‘fgets’
* i386-gen.c (process_i386_opcodes): Correct format string.
* ia64-gen.c (load_insn_classes, load_depfile): Don't ignore
fgets return value.
opcodes/
* kvx-dis.c (print_insn_kvx): Change the loop condition so that
wordcount is always less than KVXMAXBUNDLEWORDS.
(decode_prologue_epilogue_bundle): Likewise.
Historically, flags and variables relating to architectural revisions
for the A-profile architecture omitted the trailing `A' such that, for
example, assembling for `-march=armv8.4-a' set the `AARCH64_ARCH_V8_4'
flag in the assembler.
This leads to some ambiguity, since Binutils also targets the
R-profile Arm architecture. Therefore, it seems prudent to have
everything associated with the A-profile cores end in `A' and likewise
`R' for the R-profile. Referring back to the example above, the flag
set for `-march=armv8.4-a' is better characterized if labeled
`AARCH64_ARCH_V8_4A'.
The only exception to the rule of appending `A' to variables is found
in the handling of the `AARCH64_FEATURE_V8' macro, as it is the
baseline from which ALL processors derive and should therefore be left
unchanged.
In reflecting the `ARM' architectural nomenclature choices, where we
have `ARM_ARCH_V8A' and `ARM_ARCH_V8R', the choice is made to not have
an underscore separating the numerical revision number and the
A/R-profile indicator suffix. This has meant that renaming of
R-profile related flags and variables was warranted, thus going from
`.*_[vV]8_[rR]' to `.*_[vV]8[rR]'.
Finally, this is more in line with conventions within GCC and adds consistency
across the toolchain.
gas/ChangeLog:
* gas/config/tc-aarch64.c:
(aarch64_cpus): Reference to arch feature macros updated.
(aarch64_archs): Likewise.
include/ChangeLog:
* include/opcode/aarch64.h:
(AARCH64_FEATURE_V8A): Updated name: V8_A -> V8A.
(AARCH64_FEATURE_V8_1A): A-suffix added.
(AARCH64_FEATURE_V8_2A): Likewise.
(AARCH64_FEATURE_V8_3A): Likewise.
(AARCH64_FEATURE_V8_4A): Likewise.
(AARCH64_FEATURE_V8_5A): Likewise.
(AARCH64_FEATURE_V8_6A): Likewise.
(AARCH64_FEATURE_V8_7A): Likewise.
(AARCH64_FEATURE_V8_8A):Likewise.
(AARCH64_FEATURE_V9A): Likewise.
(AARCH64_FEATURE_V8R): Updated name: V8_R -> V8R.
(AARCH64_ARCH_V8A_FEATURES): Updated name: V8_A -> V8A.
(AARCH64_ARCH_V8_1A_FEATURES): A-suffix added.
(AARCH64_ARCH_V8_2A_FEATURES): Likewise.
(AARCH64_ARCH_V8_3A_FEATURES): Likewise.
(AARCH64_ARCH_V8_4A_FEATURES): Likewise.
(AARCH64_ARCH_V8_5A_FEATURES): Likewise.
(AARCH64_ARCH_V8_6A_FEATURES): Likewise.
(AARCH64_ARCH_V8_7A_FEATURES): Likewise.
(AARCH64_ARCH_V8_8A_FEATURES): Likewise.
(AARCH64_ARCH_V9A_FEATURES): Likewise.
(AARCH64_ARCH_V9_1A_FEATURES): Likewise.
(AARCH64_ARCH_V9_2A_FEATURES): Likewise.
(AARCH64_ARCH_V9_3A_FEATURES): Likewise.
(AARCH64_ARCH_V8A): Updated name: V8_A -> V8A.
(AARCH64_ARCH_V8_1A): A-suffix added.
(AARCH64_ARCH_V8_2A): Likewise.
(AARCH64_ARCH_V8_3A): Likewise.
(AARCH64_ARCH_V8_4A): Likewise.
(AARCH64_ARCH_V8_5A): Likewise.
(AARCH64_ARCH_V8_6A): Likewise.
(AARCH64_ARCH_V8_7A): Likewise.
(AARCH64_ARCH_V8_8A): Likewise.
(AARCH64_ARCH_V9A): Likewise.
(AARCH64_ARCH_V9_1A): Likewise.
(AARCH64_ARCH_V9_2A): Likewise.
(AARCH64_ARCH_V9_3A): Likewise.
(AARCH64_ARCH_V8_R): Updated name: V8_R -> V8R.
opcodes/ChangeLog:
* opcodes/aarch64-opc.c (SR_V8A): Updated name: V8_A -> V8A.
(SR_V8_1A): A-suffix added.
(SR_V8_2A): Likewise.
(SR_V8_3A): Likewise.
(SR_V8_4A): Likewise.
(SR_V8_6A): Likewise.
(SR_V8_7A): Likewise.
(SR_V8_8A): Likewise.
(aarch64_sys_regs): Reference to arch feature macros updated.
(aarch64_pstatefields): Reference to arch feature macros updated.
(aarch64_sys_ins_reg_supported_p): Reference to arch feature macros
updated.
* opcodes/aarch64-tbl.h:
(aarch64_feature_v8_2a): a-suffix added.
(aarch64_feature_v8_3a): Likewise.
(aarch64_feature_fp_v8_3a): Likewise.
(aarch64_feature_v8_4a): Likewise.
(aarch64_feature_fp_16_v8_2a): Likewise.
(aarch64_feature_v8_5a): Likewise.
(aarch64_feature_v8_6a): Likewise.
(aarch64_feature_v8_7a): Likewise.
(aarch64_feature_v8r): Updated name: v8_r-> v8r.
(ARMV8R): Updated name: V8_R-> V8R.
(ARMV8_2A): A-suffix added.
(ARMV8_3A): Likewise.
(FP_V8_3A): Likewise.
(ARMV8_4A): Likewise.
(FP_F16_V8_2A): Likewise.
(ARMV8_5): Likewise.
(ARMV8_6A): Likewise.
(ARMV8_6A_SVE): Likewise.
(ARMV8_7A): Likewise.
(V8_2A_INSN): `A' added to macro symbol.
(V8_3A_INSN): Likewise.
(V8_4A_INSN): Likewise.
(FP16_V8_2A_INSN): Likewise.
(V8_5A_INSN): Likewise.
(V8_6A_INSN): Likewise.
(V8_7A_INSN): Likewise.
(V8R_INSN): Updated name: V8_R-> V8R.
kvx_dis_init currently always returns true, but error conditions do so
by "return -1" which converts to true. The return status is ignored
anyway, and it doesn't make much sense to error on unexpected arch or
mach: If print_insn_kvx is called then the atch is known to be kvx,
and it's better to choose some default for a user passing an unknown
mach value rather than segfaulting in decode_insn when env.opc_table
is NULL.
I've chosen the default mach to be bfd_mach_kv3_1, the default in
bfd/cpu-kvx.c, not that it matters very much. In normal objdump/gdb
usage, info->mach won't be an unexpected value.
* kvx-dis.c (kvx_dis_init): Return void. Don't error on
unexpected arch or mach. Default to bfd_mach_kv3_1 for
unknown mach. Don't clear info->disassembler_options.
The neg/neg32 BPF instructions always use BPF_SRC_K (=0) in their header
source bit, despite operating on registers. If BPF_SRC_X (=1) is set,
the instructions are rejected by the kernel.
Because of this there are also no neg/neg32 instructions which operate
on immediates, so remove them.
bd434cc4d9 was a similar fix in the old
CGEN-based port, but was not carried forward in the new port.
include/
* opcode/bpf.h (enum bpf_insn_id): Remove spurious entries
BPF_INSN_NEGI and BPF_INSN_NEG32I.
opcodes/
* bpf-opc.c (bpf_opcodes): Remove erroneous NEGI and NEG32I
instructions.
gas/
* doc/c-bpf.texi (BPF Instructions): Remove erroneous neg and
neg32 instructions operating on immediates.
* testsuite/gas/bpf/alu.s: Adapt accordingly.
* testsuite/gas/bpf/alu.d: Likewise.
* testsuite/gas/bpf/alu-be.d: Likewise
* testsuite/gas/bpf/alu32.s: Likewise.
* testsuite/gas/bpf/alu32.d: Likewise.
* testsuite/gas/bpf/alu32-be.d: Likewise.
* testsuite/gas/bpf/alu-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu32-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-be-pseudoc.d: Likewise.
I had reason yesterday to want to regenerate configury files which I
do with --enable-maintainer-mode, and added --enable-cgen-maint
accidentally. The first problem I hit is that sim looks for cgen in a
different directory by default than opcodes, and I had my source
layout set up for opcodes rather than sim. Fix that by making both
use ../cgen first, then ../../cgen relative to sim/ and opcodes/. The
next problem was that various sim local.mk files expected generated
sources in the build dir rather than the source dir. Fix that by
adding $(srcdir) to paths. Finally, the generated iq2000 files had a
compile error, fixed by the cpu/iq2000.cpu patch.
cpu/
* iq2000.cpu (syscall): Add pc arg.
opcodes/
* configure.ac (cgendir): Default to ../../cgen, but use ../cgen
if found there.
* configure: Regenerate.
sim/m4/
* sim_ac_option_cgen_maint.m4 (cgendir): Look in ../cgen too.
sim/
* cris/local.mk: Add $(srcdir) to paths for regenerated source.
* frv/local.mk: Likewise.
* iq2000/local.mk: Likewise.
* lm32/local.mk: Likewise.
* m32r/local.mk: Likewise.
* or1k/local.mk: Likewise.
* Makefile.in: Regenerate.
* configure: Regenerate.
The documentation of the 'Zfa' extension states that "fli.h" is available
"if the Zfh or Zvfh extension is implemented" (both the latest and the
oldest editions are checked).
This fact was not reflected in Binutils ('Zvfh' implies 'Zfhmin', not full
'Zfh' extension and "fli.h" required 'Zfh' and 'Zfa' extensions).
This commit makes "fli.h" also available when both 'Zfa' and 'Zvfh'
extensions are implemented.
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): Add new
instruction class handling.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* testsuite/gas/riscv/zfa-zvfh.s: New test.
* testsuite/gas/riscv/zfa-zvfh.d: Ditto.
include/ChangeLog:
* opcode/riscv.h (enum riscv_insn_class): Add new instruction
class.
opcodes/ChangeLog:
* riscv-opc.c (riscv_opcodes): Change instruction class of "fli.h"
from INSN_CLASS_ZFH_AND_ZFA to new INSN_CLASS_ZFH_OR_ZVFH_AND_ZFA.
This commit adds 'Zihintntl' extension and its hint instructions.
This is based on:
<0dc91f505e>,
the first ISA Manual noting that the 'Zihintntl' extension is ratified.
Note that compressed 'Zihintntl' hints require either 'C' or
'Zca' extension.
Co-authored-by: Nelson Chu <nelson@rivosinc.com>
bfd/ChangeLog:
* elfxx-riscv.c (riscv_supported_std_z_ext): Add 'Zihintntl'
standard hint 'Z' extension.
(riscv_multi_subset_supports): Support new instruction classes.
(riscv_multi_subset_supports_ext): Likewise.
gas/ChangeLog:
* testsuite/gas/riscv/zihintntl.s: New test for 'Zihintntl'
including auto-compression without C prefix and explicit C prefix.
* testsuite/gas/riscv/zihintntl.d: Likewise.
* testsuite/gas/riscv/zihintntl-na.d: Likewise.
* testsuite/gas/riscv/zihintntl-base.s: New test for correspondence
between 'Zihintntl' and base 'I' or 'C' instructions.
* testsuite/gas/riscv/zihintntl-base.d: Likewise.
include/ChangeLog:
* opcode/riscv.h (enum riscv_insn_class): Add new instruction
classes: INSN_CLASS_ZIHINTNTL and INSN_CLASS_ZIHINTNTL_AND_C.
(MASK_NTL_P1, MATCH_NTL_P1, MASK_NTL_PALL,
MATCH_NTL_PALL, MASK_NTL_S1, MATCH_NTL_S1, MASK_NTL_ALL,
MATCH_NTL_ALL, MASK_C_NTL_P1, MATCH_C_NTL_P1, MASK_C_NTL_PALL,
MATCH_C_NTL_PALL, MASK_C_NTL_S1, MATCH_C_NTL_S1, MASK_C_NTL_ALL,
MATCH_C_NTL_ALL): New.
opcodes/ChangeLog:
* riscv-opc.c (riscv_opcodes): Add instructions from the
'Zihintntl' extension.
The longest register name is 4 characters (plus a nul one), so using a
4- or 8-byte pointer to get at it is neither space nor time efficient.
Embed the names right into the array. For PIE this also reduces the
number of base relocations in the final image.
To avoid old gcc, when generating 32-bit code, bogusly warning about
bounds being exceeded in the code processing Cs/Cw, Ct/Cx, and CD,
an adjustment to EXTRACT_BITS() is needed: This macro shouldn't supply
a 64-bit value, and it also doesn't need to - all operand fields to
date are far more narrow than 32 bits. This in turn allows dropping a
number of casts elsewhere.
This regenerates config files changed by the previous 44 commits.
Note that subject lines in these commits mostly match the gcc git
originating commit.
The table constantly growing in two dimensions (number of table entries
times number of ISA extension flags) doesn't scale very well. Use a more
compact representation: Only identifiers which need to combine with
other identifiers retain individual flag bits. All others are combined
into an enum, with a new helper added to transform the table entries
into the original i386_cpu_flags layout. This way the table in the final
binary shrinks by almost a third (the generated source code shrinks by
about half), and isn't likely to grow again in that dimension any time
soon.
While moving the 3DNow! fields, drop the stray inner 'a' from their
names.
Their check_func should be "match_never", not "match_opcode". The reasons
this error did not cause any disassembler errors are:
1. The problem will not reproduce if "no-aliases" is specified
(because macro instructions are handled as aliases).
2. If not, all affected compressed instructions or their aliases
precede before "vmsge{,u}.vx" macro instructions.
However, it'll easily break if we reorder opcode entries. This commit
fixes this issue before the *accident* occurs.
opcodes/ChangeLog:
* riscv-opc.c (riscv_opcodes): Make sure that we never match to
vmsge{,u}.vx instructions unless specified in the assembler.
The 32-bit non-fetching atomic instructions treat the source register as
32-bits, which means in the pseudo-c syntax the "w" registers should be
used rather than the "r" registers.
opcodes/
* bpf-opc-c (bpf_opcodes): Use %sw for AAD32, AOR32, AAND32
and AXOR32 pseudo-c dialect asm templates.
gas/
* testsuite/gas/bpf/atomic-be-pseudoc.d: Use "w" for source reg
in non-fetching 32-bit atomic instructions.
* testsuite/gas/bpf/atomic-pseudoc.d: Likewise.
* testsuite/gas/bpf/atomic-pseudoc.s: Likewise.
It makes little sense to have this comment meanwhile over a hundred
lines ahead of the array. In fact until spotting the comment, I was
wondering why those pretty important aspects aren't spelled out
anywhere.
Since I was poking at cris-dis.c to avoid the sanitizer warning,
I figure I might as well make use of stpcpy and sprintf return value
in other places in this file.
* cris-dis.c (format_hex): Use sprintf return value.
(format_reg): Use stpcpy and sprintf return, avoiding strlen.
(format_sup_reg): Likewise.
Simplify the sprintf calls, and use sprintf return value. Older code
in binutils avoided using the sprintf return count of chars printed,
because with some older C libraries it wasn't reliable. Nowadays it
should be OK to use (and we already use the return value elsewhere).
sprintf can't return an error status of -1 here.
* cris-dis.c (format_dec): Avoid sanitizer warning. Use sprintf
return value rather than calling strlen.
This patch fixes a regression recently introduced in the BPF
disassembler, that was assuming an abfd was always available in
info->section->owner. Apparently this is not so in GDB, and therefore
https://sourceware.org/bugzilla/show_bug.cgi?id=30705.
Tested in bpf-unkonwn-none.
opcodes/ChangeLog:
2023-07-31 Jose E. Marchesi <jose.marchesi@oracle.com>
PR 30705
* bpf-dis.c (print_insn_bpf): Check that info->section->owner is
actually available before using it.
This patch adds support for EF_BPF_CPUVER bits in the ELF
machine-dependent header flags. These bits encode the BPF CPU
version for which the object file has been compiled for.
The BPF assembler is updated so it annotates the object files it
generates with these bits.
The BPF disassembler is updated so it honors EF_BPF_CPUVER to use the
appropriate ISA version if the user didn't specify an explicit ISA
version in the command line. Note that a value of zero in
EF_BPF_CPUVER is interpreted by the disassembler as "use the later
supported version" (the BPF CPU versions start with v1.)
The readelf utility is updated to pretty print EF_BPF_CPUVER when it
prints out the ELF header:
$ readelf -h a.out
ELF Header:
...
Flags: 0x4, CPU Version: 4
Tested in bpf-unknown-none.
include/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* elf/bpf.h (EF_BPF_CPUVER): Define.
* opcode/bpf.h (BPF_XBPF): Change from 0xf to 0xff so it fits in
EF_BPF_CPUVER.
binutils/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* readelf.c (get_machine_flags): Recognize and pretty print BPF
machine flags.
opcodes/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-dis.c: Initialize asm_bpf_version to -1.
(print_insn_bpf): Set BPF ISA version from the cpu version ELF
header flags if no explicit version set in the command line.
* disassemble.c (disassemble_init_for_target): Remove unused code.
gas/ChangeLog:
2023-07-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/tc-bpf.h (elf_tc_final_processing): Define.
* config/tc-bpf.c (bpf_elf_final_processing): New function.
This patch fixes the BPF_INSN_NEGR and BPF_INSN_NEG32R BPF
instructions to not use their source registers.
Tested in bpf-unknown-none.
opcodes/ChangeLog:
2023-07-26 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Fix BPF_INSN_NEGR to not use a src
register.
gas/ChangeLog:
2023-07-26 Jose E. Marchesi <jose.marchesi@oracle.com>
* testsuite/gas/bpf/alu.s: The register neg instruction gets only
one argument.
* testsuite/gas/bpf/alu32-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu-be.d: Likewise.
* testsuite/gas/bpf/alu.d: Likewise.
* testsuite/gas/bpf/alu32-be.d: Likewise.
* testsuite/gas/bpf/alu32.d: Likewise.
* testsuite/gas/bpf/alu32.s: Likewise.
* doc/c-bpf.texi (BPF Instructions): Update accordingly.
This patch adds support for the BPF V4 ISA byte swap instructions to
opcodes, assembler and disassembler.
Tested in bpf-unknown-none.
include/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* opcode/bpf.h (BPF_IMM32_BSWAP16): Define.
(BPF_IMM32_BSWAP32): Likewise.
(BPF_IMM32_BSWAP64): Likewise.
(enum bpf_insn_id): New entries BPF_INSN_BSWAP{16,32,64}.
opcodes/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Add entries for the BSWAP*
instructions.
gas/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* doc/c-bpf.texi (BPF Instructions): Document BSWAP* instructions.
* testsuite/gas/bpf/alu.s: Test BSWAP{16,32,64} instructions.
* testsuite/gas/bpf/alu.d: Likewise.
* testsuite/gas/bpf/alu-be.d: Likewise.
* testsuite/gas/bpf/alu-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu-be-pseudoc.d: Likewise.
This patch fixes the pseudoc syntax of the V4 instructions MOVS* and
LDXS* in order to reflect https://reviews.llvm.org/D144829.
opcodes/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Fix pseudo-c syntax for MOVS* and LDXS*
instructions.
gas/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* doc/c-bpf.texi (BPF Instructions): Fix pseudoc syntax for MOVS*
and LDXS* instructions.
* testsuite/gas/bpf/mem-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu32-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-be-pseudoc.d: Likewise.
This patch adds support for the V4 BPF instruction jal/gotol, which is
like ja/goto but it supports a signed 32-bit PC-relative (in number of
64-bit words minus one) target operand instead of the 16-bit signed
operand of the other instruction. This greatly increases the jump
range in BPF programs.
Tested in bpf-unkown-none.
bfd/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* reloc.c: New reloc BFD_RELOC_BPF_DISPCALL32.
* elf64-bpf.c (bpf_reloc_type_lookup): Handle the new reloc.
* libbfd.h (bfd_reloc_code_real_names): Regenerate.
gas/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* config/tc-bpf.c (struct bpf_insn): New field `id'.
(md_assemble): Save the ids of successfully parsed instructions
and use the new BFD_RELOC_BPF_DISPCALL32 whenever appropriate.
(md_apply_fix): Adapt to the new BFD reloc.
* testsuite/gas/bpf/jump.s: Test JAL.
* testsuite/gas/bpf/jump.d: Likewise.
* testsuite/gas/bpf/jump-pseudoc.d: Likewise.
* testsuite/gas/bpf/jump-be.d: Likewise.
* testsuite/gas/bpf/jump-be-pseudoc.d: Likewise.
* doc/c-bpf.texi (BPF Instructions): Document new instruction
jal/gotol.
Document new operand type disp32.
include/ChangeLog:
2023-07-24 Jose E. Marchesi <jose.marchesi@oracle.com>
* opcode/bpf.h (enum bpf_insn_id): Add entry BPF_INSN_JAL.
(enum bpf_insn_id): Remove spurious entry BPF_INSN_CALLI.
opcodes/ChangeLog:
2023-07-23 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Add entry for jal.
This tiny patch makes the BPF disassembler to emit, e.g.
ldxdw %r1, [%r0+0]
instead of
ldxdw %r1, [%r00]
when the offset is 0, to avoid confusion.
opcodes/
* bpf-dis.c (print_insn_bpf): Print offsets with value 0 as "+0".
gas/
* testsuite/gas/bpf/mem.s: Add tests with offset 0.
* testsuite/gas/bpf/mem-pseudoc.s: Likewise.
* testsuite/gas/bpf/mem.d: Update accordingly.
* testsuite/gas/bpf/mem-be.d: Likewise.
* testsuite/gas/bpf/mem-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-be-pseudoc.d: Likewise.
This commit adds the signed load to register (ldxs*) instructions
introduced in the BPF ISA version 4, including opcodes and assembler
tests.
Tested in bpf-unknown-none.
include/ChangeLog:
2023-07-21 Jose E. Marchesi <jose.marchesi@oracle.com>
* opcode/bpf.h (enum bpf_insn_id): Add entries for signed load
instructions.
(BPF_MODE_SMEM): Define.
opcodes/ChangeLog:
2023-07-21 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Add entries for LDXS{B,W,H,DW}
instructions.
gas/ChangeLog:
2023-07-21 Jose E. Marchesi <jose.marchesi@oracle.com>
* testsuite/gas/bpf/mem.s: Add signed load instructions.
* testsuite/gas/bpf/mem-pseudoc.s: Likewise.
* testsuite/gas/bpf/mem.d: Likewise.
* testsuite/gas/bpf/mem-pseudoc.d: Likewise.
* testsuite/gas/bpf/mem-be.d: Likewise.
* doc/c-bpf.texi (BPF Instructions): Document the signed load
instructions.
This commit adds the signed register move (movs) instructions
introduced in the BPF ISA version 4, including opcodes and assembler
tests.
Tested in bpf-unknown-none.
include/ChangeLog:
2023-07-21 Jose E. Marchesi <jose.marchesi@oracle.com>
* opcode/bpf.h (BPF_OFFSET16_MOVS8): Define.
(BPF_OFFSET16_MOVS16): Likewise.
(BPF_OFFSET16_MOVS32): Likewise.
(enum bpf_insn_id): Add entries for MOVS{8,16,32}R and
MOVS32{8,16,32}R.
opcodes/ChangeLog:
2023-07-21 Jose E. Marchesi <jose.marchesi@oracle.com>
* bpf-opc.c (bpf_opcodes): Add entries for MOVS{8,16,32}R and
MOVS32{8,16,32}R instructions. and MOVS32I instructions.
gas/ChangeLog:
2023-07-21 Jose E. Marchesi <jose.marchesi@oracle.com>
* testsuite/gas/bpf/alu.s: Test movs instructions.
* testsuite/gas/bpf/alu-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu32.s: Likewise for movs32 instruction.
* testsuite/gas/bpf/alu32-pseudoc.s: Likewise.
* testsuite/gas/bpf/alu.d: Add expected results.
* testsuite/gas/bpf/alu32.d: Likewise.
* testsuite/gas/bpf/alu-be.d: Likewise.
* testsuite/gas/bpf/alu32-be.d: Likewise.
* testsuite/gas/bpf/alu-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu-be-pseudoc.d: Likewise.
* testsuite/gas/bpf/alu32-be-pseudoc.d: Likewise.
This was breaking --enable-targets=all builds.
opcodes/ChangeLog:
2023-07-21 Jose E. Marchesi <jose.marchesi@oracle.com>
* Makefile.am (TARGET64_LIBOPCODES_CFILES): Add missing bpf-dis.c
* Makefile.in: Regenerate.
CGEN is cool, but the BPF architecture is simply too bizarre for it.
The weird way of BPF to handle endianness in instruction encoding, the
weird C-like alternative assembly syntax, the weird abuse of
multi-byte (or infra-byte) instruction fields as opcodes, the unusual
presence of opcodes beyond the first 32-bits of some instructions, are
all examples of what makes it a PITA to continue using CGEN for this
port. The bpf.cpu file is becoming so complex and so nested with
p-macros that it is very difficult to read, and quite challenging to
update. Also, every time we are forced to change something in CGEN to
accommodate BPF requirements (which is often) we have to do extensive
testing to make sure we do not break any other target using CGEN.
This is getting un-maintenable.
So I have decided to bite the bullet and revamp/rewrite the port so it
no longer uses CGEN. Overall, this involved:
* To remove the cpu/bpf.{cpu,opc} descriptions.
* To remove the CGEN generated files.
* To replace the CGEN generated opcodes table with a new hand-written
opcodes table for BPF.
* To replace the CGEN generated disassembler wih a new disassembler
that uses the new opcodes.
* To replace the CGEN generated assembler with a new assembler that uses the
new opcodes.
* To replace the CGEN generated simulator with a new simulator that uses the
new opcodes. [This is pushed in GDB in another patch.]
* To adapt the build systems to the new situation.
Additionally, this patch introduces some extensions and improvements:
* A new BPF relocation BPF_RELOC_BPF_DISP16 plus corresponding ELF
relocation R_BPF_GNU_64_16 are added to the BPF BFD port. These
relocations are used for section-relative 16-bit offsets used in
load/store instructions.
* The disassembler now has support for the "pseudo-c" assembly syntax of
BPF. What dialect to use when disassembling is controlled by a command
line option.
* The disassembler now has support for dumping instruction immediates in
either octal, hexadecimal or decimal. The used output base is controlled
by a new command-line option.
* The GAS BPF test suite has been re-structured and expanded in order to
test the disassembler pseudoc syntax support. Minor bugs have been also
fixed there. The assembler generic tests that were disabled for bpf-*-*
targets due to the previous implementation of pseudoc syntax are now
re-enabled. Additional tests have been added to test the new features of
the assembler. .dump files are no longer used.
* The linker BPF test suite has been adapted to the command line options
used by the new disassembler.
The result is very satisfactory. This patchs adds 3448 lines of code
and removes 10542 lines of code.
Tested in:
* Target bpf-unknown-none with 64-bit little-endian host and 32-bit
little-endian host.
* Target x86-64-linux-gnu with --enable-targets=all
Note that I have not tested in a big-endian host yet. I will do so
once this lands upstream so I can use the GCC compiler farm.
I have not included ChangeLog entries in this patch: these would be
massive and not very useful, considering this is pretty much a rewrite
of the port. I beg the indulgence of the global maintainers.
Bring disassembly back in line with what the assembler accepts, thus
also making it self-consistent (with, in particular selector load/store
insns). While there further add D to all affected insns except ARPL
(where S is used, matching LAR/LSL), to also behave correctly in suffix-
always mode.
While there also hook up the Intel variant of the LKGS test.
For whatever reason in c9f5b96bda ("x86: correct handling of LAR and
LSL") I didn't realize that we can easily use Sv instead of going
through mod_table[]. Redo this aspect of that change.
This patch support Zcb extension, contains new compressed instructions,
some instructions depend on other existed extension, like 'zba', 'zbb'
and 'zmmul'. Zcb also imply Zca extension to enable the compressing
features.
Co-Authored by: Charlie Keaney <charlie.keaney@embecosm.com>
Co-Authored by: Mary Bennett <mary.bennett@embecosm.com>
Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Co-Authored by: Sinan Lin <sinan.lin@linux.alibaba.com>
Co-Authored by: Simon Cook <simon.cook@embecosm.com>
Co-Authored by: Shihua Liao <shihua@iscas.ac.cn>
Co-Authored by: Yulong Shi <yulong@iscas.ac.cn>
bfd/ChangeLog:
* elfxx-riscv.c (riscv_multi_subset_supports): New extension.
(riscv_multi_subset_supports_ext): Ditto.
gas/ChangeLog:
* config/tc-riscv.c (validate_riscv_insn): New operators.
(riscv_ip): Ditto.
* testsuite/gas/riscv/zcb.d: New test.
* testsuite/gas/riscv/zcb.s: New test.
include/ChangeLog:
* opcode/riscv-opc.h (MATCH_C_LBU): New opcode.
(MASK_C_LBU): New mask.
(MATCH_C_LHU): New opcode.
(MASK_C_LHU): New mask.
(MATCH_C_LH): New opcode.
(MASK_C_LH): New mask.
(MATCH_C_SB): New opcode.
(MASK_C_SB): New mask.
(MATCH_C_SH): New opcode.
(MASK_C_SH): New mask.
(MATCH_C_ZEXT_B): New opcode.
(MASK_C_ZEXT_B): New mask.
(MATCH_C_SEXT_B): New opcode.
(MASK_C_SEXT_B): New mask.
(MATCH_C_ZEXT_H): New opcode.
(MASK_C_ZEXT_H): New mask.
(MATCH_C_SEXT_H): New opcode.
(MASK_C_SEXT_H): New mask.
(MATCH_C_ZEXT_W): New opcode.
(MASK_C_ZEXT_W): New mask.
(MATCH_C_NOT): New opcode.
(MASK_C_NOT): New mask.
(MATCH_C_MUL): New opcode.
(MASK_C_MUL): New mask.
(DECLARE_INSN): New opcode.
* opcode/riscv.h (EXTRACT_ZCB_BYTE_UIMM): New inline func.
(EXTRACT_ZCB_HALFWORD_UIMM): Ditto.
(ENCODE_ZCB_BYTE_UIMM): Ditto.
(ENCODE_ZCB_HALFWORD_UIMM): Ditto.
(VALID_ZCB_BYTE_UIMM): Ditto.
(VALID_ZCB_HALFWORD_UIMM): Ditto.
(enum riscv_insn_class): New extension class.
opcodes/ChangeLog:
* riscv-dis.c (print_insn_args): New operators.
* riscv-opc.c: New instructions.
First of all it is entirely unclear why THREE_BYTE_TABLE_PREFIX() was
introduced by bf890a93a7. Nothing uses the .prefix_requirement values
from the two relevant entries.
And then having VEX_Cn_TABLE() and friends take arguments is misleading.
These aren't used (or pointlessly used in the case of VEX_C5_TABLE); the
respective table index is decoded from the insn (or implied in the case
of VEX_C5_TABLE).
Several already use OP_R(), which rejects the memory forms of insns, and
a few others can easily be converted to do so as well. Note that for it
to be able to use BadOp() without forward declaration, OP_Skip_MODRM() is
moved down.
While there add the previously missing PREFIX_OPCODE to legacy opcode
0FD7.
Now that we have OP_R(), use it here as well, while wiring memory-only
operands to OP_M() at the same time. To keep the number of consumed
opcode bytes similar to before, make BadOp() also account for VEX/XOP/
EVEX prefix bytes. To keep that change simple, convert need_vex to an
actual count of prefix bytes (keeping intact all prior boolean uses of
the field).
Note how this improves disassembly of such bad encodings, by at least
leaving a hint towards what a "nearby" instruction is. (For KSHIFT*
change the immediates test testcases use, such that disassembly remains
sufficiently in sync.)
While there also use Ux for VPMOV{B,W,D,Q}2M, where decoding through
mod_table[] was missing in the earlier scheme.
Fold OP_MS() and OP_XS() into OP_R(), paralleling OP_M(). Use operand
names (largely) matching those in the SDM. For 128-bit-only forms use
Uxmm though, marking 256-bit forms as bad. This then allows no longer
going through vex_len_table[] for two of the insns.
Specifically _do not_ continue to mis-use v_mode.
Several already use OP_M(), which rejects the register forms of insns,
and a few others can easily be converted to do so as well. (Note that
FXSAVE_Fixup() wires through to OP_M(). Note further that OP_IndirE(),
which wasn't placed very well anyway, is moved down to avoid the need to
forward-declare BadOp().)
Also adjust formatting of and drop PREFIX_OPCODE from a few adjacent
entries.
Most of them use Mx already for the memory operand, which rejects the
register form of the insn. Use that operand type also for the two EVEX
forms which so far have used EXEvexXNoBcst (and thus failed to reject
the register forms), compensating by flagging broadcast as bad for all
Mx. This way several other insns which don't permit embedded broadcast
either are also covered at the same time.
By changing decode order to do ModR/M.mod last (rather than VEX.L), the
VEX entries (which are already reused by EVEX decoding) can be folded
with their legacy counterparts as well. Note how this change of decode
order also allows removing two auxiliary #define-s, which were
introduced during earlier folding (because of that unhelpful order of
steps).
Introduce macro V to expand to 'v' in the VEX/EVEX case, and replace a
couple of abort()s where legacy code can now legitimately make it. While
there for {,V}LDDQU drop hoing through mod_table[] - OP_M() rejects
register operands quite fine.