The BPF pseudo-c syntax supports both MOV and LDDW instructions:
mov: r1 = EXPR
lddw: r1 = EXPR ll
Note that the white space between EXPR and `ll' is necessary in order
to avoid ambiguity with the assembler's support for C-like numerical
suffixes. This patch adds a new test to the GAS BPF testsuite to make
sure that instructions like:
r1 = 666ll
are interpreted as `mov %r1,666', not as `lddw %r1,666'.
This matches clang's assembler behavior.
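The white-space rule can be pictured with a small toy classifier. This is only a sketch of the rule as stated above, not the GAS parser; the function name `classify` and the regular expressions are made up for illustration.

```python
import re

# Toy classifier for BPF pseudo-C assignments of the form "rN = EXPR [ll]".
# Hypothetical helper: it only illustrates the white-space rule.
_ASSIGN = re.compile(r"^\s*(r\d+)\s*=\s*(.+?)\s*$")

def classify(stmt: str) -> str:
    """Return 'lddw' if EXPR is followed by a separate `ll' token, else 'mov'."""
    m = _ASSIGN.match(stmt)
    if m is None:
        raise ValueError(f"not an assignment: {stmt!r}")
    rhs = m.group(2)
    # `ll' marks lddw only when white space separates it from the
    # expression; "666ll" is a number carrying a C-like suffix.
    if re.search(r"\s+ll$", rhs):
        return "lddw"
    return "mov"
```

Under this model `classify("r1 = 666ll")` is treated as a mov, while `classify("r1 = 666 ll")` is treated as an lddw.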
2023-10-30 Jose E. Marchesi <jose.marchesi@oracle.com>
* testsuite/gas/bpf/alu-pseudoc.s: Add test to make sure C-like
suffix `ll' is not interpreted as lddw syntax.
* testsuite/gas/bpf/alu-pseudoc.d: Update expected results.
* testsuite/gas/bpf/alu-be-pseudoc.d: Likewise.
We normally generate these kinds of relocations with data directives.
Consider the following example,
.word (A + 3) - (B + 2)
GAS will generate an ADD/SUB pair for this,
R_RISCV_ADD, A + 1
R_RISCV_SUB, 0
The addend of R_RISCV_SUB will always be zero, and the sum of the
constants will be stored in the addend of R_RISCV_ADD/SET. Therefore,
we can always add the addend of these data relocations when doing relocations.
Unfortunately, I have heard that using .reloc to generate the data
relocations makes the relocations fail. Consider this,
.reloc offset, R_RISCV_ADD32, A + 3
.reloc offset, R_RISCV_SUB32, B + 2
.word 0
Then we can get the relocations as follows,
R_RISCV_ADD, A + 3
R_RISCV_SUB, B + 2
Then the current LD computes the relocation as B - A + 3 + 2, which is
obviously wrong.
So first of all, this patch fixes the wrong relocation behavior of
R_RISCV_SUB* relocations.
Next, consider the uleb128 directive. For now we get a pair of
SET_ULEB128/SUB_ULEB128 relocations for it,
.uleb128 (A + 3) - (B + 2)
R_RISCV_SET_ULEB128, A + 1
R_RISCV_SUB_ULEB128, B + 1
This also looks obviously wrong: the sum of the constants should only
be stored in the addend of SET_ULEB128, and the addend of SUB_ULEB128 should
be zero, like other SUB relocations. The current LD still gets the right
relocation values, but only by accident, since we only add the addend of
SUB_ULEB128.
Anyway, this patch also fixes the behaviors above, to make sure that whether
we use .uleb128 or .reloc directives, we always get the right values.
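The intended arithmetic can be sketched as follows. This is a simplified model, not the code in elfnn-riscv.c: the helper name `apply_relocs`, the tuple layout and the symbol addresses are assumptions made for illustration. It treats ADD as "field += S + A" and SUB as "field -= S + A", so both ways of writing the expression resolve to the same value.

```python
def apply_relocs(relocs, symbols, initial=0, bits=32):
    """Apply (kind, symbol, addend) data relocations to one field."""
    mask = (1 << bits) - 1
    value = initial
    for kind, sym, addend in relocs:
        term = symbols[sym] + addend
        if kind == "ADD":
            value += term      # field += S + A
        elif kind == "SUB":
            value -= term      # field -= S + A, so the addend is subtracted
        else:
            raise ValueError(f"unknown relocation kind: {kind}")
    return value & mask

# Hypothetical symbol addresses, chosen only for the example.
symbols = {"A": 0x1000, "B": 0x0F00}

# `.word (A + 3) - (B + 2)': constants folded into the ADD addend.
from_directive = [("ADD", "A", 1), ("SUB", "B", 0)]
# The same expression written by hand with .reloc.
from_reloc = [("ADD", "A", 3), ("SUB", "B", 2)]
```

With these inputs both lists evaluate to A - B + 1, which is what `(A + 3) - (B + 2)` should yield.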
bfd/
* elfnn-riscv.c (perform_relocation): Clarify that SUB relocations
should subtract the addend, rather than add.
(riscv_elf_relocate_section): Since SET_ULEB128 won't go into
perform_relocation, we should add its addend here in advance.
gas/
* config/tc-riscv.c (riscv_insert_uleb128_fixes): Set the addend of
SUB_ULEB128 to zero since it should already be added into the addend
of SET_ULEB128.
The as and ld use _bfd_error_handler to output error messages when
checking relocation alignment and relocation overflow. However, the
abfd value passed by as to the function is NULL, resulting in an
internal error. The ld passes a non-null value to the function,
so it can output an error message normally.
First of all add f32_5[], which allows eliminating the extra slot-is-NULL
code from i386_output_nops(). Then also introduce f32_8[] and f16_5[]
following the same concept of adding a %cs segment override prefix.
Also re-use patterns when possible and correct comments as applicable.
Similarly re-use testcase expectations as much as possible, where they
need touching anyway.
The two are distinct in opcodes/, distinguished precisely by CpuNOP
that's relevant in i386_generate_nops(), yet the function has the PPro
case label in the other group. Simply removing it revealed that
cpu_arch[] had a wrong entry for i686.
While there also add PROCESSOR_IAMCU to the respective comment.
Making GENERIC64 a special case was never correct; prior to the
generalization of ".arch .no*" to cover all ISA extensions other
processor families supporting long NOPs should have been covered as
well. When introducing ".arch .nonops" (among others) it wasn't
apparent that a hidden implication of .cpunop not being possible to
separately turn off existed here. Seeing that the two large case label
blocks in the 2nd switch() already had identical behavior, simply
collapse all of the (useful) case labels into a single "default" one.
Since we don't key the NOP selection to user-controlled properties, we
may not use i386 features; otherwise we would violate a possible .arch
directive restricting ISA to pre-386.
Except for the shared 1- and 2-byte cases, the LEA uses corrupt %rsi
(by zero-extending %esi to %rsi). Introduce separate 64-bit patterns
which keep %rsi intact.
What matters is what was in effect at the time the original directive
was issued. Later changes to global state (bitness or ISA) must not
affect what code is generated.
The recorded value, and not the global variable, will want using in
TC_FRAG_INIT(). The so far file scope variable therefore needs to become
external, to be accessible there.
This patch makes a cosmetic change to the reloc_weaksym.s
by making the bneid instruction all lower case like all of
the other instructions in the example.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
This patch adds the R_MICROBLAZE_32_NONE relocation type.
This is a 32-bit reloc that stores the 32-bit pc relative
value in two words (with an imm instruction).
Add test case to gas test suite.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
There is currently a bug in the bit masking for the barrel shift
instructions because the bit mask is not including all of the
register bits which must be zero. With this patch, the disassembler
can be sure that the 32-bit value is indeed a barrel shift instruction
and not a data value in memory.
This fix can be verified by assembling and disassembling the following:
.text
.long 0x65005f5f
With this patch, the bug is fixed, and the objdump will know that
0x65005f5f is not a barrel shift instruction.
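Why a too-narrow mask misidentifies data can be shown with a mask-and-match sketch. All constants below are illustrative assumptions about the barrel-shift encoding, not the values used in the opcodes table; the point is only that a mask which omits must-be-zero bits accepts the word, while a stricter one rejects it.

```python
# All values are assumptions for illustration, not taken from binutils.
BSEFI_MATCH = 0x64004000   # assumed: major opcode plus the "extract" flag
LOOSE_MASK  = 0xFC00C000   # assumed: checks opcode and the two flag bits only
STRICT_MASK = 0xFC00F820   # assumed: also checks bits that must be zero

def matches(word, mask, match):
    """Classic disassembler test: the masked word must equal the pattern."""
    return (word & mask) == match

data_word = 0x65005F5F     # the .long from the test above
# Hypothetical well-formed encoding: rD = 8, width field = 5, shift field = 3.
valid_word = 0x64004000 | (8 << 21) | (5 << 6) | 3
```

Here `matches(data_word, LOOSE_MASK, BSEFI_MATCH)` succeeds even though the word has non-zero bits in positions the strict mask requires to be clear, whereas the strict mask accepts only `valid_word`.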
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
According to commit 51498ab9ab, the q extension has been allowed for rv32
only since version 2.2. Therefore, make sure the version of q is at least
2.2, so that the new extension conflict does not break toolchain regression
runs that are built with the old -misa-spec.
gas/
* testsuite/gas/riscv/zfa-zvfh.d: Set q to v2.2.
* testsuite/gas/riscv/zfa.d: Likewise.
This patch adds new gas tests for the
microblaze bsefi and bsifi instructions.
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
Since RV32E and RV64E are now ratified, this commit prepares the ABI
support for LP64E (LP64 with reduced GPRs).
gas/ChangeLog:
* config/tc-riscv.c (riscv_set_abi_by_arch): Update the error
message. (md_parse_option): Accept "lp64e".
* doc/c-riscv.texi: Update the documentation to allow "lp64e".
* testsuite/gas/riscv/mabi-fail-rv32e-lp64f.l:
Change error message.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64d.l: Likewise.
* testsuite/gas/riscv/mabi-fail-rv32e-lp64q.l: Likewise.
Since RV32E *and* RV64E are ratified, RV64E is no longer invalid.
This commit removes a restriction that prevents making base ISA with
reduced GPRs with XLEN > 32.
bfd/ChangeLog:
* elfxx-riscv.c (riscv_parse_check_conflicts): Remove RV64E
conflict since the ratified 'E' base ISAs include RV64E.
gas/ChangeLog:
* testsuite/gas/riscv/march-fail-base-02.d: Removed.
* testsuite/gas/riscv/march-fail-base-02.l: Removed.
This patch adds new bsefi and bsifi instructions.
BSEFI- The instruction shall extract a bit field from a
register and place it right-adjusted in the destination register.
The other bits in the destination register shall be set to zero.
BSIFI- The instruction shall insert a right-adjusted bit field
from a register at another position in the destination register.
The rest of the bits in the destination register shall be unchanged.
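The two descriptions above can be modelled in a few lines. This is a sketch of the described semantics on 32-bit values only: the parameter names `width` and `shift` are assumptions, bit 0 is taken to be the least significant bit, and the way the immediates are encoded in the instruction word is not modelled.

```python
MASK32 = 0xFFFFFFFF

def bsefi(ra, width, shift):
    """Extract `width' bits of ra starting at bit `shift', right-adjusted."""
    field_mask = (1 << width) - 1
    # Every bit of the result outside the field is zero.
    return (ra >> shift) & field_mask

def bsifi(rd, ra, width, shift):
    """Insert the low `width' bits of ra into rd at bit `shift'."""
    field_mask = ((1 << width) - 1) << shift
    inserted = (ra << shift) & field_mask
    # Bits of rd outside the field are left unchanged.
    return ((rd & ~field_mask) | inserted) & MASK32
```

For example, `bsefi(0xABCD1234, 8, 8)` picks out the byte `0x12`, and `bsifi(0xFFFFFFFF, 0, 8, 8)` clears only that byte of the destination.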
Further documentation of these instructions can be found here:
https://docs.xilinx.com/v/u/en-US/ug984-vivado-microblaze-ref
With version 6 of the patch, no new relocation types are added as
this was unnecessary for adding the bsefi and bsifi instructions.
FIXED: Segfault caused by incorrect termination of microblaze_opcodes.
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Ibai Erkiaga <ibai.erkiaga-elorza@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
For instructions that use R_LARCH_B16/B21, if the immediate overflows,
add a B instruction with an R_LARCH_B26 relocation.
For example:
.L1
...
blt $t0, $t1, .L1
R_LARCH_B16
change to:
.L1
...
bge $t0, $t1, .L2
b .L1
R_LARCH_B26
.L2
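The transformation above can be sketched as a small function. This is a simplified model, not the assembler's relaxation code: the helper names, the `skip_label` parameter and the string-based instruction format are assumptions. It assumes branch immediates are signed and scaled by 4, and that each conditional branch has the inverse shown in the table.

```python
# Inverse conditions used when the branch has to be flipped.
INVERSE = {"beq": "bne", "bne": "beq", "blt": "bge", "bge": "blt",
           "bltu": "bgeu", "bgeu": "bltu", "beqz": "bnez", "bnez": "beqz"}

def fits_branch(offset, imm_bits):
    """True if a byte offset fits a signed imm_bits-bit immediate scaled by 4."""
    limit = 1 << (imm_bits + 1)   # 2**(imm_bits - 1) words, times 4 bytes
    return offset % 4 == 0 and -limit <= offset < limit

def relax(op, regs, target, offset, imm_bits=16, skip_label=".L2"):
    """Return the instruction sequence for a conditional branch to target."""
    if fits_branch(offset, imm_bits):
        return [f"{op} {regs}, {target}"]
    if not fits_branch(offset, 26):
        raise ValueError("target out of range even for b")
    # Invert the condition to hop over an unconditional long branch.
    return [f"{INVERSE[op]} {regs}, {skip_label}",
            f"b {target}",
            f"{skip_label}:"]
```

A short backward branch is left alone, while `relax("blt", "$t0, $t1", ".L1", -(1 << 20))` produces the bge/b pair shown in the example.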
Some older kernels cannot handle the newly generated R_LARCH_32/64_PCREL,
so the assembler generates R_LARCH_ADD32/64+R_LARCH_SUB32/64 by default.
Use the assembler option mthin-add-sub to generate R_LARCH_32/64_PCREL
as much as possible.
The mthin-add-sub option does not affect the generation of the
R_LARCH_32_PCREL relocation in .eh_frame.
This patch adds new bsefi and bsifi instructions.
BSEFI- The instruction shall extract a bit field from a
register and place it right-adjusted in the destination register.
The other bits in the destination register shall be set to zero.
BSIFI- The instruction shall insert a right-adjusted bit field
from a register at another position in the destination register.
The rest of the bits in the destination register shall be unchanged.
Further documentation of these instructions can be found here:
https://docs.xilinx.com/v/u/en-US/ug984-vivado-microblaze-ref
This patch has been tested for years of AMD Xilinx Yocto
releases as part of the following patch set:
https://github.com/Xilinx/meta-xilinx/tree/master/meta-microblaze/recipes-devtools/binutils/binutils
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Ibai Erkiaga <ibai.erkiaga-elorza@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
The range check should be checking for the range
ffffffff80000000..7fffffff, not ffffffff70000000.
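The intended check can be written down as a sketch, assuming the value is held as a 64-bit two's-complement bit pattern. The function name `fits_signed32` is hypothetical and this is not the code from the patch.

```python
def fits_signed32(value):
    """True if a 64-bit bit pattern is a sign-extended 32-bit value."""
    value &= 0xFFFFFFFFFFFFFFFF
    # Non-negative values end at 0x7fffffff; negative values must lie in
    # 0xffffffff80000000 .. 0xffffffffffffffff.
    return value <= 0x7FFFFFFF or value >= 0xFFFFFFFF80000000
```

With the lower bound written as ffffffff70000000, values such as ffffffff70000000 itself would be accepted even though they do not fit in 32 signed bits.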
This patch has been tested for years of AMD Xilinx Yocto
releases as part of the following patch set:
https://github.com/Xilinx/meta-xilinx/tree/master/meta-microblaze/recipes-devtools/binutils/binutils
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
Signed-off-by: Michael J. Eager <eager@eagercon.com>
Update the usage output for MicroBlaze-specific assembler options
to include new entries.
This patch has been tested for years of AMD Xilinx Yocto
releases as part of the following patch set:
https://github.com/Xilinx/meta-xilinx/tree/master/meta-microblaze/recipes-devtools/binutils/binutils
Signed-off-by: nagaraju <nagaraju.mekala@amd.com>
Signed-off-by: Neal Frager <neal.frager@amd.com>
---
V1->V2:
- removed new options which were unnecessary
- added documentation for MicroBlaze specific options
Signed-off-by: Michael J. Eager <eager@eagercon.com>
AVX-* features / insns paralleling earlier introduced AVX512* ones can
be encoded more compactly when the respective feature was explicitly
enabled by the user.
Apparently from its introduction the variable was only ever written (the
only read is merely to determine whether to write it with another value).
(Since, due to the need to re-indent, the adjacent lines setting
cpu_arch_tune need touching anyway, switch to using PROCESSOR_*
constants where applicable, to make more obvious what the resulting
state is going to be.)
These may not be set from a value derived from cpu_arch_flags: That
starts with (almost) all functionality enabled, while cpu_arch_isa_flags
is supposed to track features that were explicitly enabled (and perhaps
later disabled) by the user.
To avoid needing to do any such adjustment in two places (each),
introduce helper functions used by both command line handling and
directive processing.
Following the folding of some generic AVX/AVX2 templates with their
AVX512F counterpart ones, do this for FMA ones as well, requiring one
further adjustment to cpu_flags_match().
In anticipation of APX introduce logic to reduce the number of templates
we have now, which somewhat limits the number of ones we will then need
to gain.
The fundamental requirements are that
- attributes be compatible, which specifically means VexW needs to be
the same in the templates (which often isn't the case, for VEX
encodings having far more WIG than EVEX ones),
- the EVEX form being AVX512F (with or without AVX512VL), not any of its
extensions (the same will then be required for APX - it'll need to be
APX_F).
Note that in check_register() there's now a redundant zmm check. Since
this logic will need revisiting for APX anyway, I'd like to keep it that
way for now. (Similarly a couple of if()-s which could be folded are
kept separate, to reduce code churn when adding APX support.)
SAE / embedded rounding are invalid when there is a memory operand, as
the bit encoding this specifies broadcast in that case.
Broadcast needs to be specified on the memory operand.
PR gas/30856
In 5cc007751c ("x86: further adjust extend-to-32bit-address
conditions") I neglected the case of PUSH, which is the only insn
allowing (proper) symbol addresses to be used as immediates (not
displacements, like CALL/JMP) in the absence of any register operands.
Since it defaults to 64-bit operand size, guessing an L suffix is wrong
there.
Add a macro pcaddi instruction to support "pcaddi rd, symbol".
pcaddi has a 20-bit signed immediate, so it can address a +/- 2MB pc-relative
range, and the address should be 4-byte aligned.
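The range and alignment constraints can be sketched as follows, assuming the immediate is the pc-relative byte offset divided by 4. The helper name `pcaddi_imm` is hypothetical and this is not the assembler's implementation.

```python
def pcaddi_imm(pc, target):
    """Return the 20-bit immediate field for `pcaddi rd, target' at pc."""
    offset = target - pc
    if offset % 4 != 0:
        raise ValueError("offset is not a multiple of 4")
    imm = offset >> 2
    # A 20-bit signed field holds -2**19 .. 2**19 - 1 words (+/- 2MB).
    if not -(1 << 19) <= imm < (1 << 19):
        raise ValueError("target out of +/- 2MB range")
    return imm & 0xFFFFF   # two's-complement encoding of the field
```

A target 16 bytes ahead encodes as 4, and a target 4 bytes behind encodes as the all-ones field.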
The AArch64 feature-flag code is currently limited to a maximum
of 64 features. This patch reworks it so that the limit can be
increased more easily. The basic idea is:
(1) Turn the ARM_FEATURE_FOO macros into an enum, with the enum
counting bit positions.
(2) Make the feature-list macros take an array index argument
(currently always 0). The macros then return the
aarch64_feature_set contents for that array index.
An N-element array would then be initialised as:
{ MACRO (0), ..., MACRO (N - 1) }
(3) Provide convenience macros for initialising an
aarch64_feature_set for:
- a single feature
- a list of individual features
- an architecture version
- an architecture version + a list of additional features
(2) and (3) use the preprocessor to generate static initialisers.
The main restriction was that uses of the same preprocessor macro
cannot be nested. So if a macro wants to do something for N individual
arguments, it needs to use a chain of N macros to do it. There then
needs to be a way of deriving N, as a preprocessor token suitable for
pasting.
The easiest way of doing that was to precede each list of features
by the number of features in the list. So an aarch64_feature_set
initialiser for three features A, B and C would be written:
AARCH64_FEATURES (3, A, B, C)
This scheme makes it difficult to keep AARCH64_FEATURE_CRYPTO as a
synonym for SHA2+AES, so the patch expands the former to the latter.
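The data layout behind steps (1) and (2) can be modelled outside the preprocessor. The sketch below is only an analogue in Python: the `Feature` members, their bit positions and the helper names are made up, and the macro-chaining trick used to count arguments is not reproduced.

```python
from enum import IntEnum

class Feature(IntEnum):
    """Hypothetical subset; each value is a bit position, as in step (1)."""
    V8 = 0
    SHA2 = 1
    AES = 2
    FUTURE = 70            # made-up feature beyond the first 64 bits

WORD_BITS = 64

def feature_word(features, index):
    """Contents of array element `index', the analogue of MACRO (index)."""
    word = 0
    for f in features:
        if f // WORD_BITS == index:
            word |= 1 << (f % WORD_BITS)
    return word

def feature_set(features, n_words):
    """Analogue of the initialiser { MACRO (0), ..., MACRO (N - 1) }."""
    return [feature_word(features, i) for i in range(n_words)]

# CRYPTO written out as its two constituent features.
CRYPTO = (Feature.SHA2, Feature.AES)
```

Raising the feature limit in this model only means passing a larger `n_words`; a feature at bit 70 simply lands in the second array element.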
The new Synopsys ARCv3 ISA has an instruction format similar to that of
the old ARCv1 and ARCv2 ISAs. Thus, the ARCv3 addition uses
whatever we have for old ARC processors plus some ARCv3-specific mods.
To distinguish between various ARC variants, we introduced two new
configure defines named TARGET_ARCv3_32 and TARGET_ARCv3_64 which are
set when we choose either an ARC32 (ARCv3/32) ISA toolchain or an
ARC64 (ARCv3/64) ISA toolchain.
gas/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* gas/config/tc-arc.h: Selectively define default target macros.
* gas/configure.ac: Add ARC64 target.
* gas/configure.tgt: Likewise.
* gas/configure: Regenerate.
* gas/config.in: Regenerate.
* gas/config/tc-arc.c (DEFAULT_ARCH): New macro.
(default_arch): New variable.
(md_pseudo_table): Add xword.
(md_shortopts): Only a few options are recognized by the new ARC64
assembler.
(md_longopts): Likewise.
(ARC_CPU_TYPE_A64x): New define.
(ARC_CPU_TYPE_A32x): Likewise.
(cpu_type): New arch field.
(selected_cpu): Update fields.
(arc_opcode_hash_entry_iterator_init): Formatting.
(arc_opcode_hash_entry_iterator_next): Likewise.
(arc_select_cpu): Likewise.
(arc_option): Likewise.
(check_cpu_feature): Likewise.
(debug_exp): Recognize new expression operands.
(parse_reloc_symbol): Parse new signed/unsigned cases.
(parse_opcode_flags): Update for the case when the flags need
insert/extract functions.
(find_opcode_match): Match new signed/unsigned 32-bit immediates.
(autodetect_attributes): PLT34 only available for ARC64.
(md_assemble): Extend match characters.
(declare_fp_set): New function.
(init_default_arch): Likewise.
(md_begin): Detect and initialize the correct CPU and corresponding
registers.
(md_pcrel_from_section): Add new relocs.
(arc_target_format): New function.
(md_apply_fix): Add new relocs.
(md_parse_option): Update options.
(arc_show_cpu_list): Update with ARC64 cpus.
(md_show_usage): Update messages.
(may_relax_expr): Add PLT34 case.
(assemble_insn): Update for ARC64.
(arc_make_nops): New function.
(arc_handle_align): Refurbish this function, use arc_make_nops.
(tc_arc_fix_adjustable): Update messages.
Signed-off-by: Claudiu Zissulescu <claziss@synopsys.com>