Commit Graph

6675 Commits

Author SHA1 Message Date
Richard Earnshaw
01613db787 arm: remove FPA instructions from assembler
These can no-longer be generated as the options to reach them have now
gone.  So remove the parsing support for FPA instructions.
2024-06-05 17:45:45 +01:00
Richard Earnshaw
44d174bcb6 arm: remove options to select the FPA
Remove the command-line options to choose the FPA (or FPE - an
emulated FPA).  From this point on it should be impossible to assemble
the old FPA instructions.
2024-06-05 17:45:45 +01:00
Richard Earnshaw
c27b73c525 arm: change default FPUs from FPA to none
Change the cases where the default FPU was FPA to none.  This should
ensure that any code that used settings to pick the floating-point
order will not silently produce a different output.  The options that
explicitly set the FPA remain for the moment.
2024-06-05 17:45:45 +01:00
Richard Earnshaw
e8cf93739c arm: redirect fp constant data directives through a wrapper
Assembler directives such as .float, or .double are handled by generic
code, but on Arm, their output can vary depeding on the type of FPU
begin targetted.  When we remove FPA support we don't want to silently
generate different code for processors that previously defaulted to
the FPA, so redirect these directives through a wrapper function that
checks the FPU is enabled; we use the legacy -mno-fpu in the test to
catch this.

Also fix a few tests so that they won't start to fail on targets (eg
arm-wince-pe) where there is no default format for the FPU and we pick
this from the default processor type.
2024-06-05 17:45:45 +01:00
Richard Earnshaw
be9943151a arm: adjust FPU selection logic
The logic here seems to be overly complex, so simplify it a bit.  One
particular problem was that using the legacy -mno-fpu option was not
working properly, as this has all the feature bits set to zero causing
the code to then pick a different FPU as the default.  Fix this by
only selecting an FPU as a fallback if the code has not otherwise
selected one: there was only one route by which this could happen.

This patch is really a pre-cursor to the following one where we want
to make no-fpu internally a fall-back position for some legacy
processors where previously we would have dropped back to the FPA.
2024-06-05 17:45:45 +01:00
Richard Earnshaw
d3a79e2833 arm: default to softvfp on armv6 or later cores
From armv6 onwards a lot of cores started to come with a physical VFP
implementation; but many still did not and in some cases there are
both variants.  For the cores that lacked a physical VFP we would fall
back to FPU_NONE if the platform/ABI did not mandate something else.
To make matters worse, FPU_NONE is internal state used to imply
soft-fpa (ie a mixed-endian double format), so any use of .double in
hand-written assembly is almost certainly generating incorrect output.

That's undesirable, all these cores should really default to a softvfp
model.
2024-06-05 17:45:45 +01:00
Richard Earnshaw
51c2c0f62b arm: rename FPU_ARCH_VFP to FPU_ARCH_SOFTVFP
FPU_ARCH_VFP has always meant VFP floating-point format (natural FP
word order) but without any VFP instructions.  But the name
FPU_ARCH_VFP is potentially confusing.  This patch just changes the
name to make the meaning clearer.
2024-06-05 17:45:45 +01:00
Mary Bennett
b0f266f38b RISC-V: Add support for XCVbi extension in CV32E40P
Spec: https://docs.openhwgroup.org/projects/cv32e40p-user-manual/en/latest/instruction_set_extensions.html

Contributors:
  Mary Bennett <mary.bennett682@gmail.com>
  Nandni Jamnadas <nandni.jamnadas@embecosm.com>
  Pietra Ferreira <pietra.ferreira@embecosm.com>
  Charlie Keaney
  Jessica Mills
  Craig Blackmore <craig.blackmore@embecosm.com>
  Simon Cook <simon.cook@embecosm.com>
  Jeremy Bennett <jeremy.bennett@embecosm.com>
  Helene Chelin <helene.chelin@embecosm.com>
  Nazareno Bruschi <nazareno.bruschi@embecosm.com>
  Lin Sinan

include/ChangeLog:

	* opcode/riscv-opc.h: Add corresponding MATCH and MASK
	macros for XCVbi.
	* opcode/riscv.h: Add corresponding EXTRACT and ENCODE macros
	for XCVbi.
	(enum riscv_insn_class): Add the XCVbi instruction class.

gas/ChangeLog:

	* config/tc-riscv.c (validate_riscv_insn): Add the necessary
	operands for the extension.
	(riscv_ip): Likewise.
	* doc/c-riscv.texi: Note XCVbi as an additional ISA extension
	for CORE-V.
	* testsuite/gas/riscv/cv-bi-beqimm.d: New test.
	* testsuite/gas/riscv/cv-bi-beqimm.s: New test.
	* testsuite/gas/riscv/cv-bi-bneimm.d: New test.
	* testsuite/gas/riscv/cv-bi-bneimm.s: New test.
	* testsuite/gas/riscv/cv-bi-fail-march.d: New test.
	* testsuite/gas/riscv/cv-bi-fail-march.l: New test.
	* testsuite/gas/riscv/cv-bi-fail-march.s: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-01.d: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-01.l: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-01.s: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-02.d: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-02.l: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-02.s: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-03.d: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-03.l: New test.
	* testsuite/gas/riscv/cv-bi-fail-operand-03.s: New test.
	* testsuite/gas/riscv/march-help.l: Add xcvbi string.

include/ChangeLog:

	* opcode/riscv-opc.h: Add corresponding MATCH and MASK
	macros for XCVbi.
	* opcode/riscv.h: Add corresponding EXTRACT and ENCODE macros
	for XCVbi.
	(enum riscv_insn_class): Add the XCVbi instruction class.

opcodes/ChangeLog:

	* riscv-dis.c (print_insn_args): Add disassembly for new operand.
	* riscv-opc.c: Add XCVbi instructions.
2024-06-05 18:09:22 +08:00
mengqinggang
5f4fa40e4d LoongArch: Make align symbol be in same section with alignment directive
R_LARCH_ALIGN (psABI v2.30) requires a symbol index. The symbol is only
created at the first time to handle alignment directive. This means that
all other sections may use this symbol. If the section of this symbol is
discarded, there may be problems. Search it in its own section.

Remove elf_backend_data.is_rela_normal() function added at commit daeda14191.

Co-authored-by: Jinyang He <hejinyang@loongson.cn>
Reported-by: WANG Xuerui <git@xen0n.name>
Link: https://lore.kernel.org/loongarch/2abbb633-a10e-71cc-a5e1-4d9e39074066@loongson.cn/T/#t
2024-06-04 19:47:20 +08:00
Jan Beulich
939c703789 x86: reduce check_{byte,word,long,qword}_reg() overhead
These run after template matching. Therefore it is quite pointless for
them to check all operands, when operand sizes matching across operands
is already known. Exit the loops early in such cases.

In check_byte_reg() also drop a long-stale part of a comment.
2024-05-31 10:22:50 +02:00
Jan Beulich
b83021de7a x86/Intel: warn about undue mnemonic suffixes
Except for very few insns mnemonic suffixes aren't permitted in Intel
syntax. Warn about such for now, indicating that they will be outright
refused down the road.

While fiddling with testcases to address fallout, drop a few things
which should never have been tested as valid Intel syntax.

Also add a previously missing line to simd-suffix.d.
2024-05-29 10:03:00 +02:00
Jan Beulich
6ccf16c19d x86/Intel: SHLD/SHRD have dual meaning
Since we uniformly permit D suffixes in Intel mode whenever in AT&T mode
an L suffix may be used, we need to be consistent with this.

Take the easy route, despite that still leading to an anomaly which is
also visible from the new testcase:

	shld	eax, ecx, 1
	shld	eax, ecx, cl

can mean two things with APX: SHL with a D suffix in NDD EVEX encoding,
or the traditional SHLD in legacy encoding.
2024-05-29 10:02:01 +02:00
Alan Modra
7574c0c2b3 PR31796, Internal error in write_function_pdata at obj-coff-seh
PR31796 is the result of lack of aarch64 support in obj-coff-seh.c.
Nick fixed this with commit 73c8603c3f.  Make the seh support
consistently warn in future if some archictecture is missing, rather
than giving internal errors.

	PR 31796
	* config/obj-coff-seh.c (verify_target): New function.
	(obj_coff_seh_handler, obj_coff_seh_endproc, obj_coff_seh_proc),
	(obj_coff_seh_endprologue): Use it.
2024-05-29 10:28:22 +09:30
saurabh.jha@arm.com
444c60fe33 gas, aarch64: Add SVE2 lut extension
Introduces instructions for the SVE2 lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en

These instructions use new SVE2 vector operands. They are called
SVE_Zm1_23_INDEX, SVE_Zm2_22_INDEX, and Zm3_12_INDEX and they have
1 bit, 2 bit, and 3 bit indices respectively.

The lsb and width of these new operands are the same as many existing
operands but the convention is to give different names to fields that
serve different purpose so we introduced new fields in aarch64-opc.c
and aarch64-opc.h.

We made a design choice for the second operand of the halfword variant of
luti4 with two register tables. We could have either defined a new operand,
like SVE_Znx2, or we could have use the existing operand SVE_ZnxN. With
the new operand, we would need to implement constraints on register
lists based on either operand or opcode flag. With existing operand, we
could just existing constraint checks using opcode flag. We chose
the second approach and went with SVE_ZnxN and added opcode flag to
enforce lengths of vector register list operands. This way, we can reuse
the existing constraint check logic.
2024-05-28 17:28:29 +01:00
saurabh.jha@arm.com
c3bb4211d9 gas, aarch64: Add AdvSIMD lut extension
Introduces instructions for the Advanced SIMD lut extension for AArch64. They are documented in the following links:
* luti2: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI2--Lookup-table-read-with-2-bit-indices-?lang=en
* luti4: https://developer.arm.com/documentation/ddi0602/2024-03/SIMD-FP-Instructions/LUTI4--Lookup-table-read-with-4-bit-indices-?lang=en

These instructions needed definition of some new operands. We will first
discuss operands for the third operand of the instructions and then
discuss a vector register list operand needed for the second operand.

The third operands are vectors with bit indices and without type
qualifiers. They are called Em_INDEX1_14, Em_INDEX2_13, and Em_INDEX3_12
and they have 1 bit, 2 bit, and 3 bit indices respectively. For these
new operands, we defined new parsing case branch. The lsb and width of
these operands are the same as many existing but the convention is to
give different names to fields that serve different purpose so we
introduced new fields in aarch64-opc.c and aarch64-opc.h for these new
operands.

For the second operand of these instructions, we introduced a new
operand called LVn_LUT. This represents a vector register list with
stride 1. We defined new inserter and extractor for this new operand and
it is encoded in FLD_Rn. We are enforcing the number of registers in the
reglist using opcode flag rather than operand flag as this is what other
SIMD vector register list operands are doing. The disassembly also uses
opcode flag to print the correct number of registers.
2024-05-28 17:28:29 +01:00
Nick Clifton
73c8603c3f Fix: internal error in write_function_pdata at obj-coff-seh
PR 31796
2024-05-28 16:30:14 +01:00
Jan Beulich
6b15ec5165 x86: simplify VexVVVV_SRC2 handling for the XOP case
As already suggested during review, rather than having an extra
conditional in build_modrm_byte() (a code path used for quite a few
more insns, including even certain GPR ones), adjust the attribute in
the installed template to properly describe things with operands
swapped.
2024-05-24 12:21:57 +02:00
Jan Beulich
fb40ea39de x86: simplify / consolidate check_{word,long,qword}_reg()
These run after template matching. Therefore operands are already known
to match the template in use. With the loop bodies skipping anything not
a GPR in the actual operands, there's therefore no need to check the
template's operand type for permitting Reg or Accum.

At the same time bring the three functions in sync for the "byte" part
of the logic, as far as checking the template for other sizes (qword
specifically) goes. Plus drop a stale comment from check_qword_reg(),
when all three are now behaving the same in this regard.
2024-05-24 11:51:21 +02:00
Jan Beulich
acd86c81f0 x86: correct VCVT{,U}SI2SD
Properly reject inappropriate suffixes (No_lSuf / No_qSuf mistakenly
omitted by cf665fee1d ["x86: re-work AVX512 embedded rounding / SAE"]),
to avoid emitting bad or arbitrarily guessed instructions. Interestingly
check_{long,qword}_suffix() don't help here, which perhaps is another
indication that the way they work right now isn't quite appropriate.

Sadly correcting just the templates breaks operand ambiguity detection,
since so far that worked from a single template permitting more than one
suffix. Here we have ambiguity though which can now be noticed only when
taking all (matching) templates together. Therefore we need to determine
further matching templates (see code comments for constraints), to then
accumulate permitted suffixes across all of them.
2024-05-24 11:50:38 +02:00
Cui, Lili
bbe8d019ed Support APX zero-upper
This patch is to enable ZU for IMUL (opcodes 0x69 and 0x6B) and SETcc.
Since the spec only recommends one form of setzu, I won't be adding
set<cc>reg32/reg64 support in this patch.

gas/ChangeLog:

        * config/tc-i386.c (build_apx_evex_prefix): Handle ZU.
        * testsuite/gas/i386/x86-64.exp: Added new tests for ZU.
        * testsuite/gas/i386/x86-64.exp: Added new tests for ZU.
        * testsuite/gas/i386/x86-64-apx-zu-intel.d: New test.
        * testsuite/gas/i386/x86-64-apx-zu-inval.l: Ditto.
        * testsuite/gas/i386/x86-64-apx-zu-inval.s: Ditto.
        * testsuite/gas/i386/x86-64-apx-zu.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-zu.s: Ditto.

opcodes/ChangeLog:

        * i386-dis-evex-prefix.h: Handle PREFIX_EVEX_MAP4_40 ~
        PREFIX_EVEX_MAP4_4F.
        * i386-dis-evex.h: Ditto.
        * i386-dis.c (struct dis386): Add new micro 'ZU'.
        (putop): Handle %ZU.
        * i386-gen.c: Added ZU.
        * i386-opc.h: Ditto.
        * i386-opc.tbl: Added new templates to support ZU.
2024-05-22 16:15:47 +08:00
Cui, Lili
b757e3c1ac X86: Remove "i.rex" to eliminate extra conditional branch
Resulting code will do better without the extra conditional branch.
Remove "i.rex" to eliminate extra conditional branch.

gas/ChangeLog:

        * config/tc-i386.c (establish_rex): Remove i.rex.
2024-05-22 15:26:34 +08:00
Cui, Lili
9f8b42c871 Add check for 8-bit old registers in EVEX format
Since APX supports EVEX from legacy instructions, we need to check
the 8-bit old registers in EVEX format. And add test cases for it.

gas/ChangeLog:

        * config/tc-i386.c (md_assemble): Add invalid check for old byte
        registers in EVEX format.
        * testsuite/gas/i386/x86-64-apx-inval.l: Add new test.
        * testsuite/gas/i386/x86-64-apx-inval.s: Ditto.
2024-05-22 10:16:14 +08:00
Cui, Lili
3a8ecbdade x86: Split REX/REX2 old registers judgment.
Split "REX/REX2 old register checking" and "add empty rex prefix"
into two separate branches.

gas/ChangeLog:

        * config/tc-i386.c (establish_rex): Split the judgments.
2024-05-22 09:18:38 +08:00
Jan Beulich
c924b84f20 gas: drop remnants of ia64-*-aix*
With BFD not supporting that target anymore, GAS can't possibly support
it either.
2024-05-21 13:43:08 +02:00
Sung-hun Kim
3ce12d6d4a RISC-V: PR31733, Change initial CFI operation from DW_CFA_def_cfa_register to DW_CFA_def_cfa
The DWARF specification (especially, DWARF4 and 5 [1,2]) states that
DW_CFA_def_cfa_register cannot be used as the first CFI operation.
It said DW_CFA_def_cfa_register as follows:

  ... This operation is valid only if the current CFA rule is defined
  to use a register and offset.

So, DW_CFA_def_cfa_register can be used after that other definition
operation such as DW_CFA_def_cfa is called. However, the current gas
code emits DW_CFA_def_cfa_register as an initial CFI operation for RISCV.

In the libgcc, the unwinding function does not care about it, so it can
unwind the call stack. However, on the third party library such as
libunwindstack in Android, it causes a fatal error.

This patch changes the initial CFI operation to DW_CFA_def_cfa with
offset 0. It works as same as the previous one, but it does not have
any limitation so it satisfies the DWARF spec. This change resolves
the compatibility issue while preserving the original behaviour.

[1] DWARF4 specification, https://dwarfstd.org/doc/DWARF4.pdf
[2] DWARF5 specification, https://dwarfstd.org/doc/DWARF5.pdf

Signed-off-by: Sung-hun Kim <sfoon.kim@samsung.com>
Reviewed-By: Andrew Burgess <aburgess@redhat.com>
Approved-By: Nelson Chu <nelson@rivosinc.com>

gas/
	PR 31733
	config/tc-riscv.c (riscv_cfi_frame_initial_instructions): Use
	DW_CFA_def_cfa rather than DW_CFA_def_cfa_register.
2024-05-20 10:50:20 +08:00
mengqinggang
de203ed568 LoongArch: gas: Adjust DWARF CIE alignment factors
Set DWARF2_CIE_DATA_ALIGNMENT (data alignment factors) to -8.
It helps to save space.

Data Alignment Factor
A signed LEB128 encoded value that is factored out of all offset
instructions that are associated with this CIE or its FDEs. This value
shall be multiplied by the register offset argument of an offset
instruction to obtain the new offset value.
2024-05-17 15:38:29 +08:00
Victor Do Nascimento
7d0383ad39 aarch64: fp8 convert and scale - add feature flags and related structures 2024-05-16 13:22:30 +01:00
Richard Earnshaw
7e544ad81a arm: remove incorrect handling of FP bignums in move_or_literal_pool
This hunk of code in move_or_literal_pool just looks wrong, but I
can't find a testcase that will tickle it to prove it.  It looks a bit
like it was intended to catch cases where a bignum contained a
floating-point value, but there were a number of problems with it.

- It tested X_add_number == -1, but an FP bignum is indicated by any
  value <= 0.
- It converted the floating-point value to extended precision, but
  that's not used on Arm beyond the legacy FPA code.  No attempt was
  made to match the FP value to the intended memory/mov operation.

Since I can't construct a viable testcase, I've just removed the existing
code and made the function error out in this case: this seems more sensible
than generating wrong code or trying to write something more complex that
can't be tested anyway.
2024-05-16 11:10:15 +01:00
Andrew Carlotti
53071aac47 aarch64: Add sysreg features to +d128 dependencies
We should revisit sysreg feature enablement and dependencies in future, but
this change should help until then.

OK for master?
2024-05-15 12:29:22 +01:00
Andrew Carlotti
16b4196349 aarch64: Add simd dependency to +sha2
This matches the existing behaviour in GCC and LLVM, and also the current
documentation.

OK for master?
2024-05-15 12:27:09 +01:00
Richard Earnshaw
58bd8d56c7 arm: remove Maverick support from the assembler.
Delete all the Maverick instructions and register handling from the
assembler.  We continue to recognize -mcpu=ep9312, but treat it as an
alias for arm920t.  We no-longer recognize -mfpu=maverick.
2024-05-14 10:56:58 +01:00
Cui, Lili
c8866e3ec5 x86: Drop using extension_opcode to encode vvvv register
gas/ChangeLog:

        * config/tc-i386.c (build_modrm_byte): Dropped the use of
        extension_opcode to encode the vvvv register.
        * testsuite/gas/i386/x86-64-sse2avx.d: Added new testcases.
        * testsuite/gas/i386/x86-64-sse2avx.s: Diito.

opcodes/ChangeLog:

        * i386-opc.tbl: Added DstVVVV to some extension_opcode instructions.
	* i386-tbl.h: Regenerated.
2024-05-06 18:33:45 +08:00
Cui, Lili
0820c9f5fc x86: Drop SwapSources
gas/ChangeLog:

        * config/tc-i386.c (build_modrm_byte): Dropped the use of
	SWAP_SOURCES to encode the vvvv register.

opcodes/ChangeLog:

        * i386-opc.h (SWAP_SOURCES): Dropped.
        (NO_DEFAULT_MASK): Adjusted the value.
        (ADDR_PREFIX_OP_REG): Ditto.
        (DISTINCT_DEST): Ditto.
        (IMPLICIT_STACK_OP): Ditto.
        (VexVVVV_SRC2): New.
        * i386-opc.tbl: Dropped SwapSources and replaced its VexVVVV
	with Src1VVVV.
	* i386-tbl.h: Regenerated.
2024-05-06 18:21:28 +08:00
Cui, Lili
f2a3a8814d x86: Use vexvvvv as the switch state to encode the vvvv register
Use vexvvvv as the switch state, and replace VexVVVV with Src1VVVV.
Src1VVVV means using VEX.vvvv encodes the first source register
operand. The old logic did not check vexvvvv first, which made the
logic here very complicated.

gas/ChangeLog:

        * config/tc-i386.c (optimize_encoding): Replaced 1 with Src1VVVV.
        (build_modrm_byte): Used vexvvvv to encode the vvvv register.
        (s_insn): Replaced 1 with Src1VVVV.

opcodes/ChangeLog:

        * i386-opc.h (VexVVVV_DST): Adjusted the value.
        (Src1VVVV): New.
        * i386-opc.tbl: Replaced part VexVVVV with Src1VVVV.
	* i386-tbl.h: Regenerated.
2024-05-06 18:16:42 +08:00
Jan Beulich
24187fb9c0 x86/APX: extend SSE2AVX coverage
Legacy encoded SIMD insns are converted to AVX ones in that mode. When
eGPR-s are in use, i.e. with APX, convert to AVX10 insns (where
available; there are quite a few which can't be converted).

Note that LDDQU is represented as VMOVDQU32 (and the prior use of the
sse3 template there needs dropping, to get the order right).

Note further that in a few cases, due to the use of templates, AVX512VL
is used when AVX512F would suffice. Since AVX10 is the main reference,
this shouldn't be too much of a problem.
2024-05-03 09:26:25 +02:00
David Faust
dffb4a0784 bpf: fix calculation when deciding to relax branch
In certain cases we were calculating the jump displacement incorrectly
when deciding whether to relax a branch.  This meant for some branches,
such as a very long backwards conditional branch, relaxation was not
done when it should have been.  The result was to error later, because
the actual jump displacement was too large to fit in the original
instruction.

This patch fixes up the displacement calculation so that those branches
are correctly relaxed and no longer result in an error.  In addition, it
changes md_convert_frag to install fixups for the JAL instructions in
the resulting relaxations rather than encoding the displacement value
directly.

gas/
	* config/tc-bpf.c (relaxed_branch_length): Correct displacement
	calculation when relaxing.
	(md_convert_frag): Likewise.  Install fixups for JAL
	instructions resulting from relaxation.
	* testsuite/gas/bpf/jump-relax-ja-be.d: Correct and expand test.
	* testsuite/gas/bpf/jump-relax-ja.d: Likewise.
	* testsuite/gas/bpf/jump-relax-ja.s: Likewise.
	* testsuite/gas/bpf/jump-relax-jump-be.d: Likewise.
	* testsuite/gas/bpf/jump-relax-jump.d: Likewise.
	* testsuite/gas/bpf/jump-relax-jump.s: Likewise.
2024-04-25 13:16:34 -07:00
Jinyang He
035068a0cd LoongArch: gas: Simplify relocations in sections without code flag
Gas should not emit ADD/SUB relocation pairs for label differences
if they are in the same section without code flag even relax enabled.
Because the real value is not be affected by relaxation and it can be
compute out in assembly stage. Thus, correct the `TC_FORCE_RELOCATION
_SUB_SAME` and the label differences in same section without code
flag can be resolved in fixup_segment().
2024-04-25 11:53:15 +08:00
Claudio Bantaloukas
3e562e4be8 arm: Fix MVE vmla encoding 2024-04-23 17:59:57 +01:00
Cui, Lili
b5247082c4 x86/APX: Add invalid check for APX EVEX.X4.
gas/ChangeLog:

        * config/tc-i386.c (build_apx_evex_prefix): Added invalid check for APX
        X4.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Added invalid
        testcase.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.

opcodes/ChangeLog:

        * i386-dis.c (get_valid_dis386): Added invalid check for APX X4.
2024-04-22 09:25:56 +08:00
mengqinggang
20eee7540b LoongArch: Add -mignore-start-align option
Ignore .align at the start of a section may result in misalignment when
partial linking. Manually add -mignore-start-align option without partial
linking.

Gcc -falign-functions add .align 5 to the start of a section, it causes some
error message mismatch. Set these testcases to xfail on LoongArch target.
2024-04-20 12:10:40 +08:00
H.J. Lu
2d4c39a885 x86: Fix a memory leak in md_assemble
Fix a memory leak in md_assemble where copy may be cleared and may be
the same as copy:

      if (copy && !mnem_suffix)
        {
          line = copy;
          copy = NULL;
  no_match:

	* config/tc-i386.c (md_assemble): Properly free the xstrdup
	memory.
2024-04-16 06:32:33 -07:00
H.J. Lu
bf649e72d3 x86-64: Use long NOPs for Intel Core processors
Use long NOPs for Intel Core processors since they are faster than
multiple NOPs.  Don't use them for 64-bit processors by default since
Intel Atom processors can only decode 4 prefixes in 1 cycle.

	* config/tc-i386.c (alt64_9): New.
	(alt64_10): Likewise.
	(alt64_11): Likewise.
	(alt64_12): Likewise.
	(alt64_13): Likewise.
	(alt64_14): Likewise.
	(alt64_15): Likewise.
	(alt64_patt): Likewise.
	(i386_generate_nops): Use alt64_patt for Intel Core processors
	in 64-bit mode.
	* testsuite/gas/i386/x86-64-nops-1-core2.d: Expect long NOPs.
	* testsuite/gas/i386/x86-64-nops-4-core2.d: Likewise.
	* testsuite/gas/i386/ilp32/x86-64-nops-1-core2.d: Replace
	../x86-64-nops-1.d with ../x86-64-nops-1-core2.d.
	* testsuite/gas/i386/ilp32/x86-64-nops-4-core2.d: Replace
	../x86-64-nops-4.d with ../x86-64-nops-4-core2.d.
2024-04-10 20:24:41 -07:00
Alex Coplan
b3a561abc3 arm: Fix encoding of MVE vqshr[u]n
As it stands, these insns are incorrectly encoded as vqrshr[u]n.
Concretely, the problem can be seen as follows:

$ cat t.s
vqrshrnb.s16 q0,q0,#8
vqshrnb.s16 q0,q0,#8
$ gas/as-new t.s -march=armv8.1-m.main+mve -o t.o
$ binutils/objdump -d t.o -m armv8.1-m.main

t.o:     file format elf32-littlearm

Disassembly of section .text:

00000000 <.text>:
   0:   ee88 0f41       vqrshrnb.s16    q0, q0, #0
   4:   ee88 0f41       vqrshrnb.s16    q0, q0, #0

Here we assemble these two instructions to the same opcode.  The
encoding of the first is the correct, while the encoding of the second
is incorrect, and the bottom bit should be clear, see the Armv8-M ARM:
https://developer.arm.com/documentation/ddi0553/latest/

There is an additional problem here in that the disassembly of the
immediate is incorrect.  llvm-objdump shows the correct disassembly
here:

t.o:    file format elf32-littlearm

Disassembly of section .text:

00000000 <$t>:
       0: ee88 0f41     vqrshrnb.s16    q0, q0, #8
       4: ee88 0f41     vqrshrnb.s16    q0, q0, #8

Note that we defer adding a test for the correct encoding of these insns
until the next patch which fixes the disassembly issue.
2024-04-09 10:09:25 +01:00
Jiawei
9132c8152b RISC-V: Support Zcmp push/pop instructions.
Support zcmp extension push/pop/popret and popret zero instructions.
The `reg_list' is a list containing 1 to 13 registers, we can use:
"{ra}, {ra, s0}, {ra, s0-s1}, {ra, s0-s2} ... {ra, s0-sN}"
to present this feature.

Passed gcc/binutils regressions of riscv-gnu-toolchain.

Most of work was finished by Sinan Lin.
Co-Authored by: Charlie Keaney <charlie.keaney@embecosm.com>
Co-Authored by: Mary Bennett <mary.bennett@embecosm.com>
Co-Authored by: Nandni Jamnadas <nandni.jamnadas@embecosm.com>
Co-Authored by: Sinan Lin <sinan.lin@linux.alibaba.com>
Co-Authored by: Simon Cook <simon.cook@embecosm.com>
Co-Authored by: Shihua Liao <shihua@iscas.ac.cn>
Co-Authored by: Yulong Shi <yulong@iscas.ac.cn>

bfd/ChangeLog:

        * elfxx-riscv.c (riscv_implicit_subset): Imply zca for zcmp.
	(riscv_supported_std_z_ext): Added zcmp with version 1.0.
	(riscv_parse_check_conflicts): Zcmp conflicts with d/zcd.
        (riscv_multi_subset_supports): Handle zcmp.
        (riscv_multi_subset_supports_ext): Ditto.

gas/ChangeLog:

	* NEWS: Updated.
        * config/tc-riscv.c (regno_to_reg_list): New function, used to map
	register to reg_list number.
        (reglist_lookup): Called reglist_lookup_internal.  Return false if
	reg_list number is zero, which is an invalid value.
	(reglist_lookup_internal): Parse register list, and return the last
	register by regno_to_reg_list.
        (validate_riscv_insn):  New operators.
        (riscv_ip): Ditto.
	* testsuite/gas/riscv/march-help.l: Updated.
        * testsuite/gas/riscv/zcmp-push-pop-fail.d: New test.
        * testsuite/gas/riscv/zcmp-push-pop-fail.l: New test.
        * testsuite/gas/riscv/zcmp-push-pop-fail.s: New test.
        * testsuite/gas/riscv/zcmp-push-pop.d: New test.
        * testsuite/gas/riscv/zcmp-push-pop.s: New test.

include/ChangeLog:

        * opcode/riscv-opc.h (MATCH/MASK_CM_PUSH): New macros for zcmp.
        (MATCH/MASK_CM_POP): Ditto.
        (MATCH/MASK_CM_POPRET): Ditto.
        (MATCH/MASK_CM_POPRETZ): Ditto.
        (DECLARE_INSN): New declarations for zcmp.
        * opcode/riscv.h (EXTRACT/ENCODE/VALID_ZCMP_SPIMM): Handle spimm
	operand for zcmp.
        (OP_MASK_REG_LIST): Handle operand for zcmp register list.
        (OP_SH_REG_LIST): Ditto.
        (ZCMP_SP_ALIGNMENT): New argument, used in riscv_get_sp_base.
        (X_S0, X_S1, X_S2, X_S10, X_S11): New register numbers.
        (enum riscv_insn_class): Added INSN_CLASS_ZCMP.
        (extern riscv_get_sp_base): Added.

opcodes/ChangeLog:

        * riscv-dis.c (print_reg_list): New function, used to get zcmp
	reg_list field.
        (riscv_get_spimm): New function, used to get zcmp sp adjustment
	immediate.
        (print_insn_args): Handle new operands for zcmp.
        * riscv-opc.c (riscv_get_sp_base): New function, used by gas and
	objdump.  Get sp base adjustment.
	(riscv_opcodes): Added zcmp instructions.
2024-04-09 15:56:12 +08:00
Cui, Lili
dd74a60337 Support APX NF
For the case when NDD and NF are both 0 in evex-promoted format,
we will fully support and test it in another patch.

gas/ChangeLog:

       * NEWS: Support Intel APX NF.
       * config/tc-i386.c (enum i386_error): Add unsupported_nf.
       (struct _i386_insn): Add has_nf.
       (is_apx_evex_encoding): Ditto.
       (build_apx_evex_prefix): Encode the NF bit.
       (md_assemble): Handle unsupported_nf.
       (parse_insn): Handle Prefix_NF and report bad for illegal combination.
       (can_convert_NDD_to_legacy): Replace i.tm.opcode_modifier.nf with i.has_nf.
       (match_template): Support D for APX_F insns and check NF support.
       * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Add bad test for NF bit.
       * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
       * testsuite/gas/i386/x86-64-apx-inval.l: Ditto.
       * testsuite/gas/i386/x86-64-apx-inval.s: Ditto.
       * testsuite/gas/i386/x86-64.exp: Add apx nf tests.
       * testsuite/gas/i386/x86-64-apx-nf-intel.d: New test.
       * testsuite/gas/i386/x86-64-apx-nf.d: Ditto.
       * testsuite/gas/i386/x86-64-apx-nf.s: Ditto.

opcodes/ChangeLog:

       * i386-dis-evex.h: Add %NF to the instructions that support APX NF and
       add new instruction imul, popcnt, tzcnt and lzcnt to EVEX table.
       * i386-dis-evex-reg.h: Ditto.
       * i386-dis.c (struct instr_info): Add nf.
       (struct dis386): Add "NF" for EVEX.NF.
       (get_valid_dis386): Set ins->vex.nf and report bad-nf for illegal case.
       (print_insn): Handle ins.vex.nf.
       (putop): Handle "%NF".
       * i386-opc.h (Prefix_NF): New.
       * i386-opc.tbl: Added new entries to support full APX NF instructions.
       * i386-mnem.h: Regenerated.
       * i386-tbl.h: Regenerated.
2024-04-07 17:28:25 +08:00
Cui, Lili
8963a60d7b x86/APX: Remove KEYLOCKER and SHA promotions from EVEX MAP4
APX spec removed KEYLOCKER and SHA promotions from EVEX MAP4.
https://www.intel.com/content/www/us/en/developer/articles/technical/advanced-performance-extensions-apx.html

gas/ChangeLog:

        * NEWS: Mention that remove KEYLOCKER and SHA promotions from EVEX
	* MAP4.
        * config/tc-i386.c (process_operands): Removed special handling of
	* KEYLOCKER and SHA.
        * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.l: Removed KEYLOCKER
        * and SHA instructions.
        * testsuite/gas/i386/x86-64-apx-egpr-promote-inval.s: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-bad.s: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-intel.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted-wig.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted.d: Ditto.
        * testsuite/gas/i386/x86-64-apx-evex-promoted.s: Ditto.

opcodes/ChangeLog:

        * i386-dis-evex-prefix.h: Removed KEYLOCKER and SHA instructions.
        * i386-dis-evex.h: Ditto.
        * i386-opc.tbl: Ditto.
        * i386-dis.c (print_vector_reg): Removed special handling of KEYLOCKER
	*  and SHA.
2024-04-03 09:50:00 +08:00
mengqinggang
7918b183ec LoongArch: gas: Ignore .align if it is at the start of a section
Ignore .align if it is at the start of a section and the alignment
can be divided by the section alignment, the section alignment
can ensure this .align has a correct alignment.
2024-04-01 09:38:17 +08:00
mengqinggang
daeda14191 BFD: Fix the bug of R_LARCH_AGLIN caused by discard section
To represent the first and third expression of .align, R_LARCH_ALIGN need to
associate with a symbol. We define a local symbol for R_LARCH_AGLIN.
But if the section of the local symbol is discarded, it may result in
a undefined symbol error.

Instead, we use the section name symbols, and this does not need to
add extra symbols.

During partial linking (ld -r), if the symbol associated with a relocation is
STT_SECTION type, the addend of relocation needs to add the section output
offset. We prevent it for R_LARCH_ALIGN.

The elf_backend_data.rela_normal only can set all relocations of a target to
rela_normal. Add a new function is_rela_normal to elf_backend_data, it can
set part of relocations to rela_normal.
2024-03-31 14:21:00 +08:00
Jan Beulich
b58829cdef x86/SSE2AVX: move checking
It has always been looking a little odd to me that this was done deep
in cpu_flags_match(). Move it to match_template() itself - there's no
need to do anything complex when encountering such a template while it
cannot possibly be used.
2024-03-28 11:55:53 +01:00
Jan Beulich
ebe82bfdb3 x86/SSE2AVX: respect prefixes
1) Without -msse2avx we unconditionally honor REX.W. Hence we ought to
   also do so with -msse2avx, converting to VEX.W.

2) {rex} doesn't prevent conversion to VEX encodings. Thus {rex2}
   shouldn't either.
2024-03-28 11:55:25 +01:00