Since we accept these without suffix / operand size specifier, we should
also do so with one. (The fact that we unilaterally accept these, other
than far branches, rather than limiting them to Intel64 mode, will be
taken care of later on.)
Also take the opportunity and make sure "lfs <reg>, tbyte ptr <mem>"
et al get rejected outside of 64-bit mode. This became broken by
dc2be329b9 ("i386: Only check suffix in instruction mnemonic").
Furthermore cover lgdt et al in the Intel syntax handling as well, which
continued to work after said commit just by coincidence.
While dc2be329b9 ("i386: Only check suffix in instruction mnemonic")
has made the assembler accept these in the first place (they were wrongly
rejected before), the generated code was still wrong in that it lacked
an operand size override. (In 64-bit code, other than in 16- and 32-bit
ones, CALL and JMP with memory operands are all entirely unambiguous: No
operand size can have two meanings.)
Test also memory operands with operand size specifier, which was broken
prior to dc2be329b9 ("i386: Only check suffix in instruction
mnemonic"), due to the template not permitting any suffixes. Note that
this uncovered a disassembler issue, which is being fixed here as well.
While segment registers are registers, their use doesn't allow sizing
of insns without suffix / explicit operand size specifier. Prevent
PUSH and POP of segment registers from entering that path, instead
allowing them to observe the stackop_size setting just like other
PUSH/POP and alike do.
Insns permitting only GPR operands (and hence implicit sizing when
there's no suffix) don't ever have their DefaultSize attribute
inspected, so it shouldn't be there in the first place.
Additionally XBEGIN is like JMP, not CALL, and hence shouldn't be
converted to 32-bit operand size in .code16gcc mode. While the same is
true for SYSRET, it permitting more than one suffix makes it FLDENV-
like, and hence rather than dropping the attribute, for now add it to
the exclusion list to avoid it getting an operand size prefix emitted
in .code16gcc mode. (This will be dealt with later, perhaps together
with FLDENV and friends.)
The flag controlling the default DWARF CIE version to produce now
starts with the value -1. This can be modified with the command line
flag as before, but after command line flag processing, in
md_after_parse_args targets can, if the global still has the value -1,
override this value. This gives a target specific default.
If a CIE version is not select either by command line flag, or a
target specific default, then some new code in dwarf2_init now select
a global default. This remains as version 1 to match previous
behaviour.
This RISC-V has a target specific default of version provided, this
make the return column uleb128, which means we can use all DWARF
registers include CSRs.
I chose to switch to version 3 rather than version 4 as this is most
similar to the global default (version 1). Switching to version 4
adds additional columns to the CIE header.
gas/ChangeLog:
* as.c (flag_dwarf_cie_version): Change initial value to -1, and
update comment.
* config/tc-riscv.c (riscv_after_parse_args): Set
flag_dwarf_cie_version if it has not already been set.
* dwarf2dbg.c (dwarf2_init): Initialise flag_dwarf_cie_version if
needed.
* testsuite/gas/riscv/default-cie-version.d: New file.
* testsuite/gas/riscv/default-cie-version.s: New file.
ld/ChangeLog:
* testsuite/ld-elf/eh5.d: Accept version 3 DWARF CIE.
Change-Id: Ibbfe8f0979fba480bf0a359978b09d2b3055555e
In version 1 of DWARF CIE format, the return register column is just a
single byte. For targets with large numbers of DWARF registers, any
use of a register with a high number for the return column
will (currently) silently overflow giving incorrect DWARF.
This commit adds an error when the overflow occurs.
gas/ChangeLog:
* dw2gencfi.c (output_cie): Error on return column overflow.
* testsuite/gas/riscv/cie-rtn-col-1.d: New file.
* testsuite/gas/riscv/cie-rtn-col-3.d: New file.
* testsuite/gas/riscv/cie-rtn-col.s: New file.
Change-Id: I1809f739ba7771737ec012807f0260e1a3ed5e64
This commit gives DWARF register numbers to the RISC-V CSRs inline
with the RISC-V ELF specification here:
https://github.com/riscv/riscv-elf-psabi-doc/blob/master/riscv-elf.md
The CSRs are defined being numbered from 4096 to 8191.
This adds support to the assembler, required in order to reference
CSRs in, for example .cfi directives.
I have then extended dwarf.c in order to support printing CSR names in
the dumped DWARF output. As the CSR name space is quite large and
only sparsely populated, I have provided a new function to perform
RISC-V DWARF register name lookup which uses a switch statement rather
than the table base approach that other architectures use.
Any CSR that does not have a known name will return a name based on
'csr%d' with the %d being replaced by the offset of the CSR from 4096.
gas/ChangeLog:
* config/tc-riscv.c (tc_riscv_regname_to_dw2regnum): Lookup CSR
names too.
* testsuite/gas/riscv/csr-dw-regnums.d: New file.
* testsuite/gas/riscv/csr-dw-regnums.s: New file.
binutils/ChangeLog:
* dwarf.c (regname_internal_riscv): New function.
(init_dwarf_regnames_riscv): Use new function.
Change-Id: I3f70bc24fa8b3c75744e6775eeeb87db70c7ecfb
We build a hash table of all register classes and numbers. The hash
key is the register name and the hash value is the class and number
encoded into a single value, which is of type 'void *'.
When we pull the values out of the hash we cast them to be a pointer
to a structure, however, we never access the fields of that structure,
we just decode the register class and number from the pointer value
itself.
This commit removes the structure and treats the encoded class and
number as a 'void *' during hash lookup.
gas/ChangeLog:
* config/tc-riscv.c (struct regname): Delete.
(hash_reg_names): Handle value as 'void *'.
Change-Id: Ie7d8f46ca3798f56f4af94395279de684f87f9cc
psb CYSNC was not finding that CSYNC was a correct spelling.
The problem was upper case version was being put in the
wrong hashtable. This fixes the problem by using the
correct hashtable.
Also adds testcases for the upper case versions.
* config/tc-aarch64.c (md_begin): Use correct
hash table for uppercase version of hint.
* testsuite/gas/aarch64/system-2.s: Extend psb case to uppercase.
* testsuite/gas/aarch64/system-2.d: Update.
Change-Id: If43f8b85cacd24840d596c3092b0345e5f212766
All symbols, sizes and relocations in this section are octets instead of
bytes. Required for DWARF debug sections as DWARF information is
organized in octets, not bytes.
bfd/
* section.c (struct bfd_section): New flag SEC_ELF_OCTETS.
* archures.c (bfd_octets_per_byte): New parameter sec.
If section is not NULL and SEC_ELF_OCTETS is set, one octet es
returned [ELF targets only].
* bfd.c (bfd_get_section_limit): Provide section parameter to
bfd_octets_per_byte.
* bfd-in2.h: regenerate.
* binary.c (binary_set_section_contents): Move call to
bfd_octets_per_byte into section loop. Provide section parameter
to bfd_octets_per_byte.
* coff-arm.c (coff_arm_reloc): Provide section parameter
to bfd_octets_per_byte.
* coff-i386.c (coff_i386_reloc): likewise.
* coff-mips.c (mips_reflo_reloc): likewise.
* coff-x86_64.c (coff_amd64_reloc): likewise.
* cofflink.c (_bfd_coff_link_input_bfd): likewise.
(_bfd_coff_reloc_link_order): likewise.
* elf.c (_bfd_elf_section_offset): likewise.
(_bfd_elf_make_section_from_shdr): likewise.
Set SEC_ELF_OCTETS for sections with names .gnu.build.attributes,
.debug*, .zdebug* and .note.gnu*.
* elf32-msp430.c (rl78_sym_diff_handler): Provide section parameter
to bfd_octets_per_byte.
* elf32-nds.c (nds32_elf_get_relocated_section_contents): likewise.
* elf32-ppc.c (ppc_elf_addr16_ha_reloc): likewise.
* elf32-pru.c (pru_elf32_do_ldi32_relocate): likewise.
* elf32-s12z.c (opru18_reloc): likewise.
* elf32-sh.c (sh_elf_reloc): likewise.
* elf32-spu.c (spu_elf_rel9): likewise.
* elf32-xtensa.c (bfd_elf_xtensa_reloc): likewise
* elf64-ppc.c (ppc64_elf_brtaken_reloc): likewise.
(ppc64_elf_addr16_ha_reloc): likewise.
(ppc64_elf_toc64_reloc): likewise.
* elflink.c (bfd_elf_final_link): likewise.
(bfd_elf_perform_complex_relocation): likewise.
(elf_fixup_link_order): likewise.
(elf_link_input_bfd): likewise.
(elf_link_sort_relocs): likewise.
(elf_reloc_link_order): likewise.
(resolve_section): likewise.
* linker.c (_bfd_generic_reloc_link_order): likewise.
(bfd_generic_define_common_symbol): likewise.
(default_data_link_order): likewise.
(default_indirect_link_order): likewise.
* srec.c (srec_set_section_contents): likewise.
(srec_write_section): likewise.
* syms.c (_bfd_stab_section_find_nearest_line): likewise.
* reloc.c (_bfd_final_link_relocate): likewise.
(bfd_generic_get_relocated_section_contents): likewise.
(bfd_install_relocation): likewise.
For section which have SEC_ELF_OCTETS set, multiply output_base
and output_offset with bfd_octets_per_byte.
(bfd_perform_relocation): likewise.
include/
* coff/ti.h (GET_SCNHDR_SIZE, PUT_SCNHDR_SIZE, GET_SCN_SCNLEN),
(PUT_SCN_SCNLEN): Adjust bfd_octets_per_byte calls.
binutils/
* objdump.c (disassemble_data): Provide section parameter to
bfd_octets_per_byte.
(dump_section): likewise
(dump_section_header): likewise. Show SEC_ELF_OCTETS flag if set.
gas/
* as.h: Define SEC_OCTETS as SEC_ELF_OCTETS if OBJ_ELF.
* dwarf2dbg.c: (dwarf2_finish): Set section flag SEC_OCTETS for
.debug_line, .debug_info, .debug_abbrev, .debug_aranges, .debug_str
and .debug_ranges sections.
* write.c (maybe_generate_build_notes): Set section flag
SEC_OCTETS for .gnu.build.attributes section.
* frags.c (frag_now_fix): Don't divide by OCTETS_PER_BYTE if
SEC_OCTETS is set.
* symbols.c (resolve_symbol_value): Likewise.
ld/
* ldexp.c (fold_name): Provide section parameter to
bfd_octets_per_byte.
* ldlang (init_opb): New argument s. Set opb_shift to 0 if
SEC_ELF_OCTETS for the current section is set.
(print_input_section): Pass current section to init_opb.
(print_data_statement,print_reloc_statement,
print_padding_statement): Likewise.
(lang_check_section_addresses): Call init_opb for each
section.
(lang_size_sections_1,lang_size_sections_1,
lang_do_assignments_1): Likewise.
(lang_process): Pass NULL to init_opb.
This patch changes the CRC extension to use the core feature bits instead
of the coproc/fpu feature bits.
CRC is not an fpu feature and it causes issues with the new fpu reset
patch (f439988037). CRC can be set using
the '.arch_extension' directive, which sets bits in the coproc bitfield. When
a '.fpu' directive is encountered, the CRC feature bit gets removed and
there is no way to set it back using '.fpu'.
With this patch, CRC will be marked in the feature core bits, which prevents
it from getting removed when setting/changing the fpu options.
gas/ChangeLog:
* config/tc-arm.c (arm_ext_crc): New.
(crc_ext_armv8): Remove.
(insns): Rename crc_ext_armv8 to arm_ext_crc.
(arm_cpus): Replace CRC_EXT_ARMV8 with ARM_EXT2_CRC.
(armv8a_ext_table, armv8r_ext_table,
arm_option_extension_value_table): Redefine the crc
extension in terms of ARM_EXT2_CRC.
* gas/testsuite/gas/arm/crc-ext.s: New.
* gas/testsuite/gas/arm/crc-ext.d: New.
include/ChangeLog:
* opcode/arm.h (ARM_EXT2_CRC): New extension feature
to replace CRC_EXT_ARMV8.
(CRC_EXT_ARMV8): Remove and mark bit as unused.
(ARM_ARCH_V8A_CRC, ARM_ARCH_V8_1A, ARM_ARCH_V8_2A,
ARM_ARCH_V8_3A, ARM_ARCH_V8_4A, ARM_ARCH_V8_5A,
ARM_ARCH_V8_6A): Redefine using ARM_EXT2_CRC instead of
CRC_EXT_ARMV8.
opcodes/ChangeLog:
* opcodes/arm-dis.c (arm_opcodes, thumb32_opcodes):
Change the coproc CRC conditions to use the extension
feature set, second word, base on ARM_EXT2_CRC.
Add a flag to control the version of CIE that is generated. By
default gas produces CIE version 1, and this continues to be the
default after this patch.
However, a user can now provide --gdwarf-cie-version=NUMBER to switch
to either version 3 or version 4 of CIE, version 2 was never released,
and so causes an error as does any number less than 1 or greater than
4.
Producing version 4 CIE requires two new fields to be added to the
CIE, an address size field, and an segment selector field. For a flat
address space the DWARF specification indicates that the segment
selector should be 0, and the address size fields just contains the
address size in bytes. For now we support 4 or 8 byte addresses, and
the segment selector is always produced as 0. At some future time we
might need to allow targets to override this.
gas/ChangeLog:
* as.c (parse_args): Parse --gdwarf-cie-version option.
(flag_dwarf_cie_version): New variable.
* as.h (flag_dwarf_cie_version): Declare.
* dw2gencfi.c (output_cie): Switch from DW_CIE_VERSION to
flag_dwarf_cie_version.
* doc/as.texi (Overview): Document --gdwarf-cie-version.
* NEWS: Likewise.
* testsuite/gas/cfi/cfi.exp: Add new tests.
* testsuite/gas/cfi/cie-version-0.d: New file.
* testsuite/gas/cfi/cie-version-1.d: New file.
* testsuite/gas/cfi/cie-version-2.d: New file.
* testsuite/gas/cfi/cie-version-3.d: New file.
* testsuite/gas/cfi/cie-version-4.d: New file.
* testsuite/gas/cfi/cie-version.s: New file.
include/ChangeLog:
* dwarf2.h (DW_CIE_VERSION): Delete.
Change-Id: I9de19461aeb8332b5a57bbfe802953d0725a7ae8
... instead of an operand one. Which operand it applies to can be
determined from other operand properties, but as it turns out the only
place it is actually used at doesn't even need further qualification.
The CMPS test case derivation from their MOVS counterparts I did in
d241b91073 ("x86/Intel: correct MOVSD and CMPSD handling") ended up
with misplaced closing parentheses in som regexps. Correct this.
We have to enable the f extension through -march or ELF attribute if we use the
FPR in .insn directive. The behavior is same as the riscv_opcodes.
2019-11-12 Nelson Chu <nelson.chu@sifive.com>
opcodes/
* riscv-opc.c (riscv_insn_types): Replace the INSN_CLASS_I with
INSN_CLASS_F and the INSN_CLASS_C with INSN_CLASS_F_AND_C if we
use the floating point register (FPR).
gas/
* testsuite/gas/riscv/insn.d: Add the f extension to -march option.
Change-Id: I4f59d04c82673ef84c56ecd2659ad8ce164dd626
This patch enables a few instructions for Armv8.1-M MVE. Currently VLDM,
VSTM, VSTR, VLDR, VPUSH and VPOP are enabled only when the Armv8-M
Floating-point Extension is enabled. According to the ARMv8.1-M ARM,
section A.1.4.2[1], they can be enabled by having "Armv8-M Floating-point
Extension and/or Armv8.1-M MVE".
[1]https://developer.arm.com/docs/ddi0553/bh/armv81-m-architecture-reference-manual
2019-11-12 Mihail Ionescu <mihail.ionescu@arm.com>
* config/tc-arm.c (do_vfp_nsyn_push): Move in order to enable it for
both fpu_vfp_ext_v1xd and mve_ext and add call to the aliased vstm
instruction for mve_ext.
(do_vfp_nsyn_pop): Move in order to enable it for both
fpu_vfp_ext_v1xd and mve_ext and add call to the aliased vldm
instruction for mve_ext.
(do_neon_ldm_stm): Add fpu_vfp_ext_v1 and mve_ext checks.
(insns): Enable vldm, vldmia, vldmdb, vstm, vstmia, vstmdb, vpop,
vpush, and fldd, fstd, flds, fsts for arm_ext_v6t2 instead
of fpu_vfp_ext_v1xd.
* testsuite/gas/arm/v8_1m-mve.s: New.
* testsuite/gas/arm/v8_1m-mve.d: New.
This patch updates the decoding of the VMOV and VMVN instructions which depend on cmode.
Previously VMOV and VMVN with cmode 1101 were not allowed.
The cmode changes also required updating of the MVE conflict checking.
Now instructions with opcodes 0xef800d50 and 0xef800e70 correctly get decoded as VMOV
and VMVN, respectively.
2019-11-12 Mihail Ionescu <mihail.ionescu@arm.com>
* opcodes/arm-dis.c (mve_opcodes): Enable VMOV imm to vec with
cmode 1101.
(is_mve_encoding_conflict): Update cmode conflict checks for
MVE_VMVN_IMM.
2019-11-12 Mihail Ionescu <mihail.ionescu@arm.com>
* gas/config/tc-arm.c (do_neon_mvn): Allow mve_ext cmode=0xd.
* testsuite/gas/arm/mve-vmov-vmvn-vorr-vbic.s: New test.
* testsuite/gas/arm/mve-vmov-vmvn-vorr-vbic.d: Likewise.
EsSeg (a per-operand bit) is used with IsString (a per-insn attribute)
only. Extend the attribute to 2 bits, thus allowing to encode
- not a string insn,
- string insn with neither operand requiring use of %es:,
- string insn with 1st operand requiring use of %es:,
- string insn with 2nd operand requiring use of %es:,
which covers all possible cases, allowing to drop EsSeg.
The (transient) need to comment out the OTUnused #define did uncover an
oversight in the earlier OTMax -> OTNum conversion, which is being taken
care of here.
Drop the remaining instances left in place by commit c3949f432f ("x86:
limit ImmExt abuse), now that we have a way to specify specific GPRs.
Take the opportunity and also introduce proper 16-bit forms of
applicable SVME insns as well as 1-operand forms of CLZERO.
Special register "class" instances can't be combined with one another
(neither in templates nor in register entries), and hence it is not a
good use of resources (memory as well as execution time) to represent
them as individual bits of a bit field.
Furthermore the generalization becoming possible will allow
improvements to the handling of insns accepting only individual
registers as their operands.
We should check suffix in instruction mnemonic when matching instruction.
In Intel syntax, normally we check for memory operand size. But the same
mnemonic with 2 different encodings can have the same memory operand
size and i.suffix is set to LONG_DOUBLE_MNEM_SUFFIX from memory operand
size in Intel syntax to distinguish them. When there is no suffix in
mnemonic, we check LONG_DOUBLE_MNEM_SUFFIX in i.suffix for mnemonic
suffix.
gas/
PR gas/25167
* config/tc-i386.c (match_template): Don't check instruction
suffix set from operand.
* testsuite/gas/i386/code16.d: New file.
* testsuite/gas/i386/code16.s: Likewise.
* testsuite/gas/i386/i386.exp: Run code16.
* testsuite/gas/i386/x86-64-branch-4.l: Updated.
opcodes/
PR gas/25167
* i386-opc.tbl: Remove IgnoreSize from cmpsd and movsd.
* i386-tbl.h: Regenerated.
Many operand types, in particular the various kinds of registers, can't
be combined with one another (neither in templates nor in register
entries), and hence it is not a good use of resources (memory as well as
execution time) to represent them as individual bits of a bit field.
Hi,
This patch is part of a series that adds support for Armv8.6-A
to binutils.
In this last patch, the new Data Gathering Hint mnemonic is introduced.
Committed on behalf of Mihail Ionescu.
gas/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* testsuite/gas/aarch64/dgh.s: New test.
* testsuite/gas/aarch64/dgh.d: New test.
opcodes/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* opcodes/aarch64-tbl.h (V8_6_INSN): New macro for v8.6 instructions.
(aarch64_opcode_table): Add data gathering hint mnemonic.
* opcodes/aarch64-dis-2.c: Account for new instruction.
Is it ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.
This patch introduces the Matrix Multiply (Int8, F32, F64) extensions
to the arm backend.
The following Matrix Multiply instructions are added: vummla, vsmmla,
vusmmla, vusdot, vsudot[1].
[1]https://developer.arm.com/docs/ddi0597/latest/simd-and-floating-point-instructions-alphabetic-order
Committed on behalf of Mihail Ionescu.
gas/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* config/tc-arm.c (arm_ext_i8mm): New feature set.
(do_vusdot): New.
(do_vsudot): New.
(do_vsmmla): New.
(do_vummla): New.
(insns): Add vsmmla, vummla, vusmmla, vusdot, vsudot mnemonics.
(armv86a_ext_table): Add i8mm extension.
(arm_extensions): Move bf16 extension to context sensitive table.
(armv82a_ext_table, armv84a_ext_table, armv85a_ext_table):
Move bf16 extension to context sensitive table.
(armv86a_ext_table): Add i8mm extension.
* doc/c-arm.texi: Document i8mm extension.
* testsuite/gas/arm/i8mm.s: New test.
* testsuite/gas/arm/i8mm.d: New test.
* testsuite/gas/arm/bfloat17-cmdline-bad-3.d: Update test.
include/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* opcode/arm.h (ARM_EXT2_I8MM): New feature macro.
opcodes/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* arm-dis.c (neon_opcodes): Add i8mm SIMD instructions.
Regression tested on arm-none-eabi.
Is this ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.
This patch introduces the Matrix Multiply (Int8, F32, F64) extensions
to the aarch64 backend.
The following instructions are added: {s/u}mmla, usmmla, {us/su}dot,
fmmla, ld1rob, ld1roh, d1row, ld1rod, uzip{1/2}, trn{1/2}.
Committed on behalf of Mihail Ionescu.
gas/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* config/tc-aarch64.c: Add new arch fetures to suppport the mm extension.
(parse_operands): Add new operand.
* testsuite/gas/aarch64/i8mm.s: New test.
* testsuite/gas/aarch64/i8mm.d: New test.
* testsuite/gas/aarch64/f32mm.s: New test.
* testsuite/gas/aarch64/f32mm.d: New test.
* testsuite/gas/aarch64/f64mm.s: New test.
* testsuite/gas/aarch64/f64mm.d: New test.
* testsuite/gas/aarch64/sve-movprfx-mm.s: New test.
* testsuite/gas/aarch64/sve-movprfx-mm.d: New test.
include/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* opcode/aarch64.h (AARCH64_FEATURE_I8MM): New.
(AARCH64_FEATURE_F32MM): New.
(AARCH64_FEATURE_F64MM): New.
(AARCH64_OPND_SVE_ADDR_RI_S4x32): New.
(enum aarch64_insn_class): Add new instruction class "aarch64_misc" for
instructions that do not require special handling.
opcodes/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
* aarch64-tbl.h (aarch64_feature_i8mm_sve, aarch64_feature_f32mm_sve,
aarch64_feature_f64mm_sve, aarch64_feature_i8mm, aarch64_feature_f32mm,
aarch64_feature_f64mm): New feature sets.
(INT8MATMUL_INSN, F64MATMUL_SVE_INSN, F64MATMUL_INSN,
F32MATMUL_SVE_INSN, F32MATMUL_INSN): New macros to define matrix multiply
instructions.
(I8MM_SVE, F32MM_SVE, F64MM_SVE, I8MM, F32MM, F64MM): New feature set
macros.
(QL_MMLA64, OP_SVE_SBB): New qualifiers.
(OP_SVE_QQQ): New qualifier.
(INT8MATMUL_SVE_INSNC, F64MATMUL_SVE_INSNC,
F32MATMUL_SVE_INSNC): New feature set for bfloat16 instructions to support
the movprfx constraint.
(aarch64_opcode_table): Support for SVE_ADDR_RI_S4x32.
(aarch64_opcode_table): Define new instructions smmla,
ummla, usmmla, usdot, sudot, fmmla, ld1rob, ld1roh, ld1row, ld1rod
uzip{1/2}, trn{1/2}.
* aarch64-opc.c (operand_general_constraint_met_p): Handle
AARCH64_OPND_SVE_ADDR_RI_S4x32.
(aarch64_print_operand): Handle AARCH64_OPND_SVE_ADDR_RI_S4x32.
* aarch64-dis-2.c (aarch64_opcode_lookup_1, aarch64_find_next_opcode):
Account for new instructions.
* opcodes/aarch64-asm-2.c (aarch64_insert_operand): Support the new
S4x32 operand.
* aarch64-opc-2.c (aarch64_operands): Support the new S4x32 operand.
Regression tested on arm-none-eabi.
Is it ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.
This patch implements the '.bfloat' directive for the AArch64 backend.
The syntax for the directive is:
.bfloat16 <0-n numbers>
e.g.
.bfloat16 12.0
.bfloat16 0.123, 1.0, NaN, 5
This is implemented by utilizing the ieee_atof_detail function in order
to encode the slightly
different bfloat16 format.
Added testcases to verify the correct encoding for various bfloat16
values (NaN, Infinity (+ & -), normals, subnormals etc...).
Cross compiled and tested on aarch64-none-elf and aarch64-none-linux-gnu
with no issues.
Committed on behalf of Mihail Ionescu.
gas/ChangeLog:
2019-10-29 Mihail Ionescu <mihail.ionescu@arm.com>
2019-10-29 Barnaby Wilks <barnaby.wilks@arm.com>
* config/tc-aarch64.c (md_atof): Add encoding for the bfloat16 format.
* testsuite/gas/aarch64/bfloat16-directive-le.d: New test.
* testsuite/gas/aarch64/bfloat16-directive-be.d: New test.
* testsuite/gas/aarch64/bfloat16-directive.s: New test.
Is it ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.
This patch implements the '.bfloat16' directive for the Arm backend.
The syntax for the directive is:
.bfloat16 <0-n numbers>
e.g.
.bfloat16 12.0
.bfloat16 0.123, 1.0, NaN, 5
This is implemented by utilizing the ieee_atof_detail function (included
in the previous patch) in order to encode the slightly different
bfloat16 format.
Added testcases to verify the correct encoding for various bfloat16
values (NaN, Infinity (+ & -), normals, subnormals etc...).
Cross compiled and tested on arm-none-eabi and arm-none-linux-gnueabihf
with no issues.
Committed on behalf of Mihail Ionescu.
gas/ChangeLog:
2019-10-21 Mihail Ionescu <mihail.ionescu@arm.com>
2019-10-21 Barnaby Wilks <barnaby.wilks@arm.com>
* config/tc-arm.c (md_atof): Add encoding for bfloat16
* testsuite/gas/arm/bfloat16-directive-le.d: New test.
* testsuite/gas/arm/bfloat16-directive-be.d: New test.
* testsuite/gas/arm/bfloat16-directive.s: New test.
Is it ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions).
This patch contains some general refactoring of the atof_ieee
function, exposing a function that allows a higher level of control
over the format of IEEE-like floating point numbers.
This has been done in order to be able to add a directive for assembling
floating point literals in the bfloat16 format in the following patches.
Committed on behalf of Mihail Ionescu.
Tested on arm-none-eabi, arm-none-linux-gnueabihf, aarch64-none-elf
and aarch64-none-linux-gnuwith no issues.
gas/ChangeLog:
2019-10-21 Mihail Ionescu <mihail.ionescu@arm.com>
2019-10-21 Barnaby Wilks <barnaby.wilks@arm.com>
* as.h (atof_ieee_detail): Add prototype for atof_ieee_detail function.
(atof_ieee): Move some code into the atof_ieee_detail function.
(atof_ieee_detail): Add function that provides a higher level of control over generating
IEEE-like numbers.
Is it ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.
This patch introduces BFloat16 instructions to the arm backend.
The following BFloat16 instructions are added: vdot, vfma{l/t},
vmmla, vfmal{t/b}, vcvt, vcvt{t/b}.
gas/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* config/tc-arm.c (arm_archs): Add armv8.6-a option.
(cpu_arch_ver): Add TAG_CPU_ARCH_V8 tag for Armv8.6-a.
* doc/c-arm.texi (-march): New armv8.6-a arch.
* config/tc-arm.c (arm_ext_bf16): New feature set.
(enum neon_el_type): Add NT_bfloat value.
(B_MNEM_vfmat, B_MNEM_vfmab): New bfloat16 encoder
helpers.
(BAD_BF16): New message.
(parse_neon_type): Add bf16 type specifier.
(enum neon_type_mask): Add N_BF16 type.
(type_chk_of_el_type): Account for NT_bfloat.
(el_type_of_type_chk): Account for N_BF16.
(neon_three_args): Split out from neon_three_same.
(neon_three_same): Part split out into neon_three_args.
(CVT_FLAVOUR_VAR): Add bf16_f32 cvt flavour.
(do_neon_cvt_1): Account for vcvt.bf16.f32.
(do_bfloat_vmla): New.
(do_mve_vfma): New function to deal with the mnemonic clash between the BF16
vfmat and the MVE vfma in a VPT block with a 't'rue condition.
(do_neon_cvttb_1): Account for vcvt{t,b}.bf16.f32.
(do_vdot): New
(do_vmmla): New
(insns): Add vdot and vmmla mnemonics.
(arm_extensions): Add "bf16" extension.
* doc/c-arm.texi: Document "bf16" extension.
* testsuite/gas/arm/attr-march-armv8_6-a.d: New test.
* testsuite/gas/arm/bfloat16-bad.d: New test.
* testsuite/gas/arm/bfloat16-bad.l: New test.
* testsuite/gas/arm/bfloat16-bad.s: New test.
* testsuite/gas/arm/bfloat16-cmdline-bad-2.d: New test.
* testsuite/gas/arm/bfloat16-cmdline-bad-3.d: New test.
* testsuite/gas/arm/bfloat16-cmdline-bad.d: New test.
* testsuite/gas/arm/bfloat16-neon.s: New test.
* testsuite/gas/arm/bfloat16-non-neon.s: New test.
* testsuite/gas/arm/bfloat16-thumb-bad.d: New test.
* testsuite/gas/arm/bfloat16-thumb-bad.l: New test.
* testsuite/gas/arm/bfloat16-thumb.d: New test.
* testsuite/gas/arm/bfloat16-vfp.d: New test.
* testsuite/gas/arm/bfloat16.d: New test.
* testsuite/gas/arm/bfloat16.s: New test.
include/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* opcode/arm.h (ARM_EXT2_V8_6A, ARM_AEXT2_V8_6A,
ARM_ARCH_V8_6A): New.
* opcode/arm.h (ARM_EXT2_BF16): New feature macro.
(ARM_AEXT2_V8_6A): Include above macro in definition.
opcodes/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* arm-dis.c (select_arm_features): Update bfd_march_arm_8 with
Armv8.6-A.
(coprocessor_opcodes): Add bfloat16 vcvt{t,b}.
(neon_opcodes): Add bfloat SIMD instructions.
(print_insn_coprocessor): Add new control character %b to print
condition code without checking cp_num.
(print_insn_neon): Account for BFloat16 instructions that have no
special top-byte handling.
Regression tested on arm-none-eabi.
Is it ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
(Matrix Multiply and BFloat16 extensions) to binutils.
This patch introduces the following BFloat16 instructions to the
aarch64 backend: bfdot, bfmmla, bfcvt, bfcvtnt, bfmlal[t/b],
bfcvtn2.
Committed on behalf of Mihail Ionescu.
gas/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* config/tc-aarch64.c (vectype_to_qualifier): Special case the
S_2H operand qualifier.
* doc/c-aarch64.texi: Document bf16 and bf16mmla4 extensions.
* testsuite/gas/aarch64/bfloat16.d: New test.
* testsuite/gas/aarch64/bfloat16.s: New test.
* testsuite/gas/aarch64/illegal-bfloat16.d: New test.
* testsuite/gas/aarch64/illegal-bfloat16.l: New test.
* testsuite/gas/aarch64/illegal-bfloat16.s: New test.
* testsuite/gas/aarch64/sve-bfloat-movprfx.s: New test.
* testsuite/gas/aarch64/sve-bfloat-movprfx.d: New test.
include/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* opcode/aarch64.h (AARCH64_FEATURE_BFLOAT16): New feature macros.
(AARCH64_ARCH_V8_6): Include BFloat16 feature macros.
(enum aarch64_opnd_qualifier): Introduce new operand qualifier
AARCH64_OPND_QLF_S_2H.
(enum aarch64_insn_class): Introduce new class "bfloat16".
(BFLOAT16_SVE_INSNC): New feature set for bfloat16
instructions to support the movprfx constraint.
opcodes/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* aarch64-asm.c (aarch64_ins_reglane): Use AARCH64_OPND_QLF_S_2H
in reglane special case.
* aarch64-dis-2.c (aarch64_opcode_lookup_1,
aarch64_find_next_opcode): Account for new instructions.
* aarch64-dis.c (aarch64_ext_reglane): Use AARCH64_OPND_QLF_S_2H
in reglane special case.
* aarch64-opc.c (struct operand_qualifier_data): Add data for
new AARCH64_OPND_QLF_S_2H qualifier.
* aarch64-tbl.h (QL_BFDOT QL_BFDOT64, QL_BFDOT64I, QL_BFMMLA2,
QL_BFCVT64, QL_BFCVTN64, QL_BFCVTN2_64): New qualifiers.
(aarch64_feature_bfloat16, aarch64_feature_bfloat16_sve,
aarch64_feature_bfloat16_bfmmla4): New feature sets.
(BFLOAT_SVE, BFLOAT): New feature set macros.
(BFLOAT_SVE_INSN, BFLOAT_BFMMLA4_INSN, BFLOAT_INSN): New macros
to define BFloat16 instructions.
(aarch64_opcode_table): Define new instructions bfdot,
bfmmla, bfcvt, bfcvtnt, bfdot, bfdot, bfcvtn, bfmlal[b/t]
bfcvtn2, bfcvt.
Regression tested on aarch64-elf.
Is it ok for trunk?
Regards,
Mihail
Hi,
This patch is part of a series that adds support for Armv8.6-A
to binutils.
This first patch adds the Armv8.6-A flag to binutils.
No instructions are behind it at the moment.
Commited on behalf of Mihail Ionescu.
gas/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* config/tc-aarch64.c (armv8.6-a): New arch.
* doc/c-aarch64.texi (armv8.6-a): Document new arch.
include/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* opcode/aarch64.h (AARCH64_FEATURE_V8_6): New.
(AARCH64_ARCH_V8_6): New.
opcodes/ChangeLog:
2019-11-07 Mihail Ionescu <mihail.ionescu@arm.com>
2019-11-07 Matthew Malcomson <matthew.malcomson@arm.com>
* aarch64-tbl.h (ARMV8_6): New macro.
Is it ok for trunk?
Regards,
Mihail
As the comments (here: almost, in the opcode table: fully) correctly
state - all register operands except MONITOR's address one are fixed
at 32 bit size. Don't print 64-bit registers there.
Also adjust x86-64-suffix.d's name such that it wouldn't be identical to
x86-64-rep-suffix.d's, but instead resemble that of its sibling
x86-64-suffix-intel.d.
Alter the sequence of conditions evaluated, without affecting the
overall result. This is going to help subsequent changes (and as a nice
side effect also slightly reduces overall indentation depth).
While doing this take the liberty of simplifying the calculation of the
operand index of the register operand in ShortForm handling.
If the extension is not found in the context sensitive table, the legacy
tables are still checked as a fallback. This is particularly useful for
Armv8.1-M as it enables the use of '.arch_extension' with the 'mve' and
'mve.fp' extensions which are not part of the legacy table.
* config/tc-arm.c (selected_ctx_ext_table) New static variable.
(arm_parse_arch): Set context sensitive extension table based on the
chosen base architecture.
(s_arm_arch_extension): Change to lookup extensions in the new context
sensitive tables.
* gas/testsuite/gas/arm/mve-ext.s: New.
* gas/testsuite/gas/arm/mve-ext.d: New.
* gas/testsuite/gas/arm/mvefp-ext.s: New.
* gas/testsuite/gas/arm/mvefp-ext.d: New.
This is a shorthand for the immediate argument being 0, as described here:
https://developer.arm.com/docs/ddi0596/latest/base-instructions-alphabetic-order/ldraa-ldrab-load-register-with-pointer-authentication
This is because the instructions still have a use with an immediate
argument of 0, unlike loads without the PAC functionality. Currently,
the mnemonics are
LDRAA Xt, [Xn, #<simm10>]!
LDRAB Xt, [Xn, #<simm10>]!
After this patch they become
LDRAA Xt, [Xn {, #<simm10>}]!
LDRAB Xt, [Xn {, #<simm10>}]!
gas * config/tc-aarch64.c (parse_address_main): Accept the omission of
the immediate argument for ldraa and ldrab as a shorthand for the
immediate being 0.
* testsuite/gas/aarch64/ldraa-ldrab-no-offset.d: New test.
* testsuite/gas/aarch64/ldraa-ldrab-no-offset.s: New test.
* testsuite/gas/aarch64/illegal-ldraa.s: Modified to accept the
writeback form with no offset.
* testsuite/gas/aarch64/illegal-ldraa.s: Removed missing offset
error.
opcodes * aarch64-opc.c (print_immediate_offset_address): Don't print the
immediate for the writeback form of ldraa/ldrab if it is 0.
* aarch64-tbl.h: Updated the documentation for ADDR_SIMM10.
* aarch64-opc-2.c: Regenerated.