Unaligned load/store instructions on aligned memory or register are as
fast as aligned load/store instructions on modern Intel processors. Add
a command-line option, -muse-unaligned-vector-move, to x86 assembler to
encode encode aligned vector load/store instructions as unaligned
vector load/store instructions.
* NEWS: Mention -muse-unaligned-vector-move.
* config/tc-i386.c (use_unaligned_vector_move): New.
(encode_with_unaligned_vector_move): Likewise.
(md_assemble): Call encode_with_unaligned_vector_move for
-muse-unaligned-vector-move.
(OPTION_MUSE_UNALIGNED_VECTOR_MOVE): New.
(md_longopts): Add -muse-unaligned-vector-move.
(md_parse_option): Handle -muse-unaligned-vector-move.
(md_show_usage): Add -muse-unaligned-vector-move.
* doc/c-i386.texi: Document -muse-unaligned-vector-move.
* testsuite/gas/i386/i386.exp: Run unaligned-vector-move and
x86-64-unaligned-vector-move.
* testsuite/gas/i386/unaligned-vector-move.d: New file.
* testsuite/gas/i386/unaligned-vector-move.s: Likewise.
* testsuite/gas/i386/x86-64-unaligned-vector-move.d: Likewise.
Patch is adding Cortex-R52+ as 'cortex-r52plus' command line
flag for -mcpu option.
bfd/
* cpu-arm.c: New Cortex-R52+ CPU.
gas/
* NEWS: Update docs.
* config/tc-arm.c: New Cortex-R52+ CPU.
* doc/c-arm.texi: Update docs.
* testsuite/gas/arm/cpu-cortex-r52plus.d: New test.
Patch is adding new 'armv9-a` command line flag to -march for AArch64.
gas/
* config/tc-aarch64.c: Add 'armv9-a' command line flag.
* docs/c-aarch64.text: Update docs.
* NEWS: Update docs.
include/
* opcode/aarch64.h (AARCH64_FEATURE_V9): New define.
(AARCH64_ARCH_V9): New define.
The .insn directive can let users use their own instructions, or
some new instruction, which haven't supported in the old binutils.
For example, if users want to use sifive cache instruction, they
cannot just write "cflush.d1.l1" in the assembly code, they should
use ".insn i SYSTEM, 0, x0, x10, -0x40". But the .insn directive
may not easy to use for some cases, and not so friendly to users.
Therefore, I believe most of the users will use ".word 0xfc050073",
to encode the instructions directly, rather than use .insn. But
once we have supported the mapping symbols, the .word directives
are marked as data, so disassembler won't dump them as instructions
as usual. I have discussed this with Kito many times, we all think
extend the .insn direcitve to support the hardcode encoding, is the
easiest way to resolve the problem. Therefore, there are two more
.insn formats are proposed as follows,
(original) .insn <type>, <operand1>, <operand2>, ...
.insn <insn-length>, <value>
.insn <value>
The <type> is string, and the <insn-length> and <value> are constants.
gas/
* config/tc-riscv.c (riscv_ip_hardcode): Similar to riscv_ip,
but assembles an instruction according to the hardcode values
of .insn directive.
* doc/c-riscv.texi: Document two new .insn formats.
* testsuite/gas/riscv/insn-fail.d: New testcases.
* testsuite/gas/riscv/insn-fail.l: Likewise.
* testsuite/gas/riscv/insn-fail.s: Likewise.
* testsuite/gas/riscv/insn.d: Updated.
* testsuite/gas/riscv/insn.s: Likewise.
Some frontends, like the gcc Objective-C frontend, emit symbols with $
characters in them. The AVR target code in gas treats $ as a line separator,
so the code doesn?t assemble correctly.
Provide a machine-specific option to disable treating $ as a line separator.
* config/tc-avr.c (enum options): Add option flag.
(struct option): Add option -mno-dollar-line-separator.
(md_parse_option): Adjust treatment of $ when option is present.
* config/tc-avr.h: Use avr_line_separator_chars.
This is to be able to generate data acted upon by AVX512-BF16 and
AMX-BF16 insns. While not part of the IEEE standard, the format is
sufficiently standardized to warrant handling in config/atof-ieee.c.
Arm, where custom handling was implemented, may want to leverage this as
well. To be able to also use the hex forms supported for other floating
point formats, a small addition to the generic hex_float() is needed.
Extend existing x86 testcases.
This is to be able to generate data passed to {,V}CVTPH2PS and acted
upon by AVX512-FP16 insns. To be able to also use the hex forms
supported for other floating point formats, a small addition to the
generic hex_float() is needed.
Extend existing x86 testcases.
The ELF psABI-s are quite clear here: On 32-bit the underlying data type
is 12 bytes long (with 2 bytes of trailing padding), while on 64-bit it
is 16 bytes long (with 6 bytes of padding). Make s_space() capable of
handling 'x' (and 'p') type floating point being other than 12 bytes
wide (also adjusting documentation). This requires duplicating the
definition of X_PRECISION in the target speciifc header; the compiler
would complain if this was out of sync with config/atof-ieee.c.
Note that for now padding space doesn't get separated from actual
storage, which means that things will work correctly only for little-
endian cases, and which also means that by specifying large enough
numbers padding space can be set to non-zero. Since the logic is needed
for a single little-endian architecture only for now, I'm hoping that
this might be acceptable for the time being; otherwise the change will
become more intrusive.
Note also that this brings the emitted data size of .ds.x vs .tfloat in
line for non-ELF targets as well; the issue will be even more obvious
when further taking into account a subsequent patch fixing .dc.x/.dcb.x
(where output sizes currently differ depending on input format).
Extend existing x86 testcases.
Intel AVX512 FP16 instructions use maps 3, 5 and 6. Maps 5 and 6 use 3 bits
in the EVEX.mmm field (0b101, 0b110). Map 5 is for instructions that were FP32
in map 1 (0Fxx). Map 6 is for instructions that were FP32 in map 2 (0F38xx).
There are some exceptions to this rule. Some things in map 1 (0Fxx) with imm8
operands predated our current conventions; those instructions moved to map 3.
FP32 things in map 3 (0F3Axx) found new opcodes in map3 for FP16 because map3
is very sparsely populated. Most of the FP16 instructions share opcodes and
prefix (EVEX.pp) bits with the related FP32 operations.
Intel AVX512 FP16 instructions has new displacements scaling rules, please refer
to the public software developer manual for detail information.
gas/
2021-08-05 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
H.J. Lu <hongjiu.lu@intel.com>
Wei Xiao <wei3.xiao@intel.com>
Lili Cui <lili.cui@intel.com>
* config/tc-i386.c (struct Broadcast_Operation): Adjust comment.
(cpu_arch): Add .avx512_fp16.
(cpu_noarch): Add noavx512_fp16.
(pte): Add evexmap5 and evexmap6.
(build_evex_prefix): Handle EVEXMAP5 and EVEXMAP6.
(check_VecOperations): Handle {1to32}.
(check_VecOperands): Handle CheckRegNumb.
(check_word_reg): Handle Toqword.
(i386_error): Add invalid_dest_and_src_register_set.
(match_template): Handle invalid_dest_and_src_register_set.
* doc/c-i386.texi: Document avx512_fp16, noavx512_fp16.
opcodes/
2021-08-05 Igor Tsimbalist <igor.v.tsimbalist@intel.com>
H.J. Lu <hongjiu.lu@intel.com>
Wei Xiao <wei3.xiao@intel.com>
Lili Cui <lili.cui@intel.com>
* i386-dis.c (EXwScalarS): New.
(EXxh): Ditto.
(EXxhc): Ditto.
(EXxmmqh): Ditto.
(EXxmmqdh): Ditto.
(EXEvexXwb): Ditto.
(DistinctDest_Fixup): Ditto.
(enum): Add xh_mode, evex_half_bcst_xmmqh_mode, evex_half_bcst_xmmqdh_mode
and w_swap_mode.
(enum): Add PREFIX_EVEX_0F3A08_W_0, PREFIX_EVEX_0F3A0A_W_0,
PREFIX_EVEX_0F3A26, PREFIX_EVEX_0F3A27, PREFIX_EVEX_0F3A56,
PREFIX_EVEX_0F3A57, PREFIX_EVEX_0F3A66, PREFIX_EVEX_0F3A67,
PREFIX_EVEX_0F3AC2, PREFIX_EVEX_MAP5_10, PREFIX_EVEX_MAP5_11,
PREFIX_EVEX_MAP5_1D, PREFIX_EVEX_MAP5_2A, PREFIX_EVEX_MAP5_2C,
PREFIX_EVEX_MAP5_2D, PREFIX_EVEX_MAP5_2E, PREFIX_EVEX_MAP5_2F,
PREFIX_EVEX_MAP5_51, PREFIX_EVEX_MAP5_58, PREFIX_EVEX_MAP5_59,
PREFIX_EVEX_MAP5_5A_W_0, PREFIX_EVEX_MAP5_5A_W_1,
PREFIX_EVEX_MAP5_5B_W_0, PREFIX_EVEX_MAP5_5B_W_1,
PREFIX_EVEX_MAP5_5C, PREFIX_EVEX_MAP5_5D, PREFIX_EVEX_MAP5_5E,
PREFIX_EVEX_MAP5_5F, PREFIX_EVEX_MAP5_78, PREFIX_EVEX_MAP5_79,
PREFIX_EVEX_MAP5_7A, PREFIX_EVEX_MAP5_7B, PREFIX_EVEX_MAP5_7C,
PREFIX_EVEX_MAP5_7D_W_0, PREFIX_EVEX_MAP6_13, PREFIX_EVEX_MAP6_56,
PREFIX_EVEX_MAP6_57, PREFIX_EVEX_MAP6_D6, PREFIX_EVEX_MAP6_D7
(enum): Add EVEX_MAP5 and EVEX_MAP6.
(enum): Add EVEX_W_MAP5_5A, EVEX_W_MAP5_5B,
EVEX_W_MAP5_78_P_0, EVEX_W_MAP5_78_P_2, EVEX_W_MAP5_79_P_0,
EVEX_W_MAP5_79_P_2, EVEX_W_MAP5_7A_P_2, EVEX_W_MAP5_7A_P_3,
EVEX_W_MAP5_7B_P_2, EVEX_W_MAP5_7C_P_0, EVEX_W_MAP5_7C_P_2,
EVEX_W_MAP5_7D, EVEX_W_MAP6_13_P_0, EVEX_W_MAP6_13_P_2,
(get_valid_dis386): Properly handle new instructions.
(intel_operand_size): Handle new modes.
(OP_E_memory): Ditto.
(OP_EX): Ditto.
* i386-dis-evex.h: Updated for AVX512_FP16.
* i386-dis-evex-mod.h: Updated for AVX512_FP16.
* i386-dis-evex-prefix.h: Updated for AVX512_FP16.
* i386-dis-evex-reg.h : Updated for AVX512_FP16.
* i386-dis-evex-w.h : Updated for AVX512_FP16.
* i386-gen.c (cpu_flag_init): Add CPU_AVX512_FP16_FLAGS,
and CPU_ANY_AVX512_FP16_FLAGS. Update CPU_ANY_AVX512F_FLAGS
and CPU_ANY_AVX512BW_FLAGS.
(cpu_flags): Add CpuAVX512_FP16.
(opcode_modifiers): Add DistinctDest.
* i386-opc.h (enum): (AVX512_FP16): New.
(i386_opcode_modifier): Add reqdistinctreg.
(i386_cpu_flags): Add cpuavx512_fp16.
(EVEXMAP5): Defined as a macro.
(EVEXMAP6): Ditto.
* i386-opc.tbl: Add Intel AVX512_FP16 instructions.
* i386-init.h: Regenerated.
* i386-tbl.h: Ditto.
I've been repeatedly confused by, in particular, the .dc.a potable[]
entry being conditional. Grepping in gas/config/ reveals only very few
targets actually #define-ing it. But as of 7be1c4891a the symbol is
always defined, so #ifdef-s are pointless (and, as said, potentially
confusing).
Also adjust documentation to reflect this.
Use the pattern from other projects where we generate the html pages
in a dir named the same as the project. So now we have:
gas/doc/gas.html - single html page
gas/doc/gas/ - multiple html pages
This works for projects that have a doc/ subdir already, but gprof &
ld require a little tweaking since they generate their docs in their
respective toplevels.
This better matches other GNU projects like autoconf/automake where
the html manual is the single page form. We'll support the multi-page
form in a follow up change.
Its (documented) behavior is unhelpful in particular in 64-bit build
environments: While printing large 32-bit numbers in decimal already
isn't very meaningful to most people, this even more so goes for yet
larger 64-bit numbers. bfd_sprintf_vma() still tries to limit the number
of digits printed (without depending on a build system property), but
uniformly produces hex output.
opcodes/
* s390-mkopc.c (main): Accept arch14 as cpu string.
* s390-opc.txt: Add new arch14 instructions.
include/
* opcode/s390.h (enum s390_opcode_cpu_val): Add
S390_OPCODE_ARCH14.
gas/
* config/tc-s390.c (s390_parse_cpu): New entry for arch14.
* doc/c-s390.texi: Document arch14 march option.
* testsuite/gas/s390/s390.exp: Run the arch14 related tests.
* testsuite/gas/s390/zarch-arch14.d: New test.
* testsuite/gas/s390/zarch-arch14.s: New test.
Update 80387 floating point 's' suffix to read:
* Integer constructors are '.word', '.long' or '.int', and '.quad'
for the 16-, 32-, and 64-bit integer formats. The corresponding
instruction mnemonic suffixes are 's' (short), 'l' (long), and 'q'
(quad).
instead of 's' (single).
PR gas/27106
* doc/c-i386.texi: Update 80387 floating point 's' suffix
The SHF_GNU_RETAIN section flag is an extension to the GNU ELF OSABI.
It is defined as follows:
=========================================================
Section Attribute Flags
+-------------------------------------+
| Name | Value |
+-------------------------------------+
| SHF_GNU_RETAIN | 0x200000 (1 << 21) |
+-------------------------------------+
SHF_GNU_RETAIN
The link editor should not garbage collect the section.
=========================================================
The .section directive accepts the "R" flag, which indicates
SHF_GNU_RETAIN should be applied to the section.
There is not a direct mapping of SHF_GNU_RETAIN to the BFD
section flag SEC_KEEP. Keeping these flags distinct allows
SHF_GNU_RETAIN sections to be explicitly removed by placing them in
/DISCARD/.
bfd/ChangeLog:
* elf-bfd.h (enum elf_gnu_osabi): Add elf_gnu_osabi_retain.
(struct elf_obj_tdata): Increase has_gnu_osabi to 4 bits.
* elf.c (_bfd_elf_make_section_from_shdr): Set elf_gnu_osabi_retain
for SHF_GNU_RETAIN.
(_bfd_elf_final_write_processing): Report if SHF_GNU_RETAIN is
not supported by the OSABI.
Adjust error messages.
* elflink.c (elf_link_input_bfd): Copy enabled has_gnu_osabi bits from
input BFD to output BFD.
(bfd_elf_gc_sections): gc_mark the section if SHF_GNU_RETAIN is set.
binutils/ChangeLog:
* NEWS: Announce SHF_GNU_RETAIN support.
* readelf.c (get_elf_section_flags): Handle SHF_GNU_RETAIN.
Recognize SHF_GNU_RETAIN and SHF_GNU_MBIND only for supported OSABIs.
* testsuite/binutils-all/readelf.exp: Run new tests.
Don't run run_dump_test when there isn't an assembler available.
* testsuite/lib/binutils-common.exp (supports_gnu_osabi): Adjust
comment.
* testsuite/binutils-all/readelf-maskos-1a.d: New test.
* testsuite/binutils-all/readelf-maskos-1b.d: New test.
* testsuite/binutils-all/readelf-maskos.s: New test.
* testsuite/binutils-all/retain1.s: New test.
* testsuite/binutils-all/retain1a.d: New test.
* testsuite/binutils-all/retain1b.d: New test.
gas/ChangeLog:
* NEWS: Announce SHF_GNU_RETAIN support.
* config/obj-elf.c (obj_elf_change_section): Merge SHF_GNU_RETAIN bit
between section declarations.
(obj_elf_parse_section_letters): Handle 'R' flag.
Handle numeric flag values within the SHF_MASKOS range.
(obj_elf_section): Validate SHF_GNU_RETAIN usage.
* doc/as.texi: Document 'R' flag to .section directive.
* testsuite/gas/elf/elf.exp: Run new tests.
* testsuite/gas/elf/section10.d: Unset SHF_GNU_RETAIN bit.
* testsuite/gas/elf/section10.s: Likewise.
* testsuite/gas/elf/section22.d: New test.
* testsuite/gas/elf/section22.s: New test.
* testsuite/gas/elf/section23.s: New test.
* testsuite/gas/elf/section23a.d: New test.
* testsuite/gas/elf/section23b.d: New test.
* testsuite/gas/elf/section23b.err: New test.
* testsuite/gas/elf/section24.l: New test.
* testsuite/gas/elf/section24.s: New test.
* testsuite/gas/elf/section24a.d: New test.
* testsuite/gas/elf/section24b.d: New test.
include/ChangeLog:
* elf/common.h (SHF_GNU_RETAIN): Define.
ld/ChangeLog:
* NEWS: Announce support for SHF_GNU_RETAIN.
* ld.texi (garbage collection): Document SHF_GNU_RETAIN.
(Output Section Discarding): Likewise.
* testsuite/ld-elf/elf.exp: Run new tests.
* testsuite/ld-elf/retain1.s: New test.
* testsuite/ld-elf/retain1a.d: New test.
* testsuite/ld-elf/retain1b.d: New test.
* testsuite/ld-elf/retain2.d: New test.
* testsuite/ld-elf/retain2.ld: New test.
* testsuite/ld-elf/retain2.map: New test.
* testsuite/ld-elf/retain3.d: New test.
* testsuite/ld-elf/retain3.s: New test.
* testsuite/ld-elf/retain4.d: New test.
* testsuite/ld-elf/retain4.s: New test.
* testsuite/ld-elf/retain5.d: New test.
* testsuite/ld-elf/retain5.map: New test.
* testsuite/ld-elf/retain5lib.s: New test.
* testsuite/ld-elf/retain5main.s: New test.
* testsuite/ld-elf/retain6a.d: New test.
* testsuite/ld-elf/retain6b.d: New test.
* testsuite/ld-elf/retain6lib.s: New test.
* testsuite/ld-elf/retain6main.s: New test.
* read.c (stringer): Treat space separated, quote enclosed strings
as a single string.
* doc/as.texi (asciz): Mention this behaviour in the description
of the asciz directive.
* testsuite/gas/all/asciz.s: New test.
* testsuite/gas/all/asciz.d: New test driver.
* testsuite/gas/all/gas.exp: Run the new test.
Extract FLAGM (Condition flag manipulation) feature from Armv8.4-A.
Please note that FLAGM stays a Armv8.4-A feature but now can be
assigned to other architectures or CPUs.
New -march option +flagm is added to enable independently this
feature.
This patch adds support for AArch64 -march=armv8.7-a command line option
in GAS.
Please note that this change ONLY extends -march= command line interface
with a new "armv8.7-a" option. Architectural changes like new instructions
will be added in following patches.
gas/ChangeLog:
2020-10-16 Przemyslaw Wirkus <przemyslaw.wirkus@arm.com>
* NEWS: Docs update.
* config/tc-aarch64.c (armv8.7-a): New arch.
* doc/c-aarch64.texi (-march=armv8.7-a): Update docs.
include/ChangeLog:
2020-10-16 Przemyslaw Wirkus <przemyslaw.wirkus@arm.com>
* opcode/aarch64.h (AARCH64_FEATURE_V8_7): New feature bitmask.
(AARCH64_ARCH_V8_7): New arch feature set.
opcodes/ChangeLog:
2020-10-16 Przemyslaw Wirkus <przemyslaw.wirkus@arm.com>
* aarch64-tbl.h (ARMV8_7): New macro.
PR 26253
gas * config/obj-elf.c (obj_elf_section): Accept a numeric value for
the "o" section flag. Interpret it as a section index. Allow an
index of zero.
* doc/as.texi: Document the new behaviour.
* NEWS: Mention the new feature. Tidy entries.
* testsuite/gas/elf/sh-link-zero.s: New test.
* testsuite/gas/elf/sh-link-zero.d: New test driver.
* testsuite/gas/elf/elf.exp: Run the new test.
* testsuite/gas/elf/section21.l: Updated expected assembler
output.
bfd * elf.c (_bfd_elf_setup_sections): Do not complain about an
sh_link value of zero when the SLF_LINK_ORDER flag is set.
(assign_section_numbers): Likewise.
* config/obj-elf (elf_pseudo_table): Add attach_to_group.
(obj_elf_attach_to_group): New function.
* doc/as.texi: Document the new directive.
* NEWS: Mention the new feature.
* testsuite/gas/elf/attach-1.s: New test.
* testsuite/gas/elf/attach-1.d: New test driver.
* testsuite/gas/elf/attach-2.s: New test.
* testsuite/gas/elf/attach-2.d: New test driver.
* testsuite/gas/elf/attach-err.s: New test.
* testsuite/gas/elf/attach-err.d: New test driver.
* testsuite/gas/elf/attach-err.err: New test error output.
* testsuite/gas/elf/elf.exp: Run the new tests.
This patch adds support for Arm's Neoverse N2 CPU to AArch64 binutils.
gas/ChangeLog:
* config/tc-aarch64.c (aarch64_cpus): Add neoverse-n2.
* doc/c-aarch64.texi: Document support for Neoverse N2.