It's not clear to me why they had been introduced - the respective
comments in opcodes/i386-gen.c are certainly wrong: ymm<N> registers
are very well supported (and necessary) with just AVX512F.
Neither 287 wrt 8087 nor 387 wrt 287 are proper supersets - in each case
some insns get removed from the ISA (they become NOPs, but code intended
for newer co-processors should not use them).
Furthermore with .no87, ST should not be recognized as a register name.
The VEX3SOURCES code was (originally) written with just space
indentation, which is not in line with general coding style as well as
the style later in the function.
There are no templates with VexImmExt and ImmExt set at the same time.
There are also no VEX3SOURCES templates with CpuFMA. I assume both are
left-overs from the implementation of an early specification which was
later revised.
Define and handle TLS relocations for FDPIC in BFD and gas.
In gas, the new relocations are rejected if the --fdpic option was not
specified.
We also define the __tdata_start symbol to mark the start of the
.tdata section. This allows FDPIC static binaries to find the start of
.tdata section, since phdr->p_vaddr of TLS segment is not a valid
value for FDPIC.
2018-04-25 Christophe Lyon <christophe.lyon@st.com>
Mickaël Guêné <mickael.guene@st.com>
bfd/:
* bfd-in2.h (BFD_RELOC_ARM_TLS_GD32_FDPIC)
(BFD_RELOC_ARM_TLS_LDM32_FDPIC, BFD_RELOC_ARM_TLS_IE32_FDPIC): New
relocations.
* elf32-arm.c (elf32_arm_howto_table_2): Add R_ARM_TLS_GD32_FDPIC,
R_ARM_TLS_LDM32_FDPIC, R_ARM_TLS_IE32_FDPIC relocations.
(elf32_arm_reloc_map): Add R_ARM_TLS_GD32_FDPIC,
R_ARM_TLS_LDM32_FDPIC, R_ARM_TLS_IE32_FDPIC.
(struct elf32_arm_link_hash_table): Update comment.
(elf32_arm_final_link_relocate): Handle TLS FDPIC relocations.
(IS_ARM_TLS_RELOC): Likewise.
(elf32_arm_check_relocs): Likewise.
(allocate_dynrelocs_for_symbol): Likewise.
(elf32_arm_size_dynamic_sections): Update comment.
* reloc.c: Add BFD_RELOC_ARM_TLS_GD32_FDPIC,
BFD_RELOC_ARM_TLS_LDM32_FDPIC, BFD_RELOC_ARM_TLS_IE32_FDPIC.
gas/
* config/tc-arm.c (reloc_names): Add TLSGD_FDPIC, TLSLDM_FDPIC,
GOTTPOFF_FDIC relocations.
(md_apply_fix): Handle the new TLS FDPIC relocations.
(tc_gen_reloc): Likewise.
(arm_fix_adjustable): Likewise.
include/
* elf/arm.h: Add R_ARM_TLS_GD32_FDPIC, R_ARM_TLS_LDM32_FDPIC,
R_ARM_TLS_IE32_FDPIC.
ld/
* scripttempl/elf.sc: Define __tdata_start for .tdata section.
ELF files targetting ARM FDPIC use the ELFOSABI_ARM_FDPIC flag.
Set it appropriately in file generators (eg. gas), and handle it in
readers (eg. readelf).
2018-04-25 Christophe Lyon <christophe.lyon@st.com>
Mickaël Guêné <mickael.guene@st.com>
bfd/
* elf32-arm.c (elf32_arm_print_private_bfd_data): Support
EF_ARM_PIC and ELFOSABI_ARM_FDPIC.
(elf32_arm_post_process_headers): Support ELFOSABI_ARM_FDPIC.
(ELF_OSABI): Define to ELFOSABI_ARM_FDPIC.
binutils/
* readelf.c (decode_ARM_machine_flags): Support EF_ARM_PIC.
(get_osabi_name): Support ELFOSABI_ARM_FDPIC.
gas/
* config/tc-arm.c (arm_fdpic): New.
(elf32_arm_target_format): Support FDPIC.
(OPTION_FDPIC): New.
(md_longopts): Support FDPIC.
(md_parse_option): Likewise.
(md_show_usage): Likewise.
include/
* elf/arm.h (EF_ARM_FDPIC): New.
Rn is supposed to have a 5 bit range but instead was given 4 bits
causing these instructions to disassemble as unknown instructions.
opcodes/
* aarch64-tbl.h (sqrdmlah, sqrdmlsh): Fix masks.
gas/
* testsuite/gas/aarch64/rdma.s: Test for larger register numbers.
* testsuite/gas/aarch64/rdma.d: Update results.
* testsuite/gas/aarch64/rdma-directive.d: Likewise.
All of these warnings were false positives. -Wstringop-truncation is
particularly annoying when it warns about strncpy used quite correctly.
bfd/
* elf-linux-core.h (swap_linux_prpsinfo32_ugid32_out): Disable
gcc-8 string truncation warning.
(swap_linux_prpsinfo32_ugid16_out): Likewise.
(swap_linux_prpsinfo64_ugid32_out): Likewise.
(swap_linux_prpsinfo64_ugid16_out): Likewise.
* elf.c (elfcore_write_prpsinfo): Likewise.
gas/
* stabs.c (generate_asm_file): Use memcpy rather than strncpy.
Remove call to strlen inside loop.
* config/tc-cr16.c (getreg_image): Warning fix.
* config/tc-crx.c (getreg_image): Warning fix.
Andrew Sadek <andrew.sadek.se@gmail.com>
A new implemented feature in GCC Microblaze that allows Position
Independent Code to run using Data Text Relative addressing instead
of using Global Offset Table.
Its aim was to make 'PIC' more efficient and flexible as elf size
excess performance overhead were noticed when using GOT due to the
indirect addressing.
include/ChangeLog:
* bfdlink.h (Add flag): Add new flag @ 'bfd_link_info' struct.
* elf/microblaze.h (Add 3 new relocations):
R_MICROBLAZE_TEXTPCREL_64, R_MICROBLAZE_TEXTREL_64
and R_MICROBLAZE_TEXTREL_32_LO for relax function.
bfd/ChangeLog:
* bfd/reloc.c (2 new BFD relocations):
BFD_RELOC_MICROBLAZE_64_TEXTPCREL &
BFD_RELOC_MICROBLAZE_64_TEXTPCREL
* bfd/bfd-in2.h: Regenerate
* bfd/libbfd.h: Regenerate
* bfd/elf32-microblaze.c (Handle new relocs): define 'HOWTO' of 3
new relocs and handle them in both relocate and relax functions.
(microblaze_elf_reloc_type_lookup): add mapping between for new
bfd relocs.
(microblaze_elf_relocate_section): Handle new relocs in case of
elf relocation.
(microblaze_elf_relax_section): Handle new relocs for elf relaxation.
gas/ChangeLog:
* gas/config/tc-microblaze.c (Handle new relocs directives in
assembler): Handle new relocs from compiler output.
(imm_types): add new imm types for data text relative addressing
TEXT_OFFSET, TEXT_PC_OFFSET
(md_convert_frag): conversion for BFD_RELOC_MICROBLAZE_64_TEXTPCREL,
BFD_RELOC_MICROBLAZE_64_TEXTPCREL
(md_apply_fix): apply fix for BFD_RELOC_MICROBLAZE_64_TEXTPCREL,
BFD_RELOC_MICROBLAZE_64_TEXTPCREL
(md_estimate_size_before_relax): estimate size for
BFD_RELOC_MICROBLAZE_64_TEXTPCREL,
BFD_RELOC_MICROBLAZE_64_TEXTPCREL
(tc_gen_reloc): generate relocations for
BFD_RELOC_MICROBLAZE_64_TEXTPCREL,
BFD_RELOC_MICROBLAZE_64_TEXTPCREL
ld/ChangeLog:
* ld/lexsup.c (Add 2 ld options):
(ld_options): add disable-multiple-abs-defs @ 'ld_options' array
(parse_args): parse new option and pass flag to 'link_info' struct.
* ld/ldlex.h (Add enum): add new enum @ 'option_values' enum.
* ld/ld.texinfo (Add new option): Add description for
'disable-multiple-abs-defs'
* ld/main.c: Initialize flags with false @ 'main'. Handle
disable-multiple-abs-defs @ 'mutiple_definition'.
PR 23054
* cond.c (s_ifsef): Replace use of obstack_copy with obstack_alloc
followed by memcpy.
(s_if, s_ifb, s_ifc, s_ifeqs): Likewise.
* obj-elf.c (elf_adjust_symtab): Check for local symbols before
attempting to dereference the sy_next field of a symbol.
* stabs.c (get_stab_string_offset): Fail if there is no string
following the stab directive.
Since only the first 32 bits of input operand are used for tpause and
umwait, the REX.W bit is skipped. Both 32-bit registers and 64-bit
registers are allowed.
gas/
* testsuite/gas/i386/x86-64-waitpkg.s: Add 32-bit registers
tests for tpause and umwait.
* testsuite/gas/i386/x86-64-waitpkg-intel.d: Updated.
* testsuite/gas/i386/x86-64-waitpkg.d: Likewise.
opcodes/
* i386-dis.c (prefix_table): Replace Em with Edq on tpause and
umwait.
* i386-opc.tbl: Allow 32-bit registers for tpause and umwait in
64-bit mode.
* i386-tbl.h: Regenerated.
config/plugins.m4 has
if test "$plugins" = "yes"; then
AC_SEARCH_LIBS([dlopen], [dl])
fi
Plugin uses dlsym, but libasan.so only intercepts dlopen, not dlsym:
[hjl@gnu-tools-1 binutils-text]$ nm -D /lib64/libasan.so.4| grep " dl"
0000000000038580 W dlclose
U dl_iterate_phdr
000000000004dc50 W dlopen
U dlsym
U dlvsym
[hjl@gnu-tools-1 binutils-text]$
Testing dlopen for libdl leads to false negative when -fsanitize=address
is used. It results in link failure:
../bfd/.libs/libbfd.a(plugin.o): undefined reference to symbol 'dlsym@@GLIBC_2.16'
dlsym should be used to check if libdl is needed for plugin.
bfd/
PR gas/22318
* configure: Regenerated.
binutils/
PR gas/22318
* configure: Regenerated.
gas/
PR gas/22318
* configure: Regenerated.
gprof/
PR gas/22318
* configure: Regenerated.
ld/
PR gas/22318
* configure: Regenerated.
"vex" has many fields to control how to decode an instruction. Clear
all fields in "vex" before decoding an instruction to avoid using values
left from the previous instruction.
gas/
PR binutils/23025
* testsuite/gas/i386/prefix.s: Add tests for vcvtpd2dq with
VEX and EVEX prefixes.
* testsuite/gas/i386/prefix.d: Updated.
opcodes/
PR binutils/23025
* i386-dis.c (get_valid_dis386): Don't set vex.prefix nor vex.w
to 0.
(print_insn): Clear vex instead of vex.evex.
This patch adds the following relocation support into binutils gas.
BFD_RELOC_AARCH64_TLSLE_LDST16_TPREL_LO12,
BFD_RELOC_AARCH64_TLSLE_LDST16_TPREL_LO12_NC,
BFD_RELOC_AARCH64_TLSLE_LDST32_TPREL_LO12,
BFD_RELOC_AARCH64_TLSLE_LDST32_TPREL_LO12_NC,
BFD_RELOC_AARCH64_TLSLE_LDST64_TPREL_LO12,
BFD_RELOC_AARCH64_TLSLE_LDST64_TPREL_LO12_NC,
BFD_RELOC_AARCH64_TLSLE_LDST8_TPREL_LO12,
BFD_RELOC_AARCH64_TLSLE_LDST8_TPREL_LO12_NC.
Those relocations includes both ip64 and ilp32 variant.
It again can be inferred from other information.
The vpopcntd templates all need to have Dword added to their memory
operands; the lack thereof was actually a bug preventing certain Intel
syntax code to assemble, so test cases get extended.
In the course of folding their patterns (possible now that the pointless
and partly even bogus VecESize are no longer in the way) I've noticed
that vcvt*2usi, other than their vcvt*2si counterparts, don't allow for
any suffixes. As that is supposedly intentional, make the disassembler
consistently omit suffixes for all to-scalar-int conversion insns.
Since they're shorter to encode, the 0xa0...0xa3 encodings are preferred
for moves between accumulator and absolute address outside of 64-bit
mode. With HLE release semantics this encoding is unsupported though,
with the assembler raising an error. The operation is valid though, we
merely need to pick the longer encoding in that case.
The wrong placement of the Load attribute in the templates prevented
this from working. The disassembler also didn't handle this consistently
with other similar dual-encoding insns.
While many templates allowing multiple suitably matching XMM/YMM/ZMM
operand sizes can be folded, a few need to be split in order to not
wrongly accept "xmmword ptr" operands when only XMM registers are
permitted (and memory operands are more narrow). Add a test case
validating this.
Since the addition of pseudo prefixes changed how the scrubber treats
'{', we need to explicitly strip whitespace in check_VecOperations ().
* config/tc-i386.c (check_VecOperations): Strip whitespace.
* testsuite/gas/i386/optimize-1.s: Add whitespaces before
{%k7} and {z},
* testsuite/gas/i386/x86-64-optimize-2.s: Likewise.
We can optimize AVX512 instructions with EVEX128 only if AVX512VL is
enabled:
1. Instruction is an AVX512VL instruction. Or
2. AVX512VL is enabled explicitly by -march=+avx512vl/".arch .avx512vl".
We should optimize EVEX instructions with EVEX128 encoding when pseudo
{evex} prefix is used.
* config/tc-i386.c (set_cpu_arch): Set cpu_arch_isa_flags.
(md_parse_option): Likewise.
(optimize_encoding): Check i.tm.cpu_flags and cpu_arch_isa_flags
for cpuavx512vl instead of cpu_arch_flags. Optimize EVEX with
EVEX128 when EVEX encoding is required.
* testsuite/gas/i386/i386.exp: Run optimize-4, optimize-5,
x86-64-optimize-5 and x86-64-optimize-6.
* testsuite/gas/i386/optimize-1.d: Updated.
* testsuite/gas/i386/x86-64-optimize-2.d: Likewise.
* testsuite/gas/i386/optimize-4.d: New file.
* testsuite/gas/i386/optimize-4.s: Likewise.
* testsuite/gas/i386/optimize-5.d: Likewise.
* testsuite/gas/i386/optimize-5.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-5.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-5.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-6.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-6.s: Likewise.
"clr reg" is an alias of "xor reg, reg". We can encode "clr reg64" as
"xor reg32, reg32".
gas/
* config/tc-i386.c (optimize_encoding): Also encode "clr reg64"
as "xor reg32, reg32".
* testsuite/gas/i386/x86-64-optimize-1.s: Add "clr reg64" tests.
* testsuite/gas/i386/x86-64-optimize-1.d: Updated.
opcodes/
* i386-opc.tbl: Add Optimize to clr.
* i386-tbl.h: Regenerated.
The differences between some of the register and memory forms of the
same insn often don't really require the templates to be separate. For
example, Disp8MemShift is simply irrelevant to register forms. Fold
these as far as possible, and also fold register-only forms. Further
folding is possible, but needs other prereq work done first.
A note regarding EVEXDYN: This is intended to be used only when no other
properties of the template would make is_evex_encoding() return true. In
all "normal" cases I think it is preferable to omit this indicator, to
keep the table half way readable.
Their memory forms were bogusly using VexLWP instead of VexNDD. Adjust
VexNDD handling to cope with these, allowing their register and memory
forms to be folded.
They aren't really useful (anymore?): The conflicting operand size check
isn't applicable to any insn validly using respective memory operand
sizes (and if they're used wrongly, another error would result), and the
logic in process_suffix() can be easily changed to work without them.
While re-structuring conditionals in process_suffix() also drop the
CMPXCHG8B special case in favor of a NoRex64 attribute in the opcode
table.
Some BMI/BMI2 insns allow their middle operands to be a memory one. In
such a case, matching register types between operands 0 and 1 as well as
1 and 2 won't help - operands 0 and 2 also need to be checked.
Make more obvious what the success and failure paths are, and in
particular that what used to be at the "skip" label can't be reached
by what used to be straight line code.
Just like for the AVX/AES and AVX/PCLMUL combinations, AVX/GFN,
AVX512F/GFNI, AVX512F/VAES, and AVX512F/PCLMUL need special handling to
deal with the pair of required checks specified in the templates.
fsub/fsubr/fsubp/fsubrp as well as fdiv/fdivr/fdivp/fdivrp disassembly
should match (a) the Intel SDM and (b) respective input fed to gas (both
of course with the exception of when we intentionally convert bogus
insns, accompanied by a warning).
Drop "second": For one there's no other source register (the other
source operand is in memory), and in Intel syntax such numbering would
also be wrong.
Take the opportunity and also
- properly place declarations ahead of statements
- use %u format for unsigned int arguments
- fix indentation
This requires a change to ModR/M handling: Recording of displacement
types must not discard operand size information. Change the respective
code to alter only .disp<N>.
Oops, not tested well enough. -mpower9 sets all the PPC_OPCODE_POWERn
for n <= 9.
* config/tc-ppc.c (ppc_handle_align): Correct last patch. Really
don't emit a group terminating nop for power9. Simplify cpu
tests.
Power9 doesn't have a group terminating nop, so we may as well emit a
normal nop for power9. Not that it matters a great deal, I believe
ori 2,2,0 will be treated exactly as ori 0,0,0 by the hardware.
* config/tc-ppc.c (ppc_handle_align): Don't emit a group
terminating nop for power9.
xcoff (32-bit) objdump accepted but ignored -M options unless
-mpowerpc was also given. This patch fixes that, leaving the default
as -Mpwr for xcoff. I've also enabled more tests for xcoff targets.
binutils/
* configure.ac: Add objdump_private_desc_xcoff for rs6000.
* configure: Regenerate.
gas/
* testsuite/gas/ppc/aix.exp: Run for rs6000 too.
* testsuite/gas/ppc/ppc.exp: Run more tests for non-ELF targets.
* testsuite/gas/ppc/machine.d: Don't run for PE targets.
opcodes/
* disassemble.c (disassembler): Use bfd_arch_powerpc entry for
bfd_arch_rs6000.
* disassemble.h (print_insn_rs6000): Delete.
* ppc-dis.c (powerpc_init_dialect): Handle rs6000.
(disassemble_init_powerpc): Call powerpc_init_dialect for rs6000.
(print_insn_rs6000): Delete.
Commit 4d354d8b89 introduced a NULL
pointer dereference by replacing a pointer assignment by a pointer
dereference assignment without adding a NULL pointer check. This patch
fixes it.
2018-03-02 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (md_begin): Add NULL pointer check before
dereferencing march_ext_opt.
I've always found the code in ARM backend of gas to control what
CPU/architecture and FPU are selected by the user and to support
autodetection of features complex and confusing. Chief among the
issues I have with that code is the lack of comments to explain
the meaning of the various variables. This patch addresses that
and much more:
- add comments to explain meaning of all arm_feature_set variables
- keep track of currently selected CPU, extensions and FPU in a separate
set of new variables
- make naming of variable more consistent
- remove dead code
- simplify handling of extensions
The overall approach is as follows:
* restrict m*_opt variable to hold the feature bits of the
corresponding mcpu/march/mfpu command-line options
* record selected CPU, extensions and FPU in new selected_* during
md_begin
* whenever a .cpu/.arch/.arch_extension/.fpu directive is met, update
the corresponding selected_* variables (eg. selected_arch, then
selected_cpu for a .cpu or .arch directive) and then finally
cpu_variant from them
* pass extension feature set pointer by value to arm_parse_extension
since it's only ever called from arm_parse_cpu and arm_parse_arch
which allocate the extension feature set themselves
* likewise, remove allocation from s_arm_arch_extension since the use
of arm_feature_set structure for selected_ext rather than a pointer
alleviate the need for it
* in autodetection mode, only set all CPU fits in cpu_variant but leave
selected_cpu* variables unset
* in md_begin, remove dead "else if" to set a default FPU when no FPU
was selected. Setting a default FPU based on CPU as did the code
before it turn dead should be based on the default FPU field of the
CPU and architecture table as will be done in a separate patch. Logic
is wrong anyway since it sets VFP2 as default FPU for Armv6-M and
Armv7-M
Hopefully that should be enough to understand the change but if not feel
free to ask questions about the patch. While I believe the new code is
easier to understand, it remains complex and the old one was even more
complex so the change is difficult to understand.
2018-03-01 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (cpu_variant, arm_arch_used, thumb_arch_used,
legacy_cpu, legacy_fpu, mcpu_cpu_opt, dyn_mcpu_ext_opt,
mcpu_fpu_opt, march_cpu_opt, dyn_march_ext_opt, march_fpu_opt,
mfpu_opt, object_arch, selected_cpu): Comment meaning of variables.
(dyn_mcpu_ext_opt): Also rename into ...
(mcpu_ext_opt): This.
(dyn_march_ext_opt): Also rename into ...
(march_ext_opt): This.
(object_arch): Also rename into ...
(selected_object_arch): This and make it a plain arm_feature_set
structure.
(selected_arch, selected_ext, selected_fpu): New static variables.
(mark_feature_used): Fix comments, feature is marked as used iff it is
currently allowed.
(do_bx): Adapt to change in name and type of object_arch.
(md_begin): Set selected_arch rather than mcpu_cpu_opt, selected_ext
rather than dyn_mcpu_ext_opt and selected_fpu rather than mfpu_opt.
Remove dead code to set default FPU if architecture version is greater
than 5. Set all CPU bits of cpu_variant directly in autodection
leaving mcpu_cpu_opt, selected_arch and selected_cpu unset.
(arm_parse_extension): Take extension feature set pointer parameter by
value rather than by pointer. Remove allocation code. Adapt code
accordingly.
(arm_parse_cpu): Adapt to variable renaming and changes in
arm_parse_extension () signature.
(arm_parse_arch): Likewise.
(aeabi_set_public_attributes): Also set selected_arch and selected_ext
in addition to selected_cpu. Set flags_arch and flags_ext from them
instead of selected_cpu. Adapt to variables renaming and type change.
(arm_md_post_relax): Adapt to variable renaming.
(s_arm_cpu): Set selcted_cpu_cpu and selected_ext instead of
mcpu_cpu_opt and dyn_mcpu_ext_opt. Set selected_cpu from them and
cpu_variant from selected_cpu and selected_fpu.
(s_arm_arch): Likewise.
(s_arm_object_arch): Adapt to variable renaming.
(s_arm_arch_extension): Use ARM_CPU_IS_ANY instead of checking feature
set against arm_any. Check selected_arch rather than *mcpu_cpu_opt.
Set selected_ext rather than *dyn_mcpu_ext_opt and remove allocation
code.
(s_arm_fpu): Set selected_fpu instead of mfpu_opt. Set all CPU feature
bits if in autodetection mode.
On x86, some instructions have alternate shorter encodings:
1. When the upper 32 bits of destination registers of
andq $imm31, %r64
testq $imm31, %r64
xorq %r64, %r64
subq %r64, %r64
known to be zero, we can encode them without the REX_W bit:
andl $imm31, %r32
testl $imm31, %r32
xorl %r32, %r32
subl %r32, %r32
This optimization is enabled with -O, -O2 and -Os.
2. Since 0xb0 mov with 32-bit destination registers zero-extends 32-bit
immediate to 64-bit destination register, we can use it to encode 64-bit
mov with 32-bit immediates. This optimization is enabled with -O, -O2
and -Os.
3. Since the upper bits of destination registers of VEX128 and EVEX128
instructions are extended to zero, if all bits of destination registers
of AVX256 or AVX512 instructions are zero, we can use VEX128 or EVEX128
encoding to encode AVX256 or AVX512 instructions. When 2 source
registers are identical, AVX256 and AVX512 andn and xor instructions:
VOP %reg, %reg, %dest_reg
can be encoded with
VOP128 %reg, %reg, %dest_reg
This optimization is enabled with -O2 and -Os.
4. 16-bit, 32-bit and 64-bit register tests with immediate may be
encoded as 8-bit register test with immediate. This optimization is
enabled with -Os.
This patch does:
1. Add {nooptimize} pseudo prefix to disable instruction size
optimization.
2. Add optimize to i386_opcode_modifier to tell assembler that encoding
of an instruction may be optimized.
gas/
PR gas/22871
* NEWS: Mention -O[2|s].
* config/tc-i386.c (_i386_insn): Add no_optimize.
(optimize): New.
(optimize_for_space): Likewise.
(fits_in_imm7): New function.
(fits_in_imm31): Likewise.
(optimize_encoding): Likewise.
(md_assemble): Call optimize_encoding to optimize encoding.
(parse_insn): Handle {nooptimize}.
(md_shortopts): Append "O::".
(md_parse_option): Handle -On.
* doc/c-i386.texi: Document -O0, -O, -O1, -O2 and -Os as well
as {nooptimize}.
* testsuite/gas/cfi/cfi-x86_64.d: Pass -O0 to assembler.
* testsuite/gas/i386/ilp32/cfi/cfi-x86_64.d: Likewise.
* testsuite/gas/i386/i386.exp: Run optimize-1, optimize-2,
optimize-3, x86-64-optimize-1, x86-64-optimize-2,
x86-64-optimize-3 and x86-64-optimize-4.
* testsuite/gas/i386/optimize-1.d: New file.
* testsuite/gas/i386/optimize-1.s: Likewise.
* testsuite/gas/i386/optimize-2.d: Likewise.
* testsuite/gas/i386/optimize-2.s: Likewise.
* testsuite/gas/i386/optimize-3.d: Likewise.
* testsuite/gas/i386/optimize-3.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-1.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-1.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-2.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-2.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-3.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-3.s: Likewise.
* testsuite/gas/i386/x86-64-optimize-4.d: Likewise.
* testsuite/gas/i386/x86-64-optimize-4.s: Likewise.
opcodes/
PR gas/22871
* i386-gen.c (opcode_modifiers): Add Optimize.
* i386-opc.h (Optimize): New enum.
(i386_opcode_modifier): Add optimize.
* i386-opc.tbl: Add "Optimize" to "mov $imm, reg",
"sub reg, reg/mem", "test $imm, acc", "test $imm, reg/mem",
"and $imm, acc", "and $imm, reg/mem", "xor reg, reg/mem",
"movq $imm, reg" and AVX256 and AVX512 versions of vandnps,
vandnpd, vpandn, vpandnd, vpandnq, vxorps, vxorpd, vpxor,
vpxord and vpxorq.
* i386-tbl.h: Regenerated.
This patch makes GAS emit a warning when trying to assemble the Armv8.2
FP16 instructions VMOVX and VINS with condition codes. The Armv8-A
Reference Manual specifies these instructions without conditional codes
and says that if they are found in an IT block that they are CONSTRAINED
UNPREDICABLE.
gas/ChangeLog:
2018-02-22 Andre Vieira <andre.simoesdiasvieira@arm.com>
* config/tc-arm.c (do_neon_movhf): If conditional error out when in arm
mode and emit warning in thumb mode.
* testsuite/gas/arm/armv8-2-fp16-scalar-bad.s: Add new tests.
* testsuite/gas/arm/armv8-2-fp16-scalar-bad.l: Idem.
Add {rex} pseudo prefix to generate a REX byte for integer and legacy
vector instructions if possible. Note that this differs from the rex
prefix which generates REX prefix unconditionally.
gas/
* config/tc-i386.c (_i386_insn): Add rex_encoding.
(md_assemble): When i.rex_encoding is true, generate a REX byte
if possible.
(parse_insn): Set i.rex_encoding for {rex}.
* doc/c-i386.texi: Document {rex}.
* testsuite/gas/i386/x86-64-pseudos.s: Add {rex} tests.
* testsuite/gas/i386/x86-64-pseudos.d: Updated.
opcodes/
* i386-opc.tbl: Add {rex},
* i386-tbl.h: Regenerated.
Add a pair of MIPS16 branch tests to verify correct R_MIPS16_PC16_S1
relocation generation for cross-section references in a single source.
This complements commit c9775dde32 ("MIPS16: Add R_MIPS16_PC16_S1
branch relocation support").
gas/
* testsuite/gas/mips/mips16-branch-reloc-4.d: New test.
* testsuite/gas/mips/mips16-branch-reloc-5.d: New test.
* testsuite/gas/mips/mips16-branch-reloc-4.s: New test source.
* testsuite/gas/mips/mips16-branch-reloc-5.s: New test source.
* testsuite/gas/mips/mips.exp: Run the new tests.
Literal movement code may grow auto litpool so big that it won't be
possible to jump around it. Limit the size of auto litpools by 1/2 of
the jump range.
gas/
2018-02-20 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (struct litpool_frag): Add new field
literal_count.
(MAX_AUTO_POOL_LITERALS, MAX_EXPLICIT_POOL_LITERALS)
(MAX_POOL_LITERALS): New macro definitions.
(auto_litpool_limit): Initialize to 0.
(md_parse_option): Set auto_litpool_limit in the presence of
--auto-litpools option.
(xtensa_maybe_create_literal_pool_frag): Zero-initialize
literal_count field.
(xg_find_litpool): New function. Make sure that found literal
pool size is within the limit.
(xtensa_move_literals): Extract literal pool search code into
the new function.
* testsuite/gas/xtensa/all.exp: Add auto-litpools-2 test.
* testsuite/gas/xtensa/auto-litpools-2.d: New file.
* testsuite/gas/xtensa/auto-litpools-2.s: New file.
* testsuite/gas/xtensa/auto-litpools.d: Fix up changed
addresses.
* testsuite/gas/xtensa/auto-litpools.s: Change literal value so
that objdump doesn't get out of sync.
Documentation for .arch_extension says it accepts the same architectural
extensions as those accepted by -mcpu. Given the name and the fact that
-march for obvious reason also accept the same extensions, I believe
it's worth mentioning that it accepts the same extensions as both
-march and -mcpu. This commit addresses that.
2018-02-20 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* doc/c-arm.texi (.arch_extension): Mention extensions it accepts are
also the same as -march.
Implement the '.nop SIZE[, CONTROL]' assembler directive, which emits
SIZE bytes filled with no-op instructions. SIZE is absolute expression.
The optional CONTROL byte controls how no-op instructions should be
generated. If the comma and @var{control} are omitted, CONTROL is
assumed to be zero.
For Intel 80386 and AMD x86-64 targets, CONTROL byte specifies the size
limit of a single no-op instruction. The valid values of CONTROL byte
are between 0 and 8 for 16-bit mode, between 0 and 10 for 32-bit mode,
between 0 and 11 for 64-bit mode. When 0 is used, the no-op size limit
is set to the maximum supported size.
2 new relax states, rs_space_nop and rs_fill_nop, are added to enum
_relax_state, which are similar to rs_space and rs_fill, respectively,
but they fill with no-op instructions, instead of a single byte. A
target backend must override the default md_generate_nops to generate
proper no-op instructions. Otherwise, an error of unimplemented .nop
directive will be issued whenever .nop directive is used.
* NEWS: Mention .nop directive.
* as.h (_relax_state): Add rs_space_nop and rs_fill_nop.
* read.c (potable): Add .nop.
(s_nop): New function.
* read.h (s_nop): New prototype.
* write.c (cvt_frag_to_fill): Handle rs_space_nop and
rs_fill_nop.
(md_generate_nops): New function.
(relax_segment): Likewise.
(write_contents): Use md_generate_nops for rs_fill_nop.
* config/tc-i386.c (alt64_11): New.
(alt64_patt): Likewise.
(md_convert_frag): Handle rs_space_nop.
(i386_output_nops): New function.
(i386_generate_nops): Likewise.
(i386_align_code): Call i386_output_nops.
* config/tc-i386.h (i386_generate_nops): New.
(md_generate_nops): Likewise.
* doc/as.texinfo: Document .nop directive.
* testsuite/gas/i386/i386.exp: Run .nop directive tests.
* testsuite/gas/i386/nop-1.d: New file.
* testsuite/gas/i386/nop-1.s: Likewise.
* testsuite/gas/i386/nop-2.d: Likewise.
* testsuite/gas/i386/nop-2.s: Likewise.
* testsuite/gas/i386/nop-3.d: Likewise.
* testsuite/gas/i386/nop-3.s: Likewise.
* testsuite/gas/i386/nop-4.d: Likewise.
* testsuite/gas/i386/nop-4.s: Likewise.
* testsuite/gas/i386/nop-5.d: Likewise.
* testsuite/gas/i386/nop-5.s: Likewise.
* testsuite/gas/i386/nop-6.d: Likewise.
* testsuite/gas/i386/nop-6.s: Likewise.
* testsuite/gas/i386/nop-bad-1.l: Likewise.
* testsuite/gas/i386/nop-bad-1.s: Likewise.
* testsuite/gas/i386/x86-64-nop-1.d: Likewise.
* testsuite/gas/i386/x86-64-nop-2.d: Likewise.
* testsuite/gas/i386/x86-64-nop-3.d: Likewise.
* testsuite/gas/i386/x86-64-nop-4.d: Likewise.
* testsuite/gas/i386/x86-64-nop-5.d: Likewise.
* testsuite/gas/i386/x86-64-nop-6.d: Likewise.
The build attribute number for Armv8.4-A is currently incorrectly set to that of Armv8-M.
This patch fixes that by setting it as part of the Armv8-A family and adds a test for it.
gas/
2018-02-15 Tamar Christina <tamar.christina@arm.com>
* config/tc-arm.c (cpu_arch_ver): Renumber ARM_ARCH_V8_4A.
* testsuite/gas/arm/attr-march-armv8_4-a.d: New.
For jumps requiring multiple trampolines trampoline placement code may
place multiple sequential trampolines into the same frag. Don't do that.
gas/
2018-02-13 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (xg_find_best_trampoline): Skip trampoline
frag that contains source address.
PR 22773
* config/tc-arm.c (md_apply_fix): Test Rn field of Thumb ORR
instruction before assuming that it is a MOV instruction.
* testsuite/gas/arm/pr22773.s: New test.
* testsuite/gas/arm/pr22773.d: New test driver.
* testsuite/gas/arm/pr22773.l: New expected output.
Since there is no need to prepare for PLT branch on x86-64, generate
R_X86_64_PLT32, instead of R_X86_64_PC32, if possible, which can be
used as a marker for 32-bit PC-relative branches.
To compile Linux kernel, this patch:
From: "H.J. Lu" <hjl.tools@gmail.com>
Subject: [PATCH] x86: Treat R_X86_64_PLT32 as R_X86_64_PC32
On i386, there are 2 types of PLTs, PIC and non-PIC. PIE and shared
objects must use PIC PLT. To use PIC PLT, you need to load
_GLOBAL_OFFSET_TABLE_ into EBX first. There is no need for that on
x86-64 since x86-64 uses PC-relative PLT.
On x86-64, for 32-bit PC-relative branches, we can generate PLT32
relocation, instead of PC32 relocation, which can also be used as
a marker for 32-bit PC-relative branches. Linker can always reduce
PLT32 relocation to PC32 if function is defined locally. Local
functions should use PC32 relocation. As far as Linux kernel is
concerned, R_X86_64_PLT32 can be treated the same as R_X86_64_PC32
since Linux kernel doesn't use PLT.
is needed. It is available on hjl/plt32/master branch at
https://github.com/hjl-tools/linux
bfd/
PR gas/22791
* elf64-x86-64.c (is_32bit_relative_branch): Removed.
(elf_x86_64_relocate_section): Check PIC relocations in PIE.
Remove is_32bit_relative_branch usage. Disallow PC32 reloc
against protected function in shared object.
gas/
PR gas/22791
* config/tc-i386.c (need_plt32_p): New function.
(output_jump): Generate BFD_RELOC_X86_64_PLT32 if possible.
(md_estimate_size_before_relax): Likewise.
* testsuite/gas/i386/reloc64.d: Updated.
* testsuite/gas/i386/x86-64-jump.d: Likewise.
* testsuite/gas/i386/x86-64-mpx-branch-1.d: Likewise.
* testsuite/gas/i386/x86-64-mpx-branch-2.d: Likewise.
* testsuite/gas/i386/x86-64-relax-2.d: Likewise.
* testsuite/gas/i386/x86-64-relax-3.d: Likewise.
* testsuite/gas/i386/ilp32/reloc64.d: Likewise.
* testsuite/gas/i386/ilp32/x86-64-branch.d: Likewise.
ld/
PR gas/22791
* testsuite/ld-x86-64/mpx1c.rd: Updated.
* testsuite/ld-x86-64/pr22791-1.err: New file.
* testsuite/ld-x86-64/pr22791-1a.c: Likewise.
* testsuite/ld-x86-64/pr22791-1b.s: Likewise.
* testsuite/ld-x86-64/pr22791-2.rd: Likewise.
* testsuite/ld-x86-64/pr22791-2a.s: Likewise.
* testsuite/ld-x86-64/pr22791-2b.c: Likewise.
* testsuite/ld-x86-64/pr22791-2c.s: Likewise.
* testsuite/ld-x86-64/x86-64.exp: Run PR ld/22791 tests.
Correct a duplicate `Loongson-3A tests' GAS test name introduced with
commit 9867540240 ("Add Loongson3A specific instructions"),
<https://sourceware.org/ml/binutils/2010-12/msg00447.html>, shared
between gas/testsuite/gas/mips/loongson-3a.d and
gas/testsuite/gas/mips/loongson-3a-2.d.
gas/
* testsuite/gas/mips/loongson-3a-2.d: Rename test.
Correct a commit 2d6dda7161 ("MIPS/BFD: Correctly report unsupported
`.reginfo' section size") issue and avoid a GAS test failure:
regexp_diff match failure
regexp "^.*: Incorrect `\.reginfo' section size; expected 24, got 28$"
line "../as-new: dump.o: Incorrect `.reginfo' section size; expected 24, got 32"
FAIL: MIPS assembled .reginfo section size (n32)
on MIPS targets other than bare-metal ones. The reason for this failure
is section padding to alignment, done in `size_seg'. For n32 `.reginfo'
the section alignment is set to 3, and therefore the section is padded
to a multiple of 8, except for bare-metal targets, for which padding is
unconditionally disabled in `md_section_align'.
Use `--no-pad-sections' then to disable padding for all targets, so that
the size of `.reginfo' is always the same, matching the message pattern.
gas/
* testsuite/gas/mips/reginfo-2-n32.d: Add `--no-pad-sections' to
`as' flags.
The instruction encoding for the MIPS r6 sigrie instruction seems to be
incorrect. It's currently 0x4170xxxx (which overlaps with ei, di, evp,
and dvp), but should be 0x0417xxxx. See ISA reference[1][2].
References:
[1] "MIPS Architecture for Programmers Volume II-A: The MIPS32
Instruction Set Manual", Imagination Technologies, Inc., Document
Number: MD00086, Revision 6.06, December 15, 2016, Table A.4 "MIPS32
REGIMM Encoding of rt Field", p. 452
[2] "MIPS Architecture For Programmers Volume II-A: The MIPS64
Instruction Set Reference Manual", Imagination Technologies, Inc.,
Document Number: MD00087, Revision 6.06, December 15, 2016, Table
A.4 "MIPS64 REGIMM Encoding of rt Field", p. 581
opcodes/
* mips-opc.c (mips_builtin_opcodes): Correct "sigrie" encoding.
gas/
* testsuite/gas/mips/r6.d: Update for "sigrie" encoding fix.
* testsuite/gas/mips/r6-n32.d: Likewise.
* testsuite/gas/mips/r6-n64.d: Likewise.
Checks for insn alignment were hopelessly confused when misaligned
data starts a new frag. The real-world testcase happened to run out
of frag space in the middle of emitting a trace-back table via
something like:
.byte 0 /* VERSION=0 */
.byte 9 /* LANG=C++ */
.byte 34 /* Bits on: has_tboff, fp_present */
.byte 64 /* Bits on: name_present */
.byte 128 /* Bits on: stores_bc, FP_SAVED=0 */
.byte 0 /* Bits on: GP_SAVED=0 */
.byte 2 /* FIXEDPARMS=2 */
.byte 1 /* FLOATPARMS=0, parmsonstk */
.long 0
.long 768 /* tb_offset: 0x300 */
.hword 45 /* Function name length: 45 */
.long 0x334e5a5f
.long 0x31766f70
.long 0x65744932
.long 0x69746172
.long 0x7a5f6e6f
.long 0x64504533
.long 0x5f534e50
.long 0x72463431
.long 0x61746361
.long 0x74535f6c
.long 0x74637572
.byte 0x45
.byte 0
The trigger being those misaligned .long's output for the function
name. A most horrible way to output a string, especially considering
endian issues..
PR 22819
* config/tc-ppc.c (md_assemble): Rewrite insn alignment checking.
(ppc_frag_check): Likewise.
* testsuite/gas/ppc/misalign.d,
* testsuite/gas/ppc/misalign.l,
* testsuite/gas/ppc/misalign.s: New test.
* testsuite/gas/ppc/misalign2.d,
* testsuite/gas/ppc/misalign2.s: New test.
* testsuite/gas/ppc/ppc.exp: Run them.
Correct a commit f0531ed6a4 ("Compress loads/stores with implicit 0
offset.") regression and remove a `-Wshadow' compilation error:
cc1: warnings being treated as errors
.../gas/config/tc-riscv.c: In function 'riscv_handle_implicit_zero_offset':
.../gas/config/tc-riscv.c:1194: error: declaration of 'expr' shadows a global declaration
.../gas/expr.h:180: error: shadowed declaration is here
make[4]: *** [tc-riscv.o] Error 1
which for versions of GCC before 4.8 prevents GAS for RISC-V targets
from being built. See also GCC PR c/53066.
gas/
* config/tc-riscv.c (riscv_handle_implicit_zero_offset): Rename
`expr' parameter to `ep'.
Report an error when an unsupported `.reginfo' section size is found in
`_bfd_mips_elf_section_processing', removing an assertion that triggers
at elfxx-mips.c:7105 in GAS when assembling input like:
.section .reginfo
.word 0xdeadbeef
and in `objcopy --rename-section' when renaming an incorrectly sized
section to `.reginfo'.
bfd/
* elfxx-mips.c (_bfd_mips_elf_section_processing): For
SHT_MIPS_REGINFO sections don't assert the correct size and
report an error instead.
binutils/
* testsuite/binutils-all/mips/mips-reginfo.d: New test.
* testsuite/binutils-all/mips/mips-reginfo-n32.d: New test.
* testsuite/binutils-all/mips/mips-reginfo.s: New test source.
* testsuite/binutils-all/mips/mips.exp: Run the new tests.
gas/
* testsuite/gas/mips/reginfo-2.d: New test.
* testsuite/gas/mips/reginfo-2-n32.d: New test.
* testsuite/gas/mips/reginfo-2.l: New test stderr output.
* testsuite/gas/mips/reginfo-2.s: New test source.
* testsuite/gas/mips/mips.exp: Run the new tests.
The PR22714 testcase is such that the input buffer processed by
do_scrub_chars ends on this line
1: bug "Returning to usermode but unexpected PSR bits set?", \@
right at the backslash. (The line is part of a macro definition.)
The next input buffer then starts with '@' which starts a comment on
ARM, and the check for \@ fails due to to == tostart. Now it would be
possible to simply access to[-1] in this particular case, but that's
ugly, and to be absolutely safe from people deliberately trying to
crash gas we'd need the read.c:read_a_source_file buffer passed to
do_scrub_chars to have a single byte pad at the start.
PR 22714
* app.c (last_char): New static var.
(struct app_save): Add last_char field.
(app_push, app_pop): Handle it.
(do_scrub_chars): Use last_char in test for "\@". Set last_char.
The .dc.a directive has wrong size (32 bits) on SPARC 64-bit because
the assembler sets the correct BFD architecture only at the very end
of the processing and it's too late for the directive. It's fixed by
defining TARGET_MACH and making it return a sensible default value.
gas/
* config/tc-sparc.h (sparc_mach): Declare.
(TARGET_MACH): Define to above.
* config/tc-sparc.c (sparc_mach): New function.
(sparc_md_end): Minor tweak.
ld/
* testsuite/ld-elf/pr22450.d: Remove reference to SPARC64.
Fix a commit 0a44bf6950 ("mips-vxworks support"),
<https://sourceware.org/ml/binutils/2006-03/msg00179.html>, regression
and override the choice of the `vxworks' target environment introduced
with commit ea3eed1500 ("Add generic vxworks GAS target."),
<https://sourceware.org/ml/binutils/2005-01/msg00052.html>, for
`mips-*-windiss' targets as they have not been converted to the VxWorks
target format introduced with the former commit, removing a GAS target
format selection failure:
Assembler messages:
Fatal error: selected target format 'elf32-bigmips-vxworks' unknown
on any assembly attempt with `mips-windiss' and equivalent target
configurations.
gas/
* configure.tgt: Use generic emulation for `mips-*-windiss',
overriding the blanket choice made for `*-*-windiss'.
Use `mips-*-sysv4*' rather than `mips-*-sysv4*MP*' to match the system
type for System V Release 4 MIPS targets, removing a GAS target
selection failure:
Assembler messages:
Fatal error: selected target format 'elf32-bigmips' unknown
on any assembly attempt with `mips-sysv4' and equivalent target
configurations. These would typically be called `mips-sni-sysv4'
(Sinix) vs `mips-dde-sysv4.2MP' (Supermax).
This corrects commit 8614eeee67 ("Traditional MIPS patches"),
<https://sourceware.org/ml/binutils/2000-07/msg00018.html>, making GAS
target selection match commit dd745cfae5 ("Traditional MIPS patches"),
<https://sourceware.org/ml/binutils/2000-07/msg00018.html>, and commit
3548145dcb ("Traditional MIPS patches"),
<https://sourceware.org/ml/binutils/2000-07/msg00018.html>, which added
support for these targets to BFD and LD respectively.
gas/
* configure.tgt: Use `mips-*-sysv4*' rather than
`mips-*-sysv4*MP*'.
Correct an issue with the `mips64*-ps2-elf*' target introduced with
commit e407c74b5b ("Support for MIPS R5900 (Sony Playstation 2)"),
<https://sourceware.org/ml/binutils/2012-12/msg00240.html> and make
the n32 ABI the default for GAS, consistently with how BFD and LD
are configured for this target.
gas/
* configure.ac: Also set `mips_default_abi' to N32_ABI for
`mips64*-ps2-elf*'.
* configure: Regenerate.
Remove an issue with `as --help' always reporting `o32' as the default
ABI regardless of what the default actually is, originally caused by
commit cac012d6d3 ("check mips abi x linker emulation compatibility"),
<https://sourceware.org/ml/binutils/2003-05/msg00187.html> missing an
update here.
gas/
* config/tc-mips.c (md_show_usage): Correctly indicate the
configuration-specific default ABI.
Correct a commit 25499ac7ee ("MIPS16e2: Add MIPS16e2 ASE support") GAS
bug and add missing help text for the `-mmips16e2' and `-mno-mips16e2'
options added with said commit.
gas/
* config/tc-mips.c (md_show_usage): Report `-mmips16e2' and
`-mno-mips16e2' options.
Remove spurious comments after the definition of ToC and ToU.
2018-01-19 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (ToC macro): Remove spurious comment.
(ToU macro): Likewise.
gas/
* testsuite/gas/riscv/c-zero-imm.s: Test addi that compresses to c.nop.
* testsuite/gas/riscv/c-zero-imm.d: Likewise.
opcodes/
* riscv-opc.c (match_c_nop): New.
(riscv_opcodes) <addi>: Handle an addi that compresses to c.nop.
Armv8-M Security Extensions introduced some Thumb-only opcodes
(eg. sg). These are defined using the TUE and TCE macros, setting the
Arm execution state related fields to 0/NULL.
This patch adds 2 new macros to avoid filling this field and clearly
identify Thumb-only instructions.
2018-01-15 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (ToC): Define macro.
(ToU): Likewise.
(insns): Make use of above macros for new instructions introduced in
Armv8-M.
Newly introduced instructions common to ARMv8-M Baseline and Mainline
are currently all marked as unconditional. However, all instructions but
sg (ie. blxns, bxns, tt, ttt, tta, ttat, vlldm and vlstm) do actually
support conditional execution. This patch fixes the definition of these
instructions accordingly.
2018-01-15 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (insns): Make blxns, bxns, tt, ttt, tta, ttat, vlldm
and vlstm conditionally executable and reindent parameters.
* testsuite/gas/arm/archv8m-cmse-main.s: Add conditional version of
aforementionned instructions.
Deprecations related to the use of the IT instruction introduced in
Armv8-A do not apply to Armv8-M Baseline and mainline. However the
warning logic do not distinguish between the various profiles and warn
whenever the architecture version is 8.
This patch adds a check to exclude M profile architectures from this
warning. This works as expected when -march is specified on the
command-line or a .arch/.cpu directive exist. However, in autodetection
mode the CPU/architecture targeted is only known once the instructions
have been all processed but this code is run when IT instruction is
processed. It is therefore not possible to distinguish between Armv8-M
and Armv8-A in that mode.
The approach chosen here is not to warn in autodetection mode. The udf.d
testcase that relied on that behavior to test deprecation warning for
Armv8-A is therefore updated to explicitely pass -march=armv8-a.
2018-01-15 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (it_fsm_post_encode): Do not warn if targeting M
profile architecture or if in autodetection mode. Clarify that
deprecation is for performance reason and concerns Armv8-A and Armv8-R.
* testsuite/gas/arm/armv8-ar-bad.l: Adapt to new IT deprecation warning
message.
* testsuite/gas/arm/armv8-ar-it-bad.l: Likewise.
* testsuite/gas/arm/sp-pc-validations-bad-t-v8a.l: Likewise.
* testsuite/gas/arm/udf.l: Likewise.
* testsuite/gas/arm/udf.d: Assemble for Armv8-A explicitely.
Occasionally I build an out-of-tree a.out target (m68k-amigaos). After
a system upgrade which included a newer compiler (clang 4) the build
produces warnings like this:
warning: macro expansion producing 'defined' has undefined behavior
[-Wexpansion-to-defined]
This is caused by the macro gas/config/aout_gnu.h:USE_EXTENDED_RELOC.
Since it is in a header file, the warning triggers for several files.
I am unsure what solution is preferable, thus I am suggesting two
patches:
a) keep the offending macro but define it explicitly to 0 and 1
b) replace the macro usage with its value where it is used.
Either patch removes the warning for clang. I did not check with a
recent GCC.
* gas/config/aout_gnu.h (USE_EXTENDED_RELOC): Explicitly
define to 0 and 1. Remove a dangling reference to "AMD 29000"
in a comment.
Just like their packed counterparts the memory operand is always 16
bytes wide, and the Disp8 scaling is the same for all of them. (As a
side note: I'm also surprised by there being AVX512VL variants of
these as well as the AVX512_4VNNIW ones - the SDM doesn't define any
such.)
Adjust the test cases also for the packed forms to actually live up to
their promise of testing correct Disp8 encoding.
gas/
* testsuite/gas/riscv/auipc-x0.d: New.
* testsuite/gas/riscv/auipc-x0.s: New.
opcodes/
* riscv-dis.c (maybe_print_address): If base_reg is zero,
then the hi_addr value is zero.
CSDB is a new instruction which Arm has defined. As it shares the
encoding space with NOP instructions, it is available from Armv3 in
Arm mode, and Armv6T2 in Thumb mode.
OK? If so, please commit on my behalf as I don't have commit rights
over here.
Thanks, James
---
opcodes/
2018-01-09 James Greenhalgh <james.greenhalgh@arm.com>
* arm-dis.c (arm_opcodes): Add csdb.
(thumb32_opcodes): Add csdb.
gas/
2018-01-09 James Greenhalgh <james.greenhalgh@arm.com>
* config/tc-arm.c (insns): Add csdb, enable for Armv3 and above
in Arm execution state, and Armv6T2 and above in Thumb execution
state.
* testsuite/gas/arm/csdb.s: New.
* testsuite/gas/arm/csdb.d: New.
* testsuite/gas/arm/thumb2_it_bad.l: Add csdb.
* testsuite/gas/arm/thumb2_it_bad.s: Add csdb.
CSDB is a new instruction which Arm has defined. It has the same encoding as
HINT #0x14 and is available at all architecture levels.
opcodes * aarch64-tbl.h (aarch64_opcode_table): Add "csdb".
* aarch64-asm-2.c: Regenerate.
* aarch64-dis-2.c: Regenerate.
* aarch64-opc-2.c: Regenerate.
gas * testsuite/gas/aarch64/system.d: Update expected results to expect
CSDB.
For historical reason, we allow movd/vmovd with 64-bit register and
memeory operands. But for vmovd, we failed to handle 64-bit memeory
operand. This has been gone unnoticed since AT&T syntax always treats
memory operand as 32-bit memory. This patch properly encodes vmovd
with 64-bit memeory operands. It also removes AVX512 vmovd with 64-bit
operands since GCC has
case TYPE_SSEMOV:
switch (get_attr_mode (insn))
{
case MODE_DI:
/* Handle broken assemblers that require movd instead of movq. */
if (!HAVE_AS_IX86_INTERUNIT_MOVQ
&& (GENERAL_REG_P (operands[0]) || GENERAL_REG_P (operands[1])))
return "%vmovd\t{%1, %0|%0, %1}";
return "%vmovq\t{%1, %0|%0, %1}";
and all AVX512 GNU assemblers set HAVE_AS_IX86_INTERUNIT_MOVQ, GCC won't
generate AVX512 vmovd with 64-bit operand.
gas/
PR gas/22681
* testsuite/gas/i386/i386.exp: Run x86-64-movd and
x86-64-movd-intel.
* testsuite/gas/i386/x86-64-movd-intel.d: New file.
* testsuite/gas/i386/x86-64-movd.d: Likewise.
* testsuite/gas/i386/x86-64-movd.s: Likewise.
opcodes/
PR gas/22681
* i386-opc.tbl: Properly encode vmovd with Qword memeory operand.
Remove AVX512 vmovd with 64-bit operands.
* i386-tbl.h: Regenerated.
gas/
* testsuite/gas/riscv/priv-reg.s: Add missing stval and mtval.
* testsuite/gas/riscv/priv-reg.d: Likewise.
include/
* opcode/riscv-opc.h (CSR_SBADADDR): Rename to CSR_STVAL. Rename
DECLARE_CSR entry. Add alias to map sbadaddr to CSR_STVAL.
(CSR_MBADADDR): Rename to CSR_MTVAL. Rename DECLARE_CSR entry.
Add alias to map mbadaddr to CSR_MTVAL.
Dot products deviate from the normal disassembly rules for lane indexed
instruction. Their canonical representation is in the form of:
v0.2s, v0.8b, v0.4b[0] instead of v0.2s, v0.8b, v0.b[0] to try to denote
that these instructions select 4x 1 byte elements instead of a single 1 byte
element.
Previously we were disassembling them following the normal rules, this patch
corrects the disassembly.
gas/
PR gas/22559
* config/tc-aarch64.c (vectype_to_qualifier): Support AARCH64_OPND_QLF_S_4B.
* gas/testsuite/gas/aarch64/dotproduct.d: Update disassembly.
include/
PR gas/22559
* aarch64.h (aarch64_opnd_qualifier): Add AARCH64_OPND_QLF_S_4B.
opcodes/
PR gas/22559
* aarch64-asm.c (aarch64_ins_reglane): Change AARCH64_OPND_QLF_S_B to
AARCH64_OPND_QLF_S_4B
* aarch64-dis.c (aarch64_ext_reglane): Change AARCH64_OPND_QLF_S_B to
AARCH64_OPND_QLF_S_4B
* aarch64-opc.c (aarch64_opnd_qualifiers): Add 4b variant.
* aarch64-tbl.h (QL_V2DOT): Change S_B to S_4B.
Previously parse_vector_type_for_operand was changed to allow the use of 4b
register size for indexed lane instructions. However this had the unintended
side effect of also allowing 4b for normal vector registers.
Because this support was only partial the rest of the tool silently treated
4b as 8b and continued. This patch adds full support for 4b so it can be
properly distinguished from 8b and the correct errors are generated.
With this patch you still can't encode any instruction which actually requires
v<num>.4b but such instructions don't exist so to prevent needing a workaround
in get_vreg_qualifier_from_value this was just omitted.
gas/
PR gas/22529
* config/tc-aarch64.c (vectype_to_qualifier): Support AARCH64_OPND_QLF_V_4B.
* gas/testsuite/gas/aarch64/pr22529.s: New.
* gas/testsuite/gas/aarch64/pr22529.d: New.
* gas/testsuite/gas/aarch64/pr22529.l: New.
include/
PR gas/22529
* opcode/aarch64.h (aarch64_opnd_qualifier): Add AARCH64_OPND_QLF_V_4B.
opcodes/
PR gas/22529
* aarch64-opc.c (aarch64_opnd_qualifiers): Add 4b variant.
Just like for instructions in GPRs, there's no need to have separate
templates for otherwise identical insns acting on XMM or YMM registers
(or memory of the same size).
Use a combination of a single new Reg bit and Byte, Word, Dword, or
Qword instead.
Besides shrinking the number of operand type bits this has the benefit
of making register handling more similar to accumulator handling (a
generic flag is being accompanied by a "size qualifier"). It requires,
however, to split a few insn templates, as it is no longer correct to
have combinations like Reg32|Reg64|Byte. This slight growth in size will
hopefully be outweighed by this change paving the road for folding a
presumably much larger number of templates later on.
Pseudo prefixes must be used on an instruction. Issue an error when
pseudo prefix is used without instruction.
PR gas/22623
* gas/config/tc-i386.c (output_insn): Check pseudo prefix
without instruction.
* testsuite/gas/i386/i386.exp: Run inval-pseudo.
* testsuite/gas/i386/inval-pseudo.l: New file.
* testsuite/gas/i386/inval-pseudo.s: Likewise.
Again these look to be typos: No template currently allows for any two
(or all three) of RegXMM, RegYMM, and RegZMM in a single operand. Quite
clearly ! are missing, after the addition of which the checks for the
first and (if present) second operands also fully match up.
I'm rather certain the missing ! was just a typo, the more with the
similar check in mind that's in the same function a few hundred lines
down (in the body of "if (vex_reg != (unsigned int) ~0)"). Of course
this can't be demonstrated by a test case - internal data structure
consistency is being checked here, and neither form of the check
triggers with any current template.
It is also not really clear to me why operand_type_equal() is being used
in the {X,Y,Z}MM register check here, rather than just testing the
respective bits: Just like Reg32|Reg64 is legal in an operand template,
I don't see why e.g. RegXMM|RegYMM wouldn't be. For example it ought to
be possible to combine
vaddpd, 3, 0x6658, None, 1, CpuAVX, Modrm|Vex|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Xmmword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|RegXMM, RegXMM, RegXMM }
vaddpd, 3, 0x6658, None, 1, CpuAVX, Modrm|Vex=2|VexOpcode=0|VexVVVV=1|VexW=1|IgnoreSize|No_bSuf|No_wSuf|No_lSuf|No_sSuf|No_qSuf|No_ldSuf, { Ymmword|Unspecified|BaseIndex|Disp8|Disp16|Disp32|Disp32S|RegYMM, RegYMM, RegYMM }
into a single template (with setting of VEX.L suitably handled elsewhere
if that's not already happening anyway).
Additionally I don't understand why this uses abort() instead of
gas_assert().
Both of these latter considerations then also apply to the
aforementioned other check in the same function.
opcodes * disassemble.c: Enable disassembler_needs_relocs for PRU.
gas * testsuite/gas/pru/extern.s: New test for print of U16_PMEMM
relocation.
* testsuite/gas/pru/extern.d: New test driver.
Don't use address where symbol gets resolved, as during section
relaxation symbols will slide, instead canonicalize symbols and check
that they are are the same.
This fixes a bug when a relaxed jump goes into the wrong trampoline.
gas/
2017-12-07 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (xg_order_trampoline_chain): Replace
xg_order_trampoline_chain_entry call with check for
canonicalized symbol equality and offset equality.
While we shouldn't outright reject such (as was wrongly done by commit
4d36230d59 ("x86: Update segment register check in Intel syntax"), as
MASM accepts them even silently, issue (by default) a warning for such
questionable constructs.
gas/
* config/tc-riscv.c (riscv_frag_align_code): New local insn_alignment.
Early return if bytes less than or equal to insn_alignment.
* testsuite/gas/riscv/align-1.l: New.
* testsuite/gas/riscv/align-1.s: New.
* testsuite/gas/riscv/riscv.exp: Use run_dump_tests. Use run_list_test
for align-1.
This should be an obvious fix.
It corrects the register number for IP1 to 17.
gas/
2017-11-29 Renlin Li <renlin.li@arm.com>
* config/tc-aarch64.c (reg_names): Fix IP1 register alias error.
* testsuite/gas/aarch64/register_aliases.s: Add IP0 and IP1 tests.
* testsuite/gas/aarch64/register_aliases.d: Update.
bfd/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
binutils/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
gas/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
gold/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
gprof/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
ld/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
opcodes/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
find_trampoline_seg takes noticeable time when assembling source with
many sections. Cache the result of the most recent search and check it
first. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (find_trampoline_seg): Add static variable
that caches the result of the most recent search.
There is a recurring pattern in assembly files generated by a compiler
where a lot of jumps in a function are going to the same place. When
these jumps are relaxed with trampolines the assembler generates a
separate jump thread from each source.
Create an index of trampoline jump targets for each segment and see if a
jump being relaxed goes to a location from that index, in which case
replace its target with a location of existing trampoline jump that
results in the shortest path to the original target.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (trampoline_chain_entry, trampoline_chain)
(trampoline_chain_index): New structures.
(trampoline_index): Add chain_index field.
(xg_order_trampoline_chain_entry, xg_sort_trampoline_chain)
(xg_find_chain_entry, xg_get_best_chain_entry)
(xg_order_trampoline_chain, xg_get_trampoline_chain)
(xg_find_best_eq_target, xg_add_location_to_chain)
(xg_create_trampoline_chain, xg_get_single_symbol_slot): New
functions.
(xg_relax_fixups): Call xg_find_best_eq_target to adjust jump
target to point to an existing jump. Call
xg_create_trampoline_chain to create new jump target. Call
xg_add_location_to_chain to add newly created trampoline jump
to the corresponding chain.
(add_jump_to_trampoline): Extract loop searching for a single
slot with a symbol into a separate function, replace that code
with a call to that function.
(relax_frag_immed): Call xg_find_best_eq_target to adjust jump
target to point to an existing jump.
* testsuite/gas/xtensa/all.exp: Add trampoline-2 test.
* testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
as many duplicate trampoline chains are now coalesced.
* testsuite/gas/xtensa/trampoline.s: Add _nop so that objdump
stays in sync with instruction stream.
* testsuite/gas/xtensa/trampoline-2.l: New test result file.
* testsuite/gas/xtensa/trampoline-2.s: New test source file.
There's almost exact copy of the trampoline placement code in the
search_trampolines function that is used for jumps generated for relaxed
branch instructions. Get rid of the duplication and reuse
xg_find_best_trampoline function for that.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (search_trampolines, get_best_trampoline):
Remove definitions.
(xg_find_best_trampoline_for_tinsn): New function.
(relax_frag_immed): Replace call to get_best_trampoline with a
call to xg_find_best_trampoline_for_tinsn.
* testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
as the placement of trampolines for relaxed branches has been
changed.
Replace linked list of trampoline frags with an ordered array, so that
instead of indexing fixups trampolines could be indexed. Keep each array
in the trampoline_seg structure, so there's no need to rebuild it for
every new processed segment. Don't run relaxation for each trampoline
frag, instead run it for each fixup in the current segment that needs
relaxation at the beginning of each relaxation pass. This way the
complexity of this process drops from about O(n^2 * m) to about
O(log n * m), where n is the number of trampoline frags and m is the
number of fixups that need relaxation in the segment.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (trampoline_index): New structure.
(trampoline_seg): Replace trampoline list with trampoline index.
(xg_find_trampoline, xg_add_trampoline_to_index)
(xg_remove_trampoline_from_index, xg_add_trampoline_to_seg)
(xg_is_trampoline_frag_full, xg_get_fulcrum)
(xg_find_best_trampoline, xg_relax_fixup, xg_relax_fixups)
(xg_is_relaxable_fixup): New functions.
(J_MARGIN): New macro.
(xtensa_create_trampoline_frag): Use xg_add_trampoline_to_seg
instead of open-coded addition to the linked list.
(dump_trampolines): Iterate through the trampoline_seg::index.
(cached_fixupS, cached_fixup, fixup_cacheS, fixup_cache)
(fixup_order, xtensa_make_cached_fixup)
(xtensa_realloc_fixup_cache, xtensa_cache_relaxable_fixups)
(xtensa_find_first_cached_fixup, xtensa_delete_cached_fixup)
(xtensa_add_cached_fixup, check_and_update_trampolines): Remove
definitions.
(xg_relax_trampoline): Extract logic into separate functions,
replace body with a call to xg_relax_fixups.
(search_trampolines): Replace search in linked list with search
in index. Change data type of address-tracking variables from
int to offsetT. Replace abs with labs.
(xg_append_jump): Finish the trampoline frag if it's full.
(add_jump_to_trampoline): Remove trampoline frag from the index
if the frag is full.
* config/tc-xtensa.h (xtensa_frag_type): Remove next_trampoline.
* testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
as the placement of trampolines has slightly changed.
* testsuite/gas/xtensa/trampoline.s: Add _nop so that objdump
stays in sync with instruction stream.
The split between fragS and trampoline_frag doesn't save much space, but
makes trampolines management much more awkward. Merge trampoline_frag
data into the xtensa_frag_type, which is a part of fragS. No functional
changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (init_trampoline_frag): Replace pointer to
struct trampoline_frag parameter with pointer to fragS.
(xg_append_jump): Remove jump_around parameter.
(struct trampoline_frag): Remove.
(struct trampoline_seg): Change type of trampoline_list from
struct trampoline_frag to fragS.
(xtensa_create_trampoline_frag): Don't allocate struct
trampoline_frag. Initialize new fragS::tc_frag_data fields.
(dump_trampolines, xg_relax_trampoline, search_trampolines)
(get_best_trampoline, init_trampoline_frag)
(add_jump_to_trampoline, relax_frag_immed): Replace pointer to
struct trampoline_frag with a pointer to fragS.
(xg_append_jump): Remove jump_around parameter, use
fragS::tc_frag_data.jump_around_fix instead.
(xg_relax_trampoline, init_trampoline_frag)
(add_jump_to_trampoline): Don't pass jump_around parameter to
xg_append_jump.
* config/tc-xtensa.h (struct xtensa_frag_type): Add new fields:
needs_jump_around, next_trampoline and jump_around_fix.
xtensa_create_trampoline_frag has opencoded fragment equivalent to
find_trampoline_seg. Drop the fragment and use find_trampoline_seg
instead. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (find_trampoline_seg): Move above the first
use.
(xtensa_create_trampoline_frag): Replace trampoline seg search
code with a call to find_trampoline_seg.
init_trampoline_frag, add_jump_to_trampoline and xg_relax_trampoline add
a jump to the end of a trampoline frag. Extract it into a separate
funciton and use it in all these places. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (xg_append_jump): New function.
(xg_relax_trampoline, init_trampoline_frag)
(add_jump_to_trampoline): Replace trampoline jump assembling
code with a call to xg_append_jump.
To make measurement and changes easier extract trampoline relaxation
function. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (xg_relax_trampoline): New function.
(xtensa_relax_frag): Replace trampoline relaxation code with a
call to xg_relax_trampoline.
For one the register type used for masking should be validated. And then
we shouldn't accept input producing encodings which will #UD when
executed, as is the case when EVEX.Z is set while EVEX.AAA is zero.
Despite EVEX encodings not being available in real and VM86 modes,
16-bit addressing still needs to be handled properly for 16-bit
protected mode as well as 16-bit addressing in 32-bit mode. Neither
should displacements be dropped silently by the assembler, nor should
the disassembler fail to correctly scale 8-bit displacements.
Except for %eip-relative addressing, where we don't have a suitable
relocation type silently wrapping at the 4G boundary, consistently
force use of R_X86_64_32 (in ELF terms) instead of its sign-extending
counterpart. This wasn't right in case there was no base register in
the addressing expression.
Make the assembler recognize UD0, supporting only the newer form
expecting a ModR/M byte.
Make assembler and disassembler properly emit / expect a ModR/M byte for
UD1.
For the testsuite, as arch-4 already tests all UDn, avoid producing a
huge delta for other tests using UD2B by making them use UD2 instead.
Multiple errors are more confusing than helpful, as the more generic
one often implies a sufficiently different adjustment than would
actually be needed to fix the code. Additionally it makes it more
cumbersome to add missing error checks, as the testsuite then needs
extra updating.
* as.c: Include write.h.
(common_emul_init): Use FAKE_LABEL_NAME.
* ecoff.c (add_file, ecoff_directive_end, ecoff_directive_loc):
Likewise.
(ecoff_build_symbols): Use FAKE_LABEL_CHAR.
* expr.c (get_symbol_name): Use FAKE_LABEL_CHAR. Accept only if
input_from_string is TRUE.
* read.c (input_from_string): New.
(read_symbol_name): Use FAKE_LABEL_CHAR. Accept only if
input_from_string is TRUE.
(temp_ilp): Set input_from_string to TRUE.
(restore_ilp): Set input_from_string to FALSE.
* read.h (input_from_string): Declare.
* symbols.c: Include write.h
(S_IS_LOCAL): Check for FAKE_LABEL_CHAR.
(symbol_relc_make_sym): Fix comment refering to default fake label
string.
* write.h (FAKE_LABEL_CHAR): New.
* config/tc-riscv.h (FAKE_LABEL_CHAR): Define.
* testsuite/gas/all/err-fakelabel.s: New.
Uses of reg_expected_msgs rely on each arm_reg_type enumerator to have a
message entry in the same order as the enumerator declaration. This is
not clearly stated in the definition of both the arm_reg_type enum and
the reg_expected_msgs. Worse, there is nothing to ensure both are kept
in sync.
As an attempt towards this, this patch uses C99 array designators to
ensure that each message is associated with the right arm_reg_type. A
comment is also added near the definition of arm_reg_type to point to
the reg_expected_msgs array. Finally, the array is synced with
arm_reg_type by adding the missing error message for REG_TYPE_RNB.
2017-11-22 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (arm_reg_type): Comment on the link with
reg_expected_msgs.
(reg_expected_msgs): Initialize using array designators with
arm_reg_type index.
The -n command-line of x86 assembler disables optimization of alignment
directives, like ".balign 8, 0x90", with multi-byte nop instructions
such as leal 0(%esi),%esi.
PR gas/22464
* testsuite/gas/i386/align-1.s: New file.
* testsuite/gas/i386/align-1a.d: Likewise.
* testsuite/gas/i386/align-1b.d: Likewise.
* testsuite/gas/i386/i386.exp: Run align-1a and align-1b.
This patch separates the new FP16 instructions backported from Armv8.4-a to Armv8.2-a
into a new flag order to distinguish them from the rest of the already existing optional
FP16 instructions in Armv8.2-a.
The new flag "+fp16fml" is available from Armv8.2-a and implies +fp16 and is mandatory on
Armv8.4-a.
gas/
* config/tc-aarch64.c (fp16fml): New.
* doc/c-aarch64.texi (fp16fml): New.
* testsuite/gas/aarch64/armv8_2-a-crypto-fp16.d (fp16): Make fp16fml.
* testsuite/gas/aarch64/armv8_3-a-crypto-fp16.d (fp16): Make fp16fml.
include/
* opcode/aarch64.h: (AARCH64_FEATURE_F16_FML): New.
(AARCH64_ARCH_V8_4): Enable AARCH64_FEATURE_F16_FML by default.
opcodes/
* aarch64-tbl.h (aarch64_feature_fp_16_v8_2): Require AARCH64_FEATURE_F16_FML
and AARCH64_FEATURE_F16.
The crypto options depend on SIMD and FP, the documentation states so but the dependency is not there the code.
We have mostly gotten away with this due to the default flags
for the architectures (e.g. Armv8.2-a implies +simd) but this
discrepancy needs to be addressed.
gas/
2017-11-16 Tamar Christina <tamar.christina@arm.com>
* opcodes/aarch64-tbl.h
(aarch64_feature_crypto): Add ARCH64_FEATURE_SIMD and AARCH64_FEATURE_FP.
(aarch64_feature_crypto_v8_2, aarch64_feature_sm4): Likewise.
(aarch64_feature_sha3): Likewise.
While commits 9889cbb14e ("Check invalid mask registers") and
abfcb414b9 ("X86: Ignore REX_B bit for 32-bit XOP instructions") went a
bit into the right direction, this wasn't quite enough:
- VEX.vvvv has its high bit ignored
- EVEX.vvvv has its high bit ignored together with EVEX.v'
- the high bits of {,E}VEX.vvvv should not be prematurely zapped, to
allow proper checking of them when the fields has to hold al ones
- when the high bits of an immediate specify a register, bit 7 is
ignored
Since .code64 directive isn't available for 32-bit BFD and ELF directive
isn't available for non-ELF directive, we should avoid them.
* testsuite/gas/i386/noextreg.s: Replace .code64/.code32 and
64-bit instructions with .byte. Remove ELF directive.
The new flag "+fp16fml" is available from Armv8.2-a and implies +fp16 and is mandatory
from Armv8.4-a.
gas/
* config/tc-arm.c (arm_ext_fp16_fml, fp16fml): New.
(do_neon_fmac_maybe_scalar_long): Use arm_ext_fp16_fml.
* doc/c-arm.texi (fp16, fp16fml): New.
* testsuite/gas/arm/armv8_2-a-fp16.d (fp16): Make fp16fml.
* testsuite/gas/arm/armv8_3-a-fp16.d (fp16): Make fp16fml.
* testsuite/gas/arm/armv8_2-a-fp16-illegal.d (fp16): Make fp16fml.
* testsuite/gas/arm/armv8_2-a-fp16-thumb2.d (fp16): Make fp16fml.
include/
* opcode/arm.h: (ARM_EXT2_FP16_FML): New.
(ARM_AEXT2_V8_4A): Add ARM_EXT2_FP16_FML.
Hi Guys,
I am applying the rather large patch attached to this email to enhance
the readelf and objdump programs so that they now have the ability to
follow links to separate debug info files. (As requested by PR
15152). So for example whereas before we had this output:
$ readelf -wi main.exe
Contents of the .debug_info section:
[...]
<15> DW_AT_comp_dir : (alt indirect string, offset: 0x30c)
[...]
With the new option enabled we get:
$ readelf -wiK main.exe
main.exe: Found separate debug info file: dwz.debug
Contents of the .debug_info section (loaded from main.exe):
[...]
<15> DW_AT_comp_dir : (alt indirect string, offset: 0x30c) /home/nickc/Downloads/dwzm
[...]
The link following feature also means that we can get two lots of
output if the same section exists in both the main file and the
separate debug info file:
$ readelf -wiK main.exe
main.exe: Found separate debug info file: dwz.debug
Contents of the .debug_info section (loaded from main.exe):
[...]
Contents of the .debug_info section (loaded from dwz.debug):
[...]
The patch also adds the ability to display the contents of debuglink
sections:
$ readelf -wk main.exe
Contents of the .gnu_debugaltlink section:
Separate debug info file: dwz.debug
Build-ID (0x14 bytes):
c4 a8 89 8d 64 cf 70 8a 35 68 21 f2 ed 24 45 3e 18 7a 7a 93
Naturally there are long versions of these options (=follow-links and
=links). The documentation has been updated as well, and since both
readelf and objdump use the same set of debug display options, I have
moved the text into a separate file. There are also a couple of new
binutils tests to exercise the new behaviour.
There are a couple of missing features in the current patch however,
although I do intend to address them in follow up submissions:
Firstly the code does not check the build-id inside separate debug
info files when it is searching for a file specified by a
.gnu_debugaltlink section. It just assumes that if the file is there,
then it contains the information being sought.
Secondly I have not checked the DWARF-5 version of these link
features, so there will probably be code to add there.
Thirdly I have only implemented link following for the
DW_FORM_GNU_strp_alt format. Other alternate formats (eg
DW_FORM_GNU_ref_alt) have yet to be implemented.
Lastly, whilst implementing this feature I found it necessary to move
some of the global variables used by readelf (eg section_headers) into
a structure that can be passed around. I have moved all of the global
variables that were necessary to get the patch working, but I need to
complete the operation and move the remaining, file-specific variables
(eg dynamic_strings).
Cheers
Nick
binutils PR 15152
* dwarf.h (enum dwarf_section_display_enum): Add gnu_debuglink,
gnu_debugaltlink and separate_debug_str.
(struct dwarf_section): Add filename field.
Add prototypes for load_separate_debug_file, close_debug_file and
open_debug_file.
* dwarf.c (do_debug_links): New.
(do_follow_links): New.
(separate_debug_file, separate_debug_filename): New.
(fetch_alt_indirect_string): New function. Retrieves a string
from the debug string table in the separate debug info file.
(read_and_display_attr_value): Use it with DW_FORM_GNU_strp_alt.
(load_debug_section_with_follow): New function. Like
load_debug_section, but if the first attempt fails, then tries
again in the separate debug info file.
(introduce): New function.
(process_debug_info): Use load_debug_section_with_follow and
introduce.
(load_debug_info): Likewise.
(display_debug_lines_raw): Likewise.
(display_debug_lines_decoded): Likewise.
(display_debug_macinfo): Likewise.
(display_debug_macro): Likewise.
(display_debug_abbrev): Likewise.
(display_debug_loc): Likewise.
(display_debug_str): Likewise.
(display_debug_aranges): Likewise.
(display_debug_addr); Likewise.
(display_debug_frames): Likewise.
(display_gdb_index): Likewise.
(process_cu_tu_index): Likewise.
(load_cu_tu_indexes): Likewise.
(display_debug_links): New function. Displays the contents of a
.gnu_debuglink or .gnu_debugaltlink section.
(calc_gnu_debuglink_ctc32):New function. Calculates a CRC32
value.
(check_gnu_debuglink): New function. Checks the CRC of a
potential separate debug info file.
(parse_gnu_debuglink): New function. Reads a CRC value out of a
.gnu_debuglink section.
(check_gnu_debugaltlink): New function.
(parse_gnu_debugaltlink): New function. Reads the build-id value
out of a .gnu_debugaltlink section.
(load_separate_debug_info): New function. Finds and loads a
separate debug info file.
(load_separate_debug_file): New function. Attempts to find and
follow a link to a separate debug info file.
(free_debug_memory): Free the separate debug info file
information.
(opts_table): Add "follow-links" and "links".
(dwarf_select_sections_by_letters): Add "k" and "K".
(debug_displays): Reformat. Add .gnu-debuglink and
.gnu_debugaltlink.
Add an extra entry for .debug_str in a separate debug info file.
* doc/binutils.texi: Move description of debug dump features
common to both readelf and objdump into...
* objdump.c (usage): Add -Wk and -WK.
(load_specific_debug_section): Initialise the filename field in
the dwarf_section structure.
(close_debug_file): New function.
(open_debug_file): New function.
(dump_dwarf): Load and dump the separate debug info sections.
* readelf.c (struct filedata): New structure. Contains various
variables that used to be global:
(current_file_size, string_table, string_table_length, elf_header)
(section_headers, program_headers, dump_sects, num_dump_sects):
Move into filedata structure.
(cmdline): New global variable. Contains list of sections to dump
by number, as specified on the command line.
Add filedata parameter to most functions.
(load_debug_section): Load the string table if it has not already
been retrieved.
(close_file): New function.
(close_debug_file): New function.
(open_file): New function.
(open_debug_file): New function.
(process_object): Process sections in any separate debug info files.
* doc/debug.options.texi: New file. Add description of =links and
=follow-links options.
* NEWS: Mention the new feature.
* elfcomm.c: Have the byte gte functions take a const pointer.
* elfcomm.h: Update prototypes.
* testsuite/binutils-all/dw5.W: Update expected output.
* testsuite/binutils-all/objdump.WL: Update expected output.
* testsuite/binutils-all/objdump.exp: Add test of -WK and -Wk.
* testsuite/binutils-all/readelf.exp: Add test of -wK and -wk.
* testsuite/binutils-all/readelf.k: New file.
* testsuite/binutils-all/objdump.Wk: New file.
* testsuite/binutils-all/objdump.WK2: New file.
* testsuite/binutils-all/linkdebug.s: New file.
* testsuite/binutils-all/debuglink.s: New file.
gas * testsuite/gas/avr/large-debug-line-table.d: Update expected
output.
* testsuite/gas/elf/dwarf2-11.d: Likewise.
* testsuite/gas/elf/dwarf2-12.d: Likewise.
* testsuite/gas/elf/dwarf2-13.d: Likewise.
* testsuite/gas/elf/dwarf2-14.d: Likewise.
* testsuite/gas/elf/dwarf2-15.d: Likewise.
* testsuite/gas/elf/dwarf2-16.d: Likewise.
* testsuite/gas/elf/dwarf2-17.d: Likewise.
* testsuite/gas/elf/dwarf2-18.d: Likewise.
* testsuite/gas/elf/dwarf2-5.d: Likewise.
* testsuite/gas/elf/dwarf2-6.d: Likewise.
* testsuite/gas/elf/dwarf2-7.d: Likewise.
ld * testsuite/ld-avr/gc-section-debugline.d: Update expected
output.
VEX.W may be legitimately set (and is then ignored by the CPU) for
non-64-bit code. Don't print 64-bit register names in such a case, by
utilizing that REX_W would never be set for non-64-bit code, and that
it is being set from VEX.W by generic decoding.
A test for this is going to be introduced in the next patch of this
series.
The low four bits of an immediate being set when the high bits specify a
fourth register operand is not a problem: CPUs ignore these bits rather
than raising #UD. Take care of incrementing codep in OP_EX_VexW()
instead.
Just like %cxl can't be used as shift count register. Otherwise for
consistency %cxl would need to gain "ShiftCount" and use of both ought
to properly cause REX prefixes to be emitted.
Commit dd90581873 ("Place .shstrtab section after .symtab and .strtab,
thus restoring monotonically incre... ") adjusted section numbers, but
forgot to adjust sh_link references from relocation and group section
table entries.
Additionally some other (perhaps subsequent) change appears to have
added .rel.* and .rela.* sections to their respective groups, which
requires some further adjustments to group-2.d. I assume this additional
breakage wasn't noticed because the test was already failing at that
time.
This makes the gas testsuite complete successfully again for me in a
cross build on ix86-linux; there continue to be quite a few ld failures.
... rather than silently dropping it altogether.
i386_finalize_displacement() expects baseindex to already be set, so
the respective statement needs to be moved up. This then also allows a
subsequent conditional to be simplified.
For this to not regress on 32-bit addressing, break out address size
guessing from i386_index_check(), invoking the new function earlier so
that i386_finalize_displacement() has i.prefix[ADDR_PREFIX] available.
i386_addressing_mode () in turn needs i.base_reg / i.index_reg set
earlier.
The new options are:
+aes: Enables the AES instructions of Armv8-a,
enabled by default with +crypto.
+sha2: Enables the SHA1 and SHA2 instructions of Armv8-a,
enabled by default with +crypto.
These options have been turned on by default when +crypto
is used, as such no breakage is expected.
The reason for the split is because with the introduction of Armv8.4-a
the implementation of AES has explicitly been made independent of the
implementation of the other crypto extensions. Backporting the split does
not break any of the previous requirements and so is safe to do.
gas * config/tc-aarch64.c
(aarch64_features): Include AES and SHA2 in CRYPTO.
Add SHA2 and AES.
include * opcode/aarch64.h:
(AARCH64_FEATURE_SHA2, AARCH64_FEATURE_AES): New.
opcodes * aarch64-tbl.h (aarch64_feature_crypto): Add AES and SHA2.
(aarch64_feature_sha2, aarch64_feature_aes): New.
(SHA2, AES): New.
(AES_INSN, SHA2_INSN): New.
(pmull, pmull2, aese, aesd, aesmc, aesimc): Change to AES_INS.
(sha1h, sha1su1, sha256su0, sha1c, sha1p,
sha1m, sha1su0, sha256h, sha256h2, sha256su1):
Change to SHA2_INS.
gas * config/tc-arm.c (arm_extensions):
(arm_archs): New entry for "armv8.4-a".
Add FPU_ARCH_DOTPROD_NEON_VFP_ARMV8.
(arm_ext_v8_2): New variable.
(enum arm_reg_type): New enumeration REG_TYPE_NSD.
(reg_expected_msgs): New entry for REG_TYPE_NSD.
(parse_typed_reg_or_scalar): Handle REG_TYPE_NSD.
(parse_scalar): Support REG_TYPE_VFS.
(enum operand_parse_code): New enumerations OP_RNSD and OP_RNSD_RNSC.
(parse_operands): Handle OP_RNSD and OP_RNSD_RNSC.
(NEON_SHAPE_DEF): New entries for DHH and DHS.
(neon_scalar_for_fmac_fp16_long): New function to generate Rm encoding
for new FP16 instructions in ARMv8.2-A.
(do_neon_fmac_maybe_scalar_long): New function to encode new FP16
instructions in ARMv8.2-A.
(do_neon_vfmal): Wrapper function for vfmal.
(do_neon_vfmsl): Wrapper function for vfmsl.
(insns): New entries for vfmal and vfmsl.
* doc/c-arm.texi (-march): Document "armv8.4-a".
* testsuite/gas/arm/dotprod-mandatory.d: New test.
* testsuite/gas/arm/armv8_2-a-fp16.s: New test source.
* testsuite/gas/arm/armv8_2-a-fp16-illegal.s: New test source.
* testsuite/gas/arm/armv8_2-a-fp16.d: New test.
* testsuite/gas/arm/armv8_3-a-fp16.d: New test.
* testsuite/gas/arm/armv8_4-a-fp16.d: New test.
* testsuite/gas/arm/armv8_2-a-fp16-thumb2.d: New test.
* testsuite/gas/arm/armv8_2-a-fp16-illegal.d: New test.
* testsuite/gas/arm/armv8_2-a-fp16-illegal.l: New error file.
opcodes * arm-dis.c (coprocessor_opcodes): New entries for ARMv8.2-A new
FP16 instructions, including vfmal.f16 and vfmsl.f16.
include * opcode/arm.h (ARM_AEXT2_V8_4A): Include Dot Product feature.
(ARM_EXT2_V8_4A): New macro.
(ARM_AEXT2_V8_4A): Likewise.
(ARM_ARCH_V8_4A): Likewise.