Don't use address where symbol gets resolved, as during section
relaxation symbols will slide, instead canonicalize symbols and check
that they are are the same.
This fixes a bug when a relaxed jump goes into the wrong trampoline.
gas/
2017-12-07 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (xg_order_trampoline_chain): Replace
xg_order_trampoline_chain_entry call with check for
canonicalized symbol equality and offset equality.
While we shouldn't outright reject such (as was wrongly done by commit
4d36230d59 ("x86: Update segment register check in Intel syntax"), as
MASM accepts them even silently, issue (by default) a warning for such
questionable constructs.
gas/
* config/tc-riscv.c (riscv_frag_align_code): New local insn_alignment.
Early return if bytes less than or equal to insn_alignment.
* testsuite/gas/riscv/align-1.l: New.
* testsuite/gas/riscv/align-1.s: New.
* testsuite/gas/riscv/riscv.exp: Use run_dump_tests. Use run_list_test
for align-1.
This should be an obvious fix.
It corrects the register number for IP1 to 17.
gas/
2017-11-29 Renlin Li <renlin.li@arm.com>
* config/tc-aarch64.c (reg_names): Fix IP1 register alias error.
* testsuite/gas/aarch64/register_aliases.s: Add IP0 and IP1 tests.
* testsuite/gas/aarch64/register_aliases.d: Update.
bfd/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
binutils/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
gas/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
gold/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
gprof/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
ld/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
opcodes/
* po/Make-in (datadir): Define as @datadir@.
(localedir): Define as @localedir@.
(gnulocaledir, gettextsrcdir): Use @datarootdir@.
find_trampoline_seg takes noticeable time when assembling source with
many sections. Cache the result of the most recent search and check it
first. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (find_trampoline_seg): Add static variable
that caches the result of the most recent search.
There is a recurring pattern in assembly files generated by a compiler
where a lot of jumps in a function are going to the same place. When
these jumps are relaxed with trampolines the assembler generates a
separate jump thread from each source.
Create an index of trampoline jump targets for each segment and see if a
jump being relaxed goes to a location from that index, in which case
replace its target with a location of existing trampoline jump that
results in the shortest path to the original target.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (trampoline_chain_entry, trampoline_chain)
(trampoline_chain_index): New structures.
(trampoline_index): Add chain_index field.
(xg_order_trampoline_chain_entry, xg_sort_trampoline_chain)
(xg_find_chain_entry, xg_get_best_chain_entry)
(xg_order_trampoline_chain, xg_get_trampoline_chain)
(xg_find_best_eq_target, xg_add_location_to_chain)
(xg_create_trampoline_chain, xg_get_single_symbol_slot): New
functions.
(xg_relax_fixups): Call xg_find_best_eq_target to adjust jump
target to point to an existing jump. Call
xg_create_trampoline_chain to create new jump target. Call
xg_add_location_to_chain to add newly created trampoline jump
to the corresponding chain.
(add_jump_to_trampoline): Extract loop searching for a single
slot with a symbol into a separate function, replace that code
with a call to that function.
(relax_frag_immed): Call xg_find_best_eq_target to adjust jump
target to point to an existing jump.
* testsuite/gas/xtensa/all.exp: Add trampoline-2 test.
* testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
as many duplicate trampoline chains are now coalesced.
* testsuite/gas/xtensa/trampoline.s: Add _nop so that objdump
stays in sync with instruction stream.
* testsuite/gas/xtensa/trampoline-2.l: New test result file.
* testsuite/gas/xtensa/trampoline-2.s: New test source file.
There's almost exact copy of the trampoline placement code in the
search_trampolines function that is used for jumps generated for relaxed
branch instructions. Get rid of the duplication and reuse
xg_find_best_trampoline function for that.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (search_trampolines, get_best_trampoline):
Remove definitions.
(xg_find_best_trampoline_for_tinsn): New function.
(relax_frag_immed): Replace call to get_best_trampoline with a
call to xg_find_best_trampoline_for_tinsn.
* testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
as the placement of trampolines for relaxed branches has been
changed.
Replace linked list of trampoline frags with an ordered array, so that
instead of indexing fixups trampolines could be indexed. Keep each array
in the trampoline_seg structure, so there's no need to rebuild it for
every new processed segment. Don't run relaxation for each trampoline
frag, instead run it for each fixup in the current segment that needs
relaxation at the beginning of each relaxation pass. This way the
complexity of this process drops from about O(n^2 * m) to about
O(log n * m), where n is the number of trampoline frags and m is the
number of fixups that need relaxation in the segment.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (trampoline_index): New structure.
(trampoline_seg): Replace trampoline list with trampoline index.
(xg_find_trampoline, xg_add_trampoline_to_index)
(xg_remove_trampoline_from_index, xg_add_trampoline_to_seg)
(xg_is_trampoline_frag_full, xg_get_fulcrum)
(xg_find_best_trampoline, xg_relax_fixup, xg_relax_fixups)
(xg_is_relaxable_fixup): New functions.
(J_MARGIN): New macro.
(xtensa_create_trampoline_frag): Use xg_add_trampoline_to_seg
instead of open-coded addition to the linked list.
(dump_trampolines): Iterate through the trampoline_seg::index.
(cached_fixupS, cached_fixup, fixup_cacheS, fixup_cache)
(fixup_order, xtensa_make_cached_fixup)
(xtensa_realloc_fixup_cache, xtensa_cache_relaxable_fixups)
(xtensa_find_first_cached_fixup, xtensa_delete_cached_fixup)
(xtensa_add_cached_fixup, check_and_update_trampolines): Remove
definitions.
(xg_relax_trampoline): Extract logic into separate functions,
replace body with a call to xg_relax_fixups.
(search_trampolines): Replace search in linked list with search
in index. Change data type of address-tracking variables from
int to offsetT. Replace abs with labs.
(xg_append_jump): Finish the trampoline frag if it's full.
(add_jump_to_trampoline): Remove trampoline frag from the index
if the frag is full.
* config/tc-xtensa.h (xtensa_frag_type): Remove next_trampoline.
* testsuite/gas/xtensa/trampoline.d: Adjust absolute addresses
as the placement of trampolines has slightly changed.
* testsuite/gas/xtensa/trampoline.s: Add _nop so that objdump
stays in sync with instruction stream.
The split between fragS and trampoline_frag doesn't save much space, but
makes trampolines management much more awkward. Merge trampoline_frag
data into the xtensa_frag_type, which is a part of fragS. No functional
changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (init_trampoline_frag): Replace pointer to
struct trampoline_frag parameter with pointer to fragS.
(xg_append_jump): Remove jump_around parameter.
(struct trampoline_frag): Remove.
(struct trampoline_seg): Change type of trampoline_list from
struct trampoline_frag to fragS.
(xtensa_create_trampoline_frag): Don't allocate struct
trampoline_frag. Initialize new fragS::tc_frag_data fields.
(dump_trampolines, xg_relax_trampoline, search_trampolines)
(get_best_trampoline, init_trampoline_frag)
(add_jump_to_trampoline, relax_frag_immed): Replace pointer to
struct trampoline_frag with a pointer to fragS.
(xg_append_jump): Remove jump_around parameter, use
fragS::tc_frag_data.jump_around_fix instead.
(xg_relax_trampoline, init_trampoline_frag)
(add_jump_to_trampoline): Don't pass jump_around parameter to
xg_append_jump.
* config/tc-xtensa.h (struct xtensa_frag_type): Add new fields:
needs_jump_around, next_trampoline and jump_around_fix.
xtensa_create_trampoline_frag has opencoded fragment equivalent to
find_trampoline_seg. Drop the fragment and use find_trampoline_seg
instead. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (find_trampoline_seg): Move above the first
use.
(xtensa_create_trampoline_frag): Replace trampoline seg search
code with a call to find_trampoline_seg.
init_trampoline_frag, add_jump_to_trampoline and xg_relax_trampoline add
a jump to the end of a trampoline frag. Extract it into a separate
funciton and use it in all these places. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (xg_append_jump): New function.
(xg_relax_trampoline, init_trampoline_frag)
(add_jump_to_trampoline): Replace trampoline jump assembling
code with a call to xg_append_jump.
To make measurement and changes easier extract trampoline relaxation
function. No functional changes.
gas/
2017-11-27 Max Filippov <jcmvbkbc@gmail.com>
* config/tc-xtensa.c (xg_relax_trampoline): New function.
(xtensa_relax_frag): Replace trampoline relaxation code with a
call to xg_relax_trampoline.
For one the register type used for masking should be validated. And then
we shouldn't accept input producing encodings which will #UD when
executed, as is the case when EVEX.Z is set while EVEX.AAA is zero.
Despite EVEX encodings not being available in real and VM86 modes,
16-bit addressing still needs to be handled properly for 16-bit
protected mode as well as 16-bit addressing in 32-bit mode. Neither
should displacements be dropped silently by the assembler, nor should
the disassembler fail to correctly scale 8-bit displacements.
Except for %eip-relative addressing, where we don't have a suitable
relocation type silently wrapping at the 4G boundary, consistently
force use of R_X86_64_32 (in ELF terms) instead of its sign-extending
counterpart. This wasn't right in case there was no base register in
the addressing expression.
Make the assembler recognize UD0, supporting only the newer form
expecting a ModR/M byte.
Make assembler and disassembler properly emit / expect a ModR/M byte for
UD1.
For the testsuite, as arch-4 already tests all UDn, avoid producing a
huge delta for other tests using UD2B by making them use UD2 instead.
Multiple errors are more confusing than helpful, as the more generic
one often implies a sufficiently different adjustment than would
actually be needed to fix the code. Additionally it makes it more
cumbersome to add missing error checks, as the testsuite then needs
extra updating.
* as.c: Include write.h.
(common_emul_init): Use FAKE_LABEL_NAME.
* ecoff.c (add_file, ecoff_directive_end, ecoff_directive_loc):
Likewise.
(ecoff_build_symbols): Use FAKE_LABEL_CHAR.
* expr.c (get_symbol_name): Use FAKE_LABEL_CHAR. Accept only if
input_from_string is TRUE.
* read.c (input_from_string): New.
(read_symbol_name): Use FAKE_LABEL_CHAR. Accept only if
input_from_string is TRUE.
(temp_ilp): Set input_from_string to TRUE.
(restore_ilp): Set input_from_string to FALSE.
* read.h (input_from_string): Declare.
* symbols.c: Include write.h
(S_IS_LOCAL): Check for FAKE_LABEL_CHAR.
(symbol_relc_make_sym): Fix comment refering to default fake label
string.
* write.h (FAKE_LABEL_CHAR): New.
* config/tc-riscv.h (FAKE_LABEL_CHAR): Define.
* testsuite/gas/all/err-fakelabel.s: New.
Uses of reg_expected_msgs rely on each arm_reg_type enumerator to have a
message entry in the same order as the enumerator declaration. This is
not clearly stated in the definition of both the arm_reg_type enum and
the reg_expected_msgs. Worse, there is nothing to ensure both are kept
in sync.
As an attempt towards this, this patch uses C99 array designators to
ensure that each message is associated with the right arm_reg_type. A
comment is also added near the definition of arm_reg_type to point to
the reg_expected_msgs array. Finally, the array is synced with
arm_reg_type by adding the missing error message for REG_TYPE_RNB.
2017-11-22 Thomas Preud'homme <thomas.preudhomme@arm.com>
gas/
* config/tc-arm.c (arm_reg_type): Comment on the link with
reg_expected_msgs.
(reg_expected_msgs): Initialize using array designators with
arm_reg_type index.
The -n command-line of x86 assembler disables optimization of alignment
directives, like ".balign 8, 0x90", with multi-byte nop instructions
such as leal 0(%esi),%esi.
PR gas/22464
* testsuite/gas/i386/align-1.s: New file.
* testsuite/gas/i386/align-1a.d: Likewise.
* testsuite/gas/i386/align-1b.d: Likewise.
* testsuite/gas/i386/i386.exp: Run align-1a and align-1b.
This patch separates the new FP16 instructions backported from Armv8.4-a to Armv8.2-a
into a new flag order to distinguish them from the rest of the already existing optional
FP16 instructions in Armv8.2-a.
The new flag "+fp16fml" is available from Armv8.2-a and implies +fp16 and is mandatory on
Armv8.4-a.
gas/
* config/tc-aarch64.c (fp16fml): New.
* doc/c-aarch64.texi (fp16fml): New.
* testsuite/gas/aarch64/armv8_2-a-crypto-fp16.d (fp16): Make fp16fml.
* testsuite/gas/aarch64/armv8_3-a-crypto-fp16.d (fp16): Make fp16fml.
include/
* opcode/aarch64.h: (AARCH64_FEATURE_F16_FML): New.
(AARCH64_ARCH_V8_4): Enable AARCH64_FEATURE_F16_FML by default.
opcodes/
* aarch64-tbl.h (aarch64_feature_fp_16_v8_2): Require AARCH64_FEATURE_F16_FML
and AARCH64_FEATURE_F16.
The crypto options depend on SIMD and FP, the documentation states so but the dependency is not there the code.
We have mostly gotten away with this due to the default flags
for the architectures (e.g. Armv8.2-a implies +simd) but this
discrepancy needs to be addressed.
gas/
2017-11-16 Tamar Christina <tamar.christina@arm.com>
* opcodes/aarch64-tbl.h
(aarch64_feature_crypto): Add ARCH64_FEATURE_SIMD and AARCH64_FEATURE_FP.
(aarch64_feature_crypto_v8_2, aarch64_feature_sm4): Likewise.
(aarch64_feature_sha3): Likewise.
While commits 9889cbb14e ("Check invalid mask registers") and
abfcb414b9 ("X86: Ignore REX_B bit for 32-bit XOP instructions") went a
bit into the right direction, this wasn't quite enough:
- VEX.vvvv has its high bit ignored
- EVEX.vvvv has its high bit ignored together with EVEX.v'
- the high bits of {,E}VEX.vvvv should not be prematurely zapped, to
allow proper checking of them when the fields has to hold al ones
- when the high bits of an immediate specify a register, bit 7 is
ignored