In case of discarded sections, via /DISCARD/ or .gnu.linkonce,
relr relocation accounting was wrong. This broke building linux.
The issue was that the *_relocate_section logic was copied to
record_relr_non_got_relocs to find the relative relocs that can
be packed, however *_relocate_section is not called on sections
that are discarded, while record_relr_non_got_relocs is called
for all input sections. The fix is to filter out the discarded
sections with the same logic that is used to count non-GOT
relocs in *_late_size_sections for local symbols earlier.
Use the discarded_section helper in both cases to clarify the
intent and handle all corner-cases consistently.
GOT relocations are affected too if all sections are discarded
that reference the GOT entry of a particular symbol, however
this can cause unused GOT entries independently of DT_RELR, and
the only difference with DT_RELR is that a relative reloc may be
emitted instead of a R_AARCH64_NONE for the unused GOT entry
which is acceptable. A proper fix would require redoing the GOT
refcounting after we know the discarded sections, see bug 31850.
The logic to decide if an input relocation for a symbol becomes a
particular kind of output relocation is one of the hard to maintain
parts of the bfd ld backend, since it is partially repeated across
elfNN_aarch64_check_relocs (where dynamic relocations are counted per
symbol and input section),
elfNN_aarch64_late_size_sections (where relocation sections are sized
and GOT offsets assigned),
elfNN_aarch64_relocate_section (where most relocations are applied and
output to a relocation section),
elfNN_aarch64_finish_dynamic_symbol (where some of the GOT
relocations are applied and output).
The DT_RELR support adds another layer to this complexity: after the
output relocation sections are sized, so all dynamic relocations are
accounted (in elfNN_aarch64_late_size_sections), we scan the symbols
and input relocations again to decide which ones become relative
relocations that can be packed. The sizes of the relocation sections
are updated accordingly. This logic must be consistent with the code
that applies the relocs later so each relative relocation is emitted
exactly once: either in .rela.* or packed in .relr.dyn.
Sizing of .relr.dyn is done via elfNN_aarch64_size_relative_relocs that
may be called repeatedly whenever the layout changes since an address
change can affect the size of the packed format. Then the final content
is emitted in elfNN_aarch64_finish_relative_relocs, that is called
after the layout is final and before relocations are applied and
emitted. These hooks are only called if DT_RELR is enabled.
We only pack relative relocs that are known to be aligned in the output
during .relr.dyn sizing, the potentially unaligned relative relocs are
emitted normally (in .rela.*, not packed), because the format requires
aligned addresses.
This largely mechanical patch is preparation for a followup patch.
For quite some time I've thought that it would be useful to call
elf_backend_size_dynamic_sections even when no dynamic objects are
seen by the linker. That's what this patch does, with some renaming.
There are no functional changes to the linker, just a move of the
dynobj test in bfd_elf_size_dynamic_sections to target backend
functions, replacing the asserts/aborts already there. No doubt some
of the current always_size_sections functions could be moved to
size_dynamic_sections but I haven't made that change.
Because both hooks are now always called, I have renamed
always_size_sections to early_size_sections and size_dynamic_sections
to late_size_sections. I condisdered calling late_size_sections plain
size_sections, since this is the usual target dynamic section sizing
hook, but decided that searching the sources for "size_sections" would
then hit early_size_sections and other functions.
With the removal of symbian support, most targets no longer or never
did set is_relocatable_executable. Remove the backend support that is
no longer relevant.
* elf32-arm.c (record_arm_to_thumb_glue, elf32_arm_create_thumb_stub),
(elf32_arm_final_link_relocate, elf32_arm_check_relocs),
(elf32_arm_adjust_dynamic_symbol, allocate_dynrelocs_for_symbol),
(elf32_arm_output_arch_local_syms): Remove is_relocatable_executable
code and comments.
* elf32-csky.c (csky_elf_adjust_dynamic_symbol): Likewise.
* elfnn-aarch64.c (elfNN_aarch64_final_link_relocate): Likewise.
* elfnn-kvx.c (elfNN_kvx_final_link_relocate): Likewise.
* elfxx-mips.c (count_section_dynsyms): Likewise.
I didn't examine ld testsuite logs properly after cf95b909e2.
Replacing one of the "return false" with BFD_ASSERT in
finish_dynamic_symbol was wrong as it causes segmentation faults on
testcases expected to fail. Revert those changes and instead make
a bfd_final_link failure noisy.
Returning false from elf_backend_finish_dynamic_symbol will not result
in an error being printed unless bfd_error is set but will result in
the linker exiting with a non-zero status. If just bfd_error is set
then a generic "final link failed" will result, which doesn't help a
user much. So elf_backend_finish_dynamic_symbol should print its own
error message whenever returning false, or use BFD_ASSERT or abort to
print assertion failures for conditions that shouldn't occur.
This patch does that, and removes unnecessary "htab != NULL" tests in
elf_backend_finish_dynamic_symbol. Such tests aren't needed in a
function only called via elf_backend_data.
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
We decide to emit BTI stubs based on the instruction at the target
location. But PLT code is generated later than the stubs so we always
read 0 which is not a valid BTI.
Fix the logic to special case the PLT section: this is code the linker
generates so we know when it will have BTI.
This avoids BTI stubs in large executables where the PLTs have them
already. An alternative is to never emit BTI stubs for PLTs, instead
use BTI in the PLT if a library gets too big, however that may be
more tricky given the ordering of PLT sizing and stub insertion.
Related to bug 30957.
BTI stub parameters were recomputed even if those were already set up.
This is unnecessary work and leaks the symbol name that is allocated
for the stub.
Input sections are grouped together that can use the same stub area
(within reach) and these groups have a stable id.
Stubs have a name generated from the stub group id and target symbol.
When a relocation requires a stub with a name that already exists, the
stub is reused instead of adding a new one.
For an indirect branch stub another BTI stub may be inserted near the
target to provide a BTI landing pad.
The BTI stub can end up with the same stub group id and thus the same
name as the indirect stub. This happens if the target symbol is within
reach of the indirect branch stub. Then, due to the name collision,
only a single stub was emmitted which branched to itself causing an
infinite loop at runtime.
A possible solution is to just name the BTI stubs differently, but
since in the problematic case the indirect and BTI stub are in the
same stub area, a better solution is to emit a single stub with a
direct branch. The stub is still needed since the caller cannot reach
the target directly and we also want a BTI landing pad in the stub in
case other indirect stubs target the same symbol and thus need a BTI
stub.
In short we convert an indirect branch stub into a BTI stub when the
target is within reach and has no BTI. It is a hassle to change the
symbol of the stub so a BTI stub may end up with *_veneer instead of
*_bti_veneer after the conversion, but this should not matter much.
(Refactoring some of _bfd_aarch64_add_call_stub_entries would be
useful but too much for this bug fix patch.)
The same conversion to direct branch could be done even if the target
did not need a BTI. The stub groups are fixed in the current logic so
linking can fail if too many stubs are inserted and the section layout
is changed too much, but this only happens in extreme cases that can
be reasonably ignored. Because of this the target cannot go out of
reach during stub insertion so the optimization is valid, but not
implemented by this patch for the non-BTI case.
Fixes bug 30930.
The instruction was looked up in the wrong input file (file of branch
source instead of branch target) when optimizing away BTI stubs in
commit 5834f36d93
bfd: aarch64: Optimize BTI stubs PR30076
This can cause adding BTI stubs when they are not necessary or removing
them when they are (the latter is a correctness issue but it is very
unlikely in practice).
Fixes bug 30957.
doc/bfdint.texi and comments in the aout and som code about this
function are just wrong, and its name is not very apt. Better would
be _bfd_mostly_destroy, and we certainly should not be saying anything
about the possibility of later recreating anything lost by this
function. What's more, if _bfd_free_cached_info is called when
creating an archive map to reduce memory usage by throwing away
symbols, the target _close_and_cleanup function won't have access to
tdata or section bfd_user_data to tidy memory. This means most of the
target _close_and_cleanup function won't do anything, and therefore
sometimes will result in memory leaks.
This patch fixes the documentation problems and moves most of the
target _close_and_cleanup code to target _bfd_free_cached_info.
Another notable change is that bfd_generic_bfd_free_cached_info is now
defined as _bfd_free_cached_info rather than _bfd_bool_bfd_true,
ie. the default now frees objalloc memory.
bfd_free_cached_info is used in just one place in archive.c, which
means most times we reach bfd_close the function isn't called. On the
other hand, if bfd_free_cached_info is called we can't do much on the
bfd since it loses all its obj_alloc memory. This restricts what can
be done in a target _close_and_cleanup. In particular you can't look
at sections, which leads to duplication of code in target
close_and_cleanup and free_cached_info, eg. elfnn-aarch64.c.
* opncls.c (_bfd_delete_bfd): Call bfd_free_cached_info.
* elfnn-aarch64.c (elfNN_aarch64_close_and_cleanup): Delete.
(bfd_elfNN_close_and_cleanup): Don't define.
* som.c (som_bfd_free_cached_info): Don't call
_bfd_generic_close_and_cleanup here.
(som_close_and_cleanup): Define as _bfd_generic_close_and_cleanup.
normally RELA relocs in BFD should not consider the contents of the
relocated place. The aarch64 psABI is even stricter, it specifies
(section 5.7.16) that all RELA relocs _must_ be idempotent.
Since the inception of the aarch64 BFD backend all the relocs have a
non-zero src_mask, and hence break this invariant. It's normally not
a very visible problem as one can see it only when the relocated place
already contains a non-zero value, which usually only happens sometimes
when using 'ld -r' (or as in the testcase when jumping through hoops to
generate the relocations). Or with alternative toolchains that do encode
stuff in the relocated places with the assumption that a relocation
to that place ignores whatever is there (as they can according to
the psABI).
Golang is such a toolchain and https://github.com/golang/go/issues/39927
is ultimately caused by this problem: the testcase testGCData failing
is caused by the garbage collection data-structure to describe a type
containing pointers to be wrong. It's wrong because a field that's
supposed to contain a file-relative offset (to some gcbits) has a
relocation applied and that relocation has an addend which also is
already part of the go-produced object file (so the addend is
implicitely applied twice).
bfd/
PR ld/30437
* elfnn-aarch64.c (elfNN_aarch64_howto_table): Clear src_mask
if all relocation descriptors.
ld/
* testsuite/ld-aarch64/rela-idempotent.s: New testcase.
* testsuite/ld-aarch64/rela-idempotent.d: New.
* testsuite/ld-aarch64/aarch64-elf.exp: Run it.
While only a secondary issue there, the testcase of PR gas/27212 exposes
an oversight in relocation handling: Just like e.g. Arm32, which has a
similar comment and a similar check, relocations against STN_UNDEF have
to be permitted to satisfy the ELF spec.
Insert two stubs in a BTI enabled binary when fixing long calls: The
first is near the call site and uses an indirect jump like before,
but it targets the second stub that is near the call target site and
uses a direct jump.
This is needed when a single stub breaks BTI compatibility.
The stub layout is kept fixed between sizing and building the stubs,
so the location of the second stub is known at build time, this may
introduce padding between stubs when those are relaxed. Stub layout
with BTI disabled is unchanged.
We already use C99's __func__ in places, use it more generally. This
patch doesn't change uses in the testsuite. I've also left one in
gold.h that is protected by GCC_VERSION < 4003. If any of the
remaining uses bothers anyone I invite patches.
bfd/
* bfd-in.h: Replace __FUNCTION__ with __func__.
* elf32-bfin.c: Likewise.
* elfnn-aarch64.c: Likewise.
* elfxx-sparc.c: Likewise.
* bfd-in2.h: Regenerate.
gas/
* config/tc-cris.c: Replace __FUNCTION__ with __func__.
* config/tc-m68hc11.c: Likewise.
* config/tc-msp430.c: Likewise.
gold/
* dwp.h: Replace __FUNCTION__ with __func__.
* gold.h: Likewise, except for use inside GCC_VERSION < 4003.
ld/
* emultempl/pe.em: Replace __FUNCTION__ with __func__.
* emultempl/pep.em: Likewise.
* pe-dll.c: Likewise.
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
The warning about discarded sections in elf_link_input_bfd doesn't
belong there since the code is dealing with symbols. Multiple symbols
in a discarded section will result in multiple identical warnings
about the section. Move the warning to a new function in ldlang.c.
The patch also tidies the warning quoting of section and file names,
consistently using `%pA' and `%pB'. I'm no stickler for one style of
section and file name quoting, but they ought to be consistent within
a warning, eg. see the first one fixed in ldlang.c, and when a warning
is emitted for multiple targets they all ought to use exactly the same
format string to reduce translation work. elf64-ppc.c loses the
build_one_stub errors since we won't get there before hitting the
fatal errors in size_one_stub.
bfd/
* elflink.c (elf_link_input_bfd): Don't warn here about
discarded sections.
* elf32-arm.c (arm_build_one_stub): Use consistent style in
--enable-non-contiguous-regions error.
* elf32-csky.c (csky_build_one_stub): Likewise.
* elf32-hppa.c (hppa_build_one_stub): Likewise.
* elf32-m68hc11.c (m68hc11_elf_build_one_stub): Likewise.
* elf32-m68hc12.c (m68hc12_elf_build_one_stub): Likewise.
* elf32-metag.c (metag_build_one_stub): Likewise.
* elf32-nios2.c (nios2_build_one_stub): Likewise.
* elfnn-aarch64.c (aarch64_build_one_stub): Likewise.
* xcofflink.c (xcoff_build_one_stub): Likewise.
* elf64-ppc.c (ppc_size_one_stub): Likewise.
(ppc_build_one_stub): Delete dead code.
ld/
* ldlang.c (lang_add_section): Use consistent style in
--enable-non-contiguous-regions warnings.
(size_input_section): Likewise.
(warn_non_contiguous_discards): New function.
(lang_process): Call it.
* testsuite/ld-arm/non-contiguous-arm.d: Update.
* testsuite/ld-arm/non-contiguous-arm4.d: Update.
* testsuite/ld-arm/non-contiguous-arm7.d: Add
--enable-non-contiguous-regions-warnings.
* testsuite/ld-arm/non-contiguous-arm7.err: New.
* testsuite/ld-powerpc/non-contiguous-powerpc.d: Update.
* testsuite/ld-powerpc/non-contiguous-powerpc64.d: Update.
Always call elf_backend_output_arch_local_syms since only the backend
knows if elf_backend_output_arch_local_syms is needed when all symbols
are striped. elf_backend_output_arch_local_syms is defined only for
x86, ARM and AARCH64. On x86, elf_backend_output_arch_local_syms must
be called to handle local IFUNC symbols even if all symbols are striped.
Update ARM and AARCH64 to skip elf_backend_output_arch_local_syms when
symbols aren't needed.
bfd/
PR ld/29797
* elf32-arm.c (elf32_arm_output_arch_local_syms): Skip if symbols
aren't needed.
* elfnn-aarch64.c (elfNN_aarch64_output_arch_local_syms):
Likewise.
* elflink.c (bfd_elf_final_link): Always call
elf_backend_output_arch_local_syms if available.
ld/
PR ld/29797
* testsuite/ld-elf/linux-x86.exp: Run PR ld/29797 test.
* testsuite/ld-elf/pr29797.c: New file.
BFD_VMA_FMT can't be used in format strings that need to be
translated, because the translation won't work when the type of
bfd_vma differs from the machine used to compile .pot files. We've
known about this for a long time, but patches slip through review.
So just get rid of BFD_VMA_FMT, instead using the appropriate PRId64,
PRIu64, PRIx64 or PRIo64 and SCN variants for scanf. The patch is
mostly mechanical, the only thing requiring any thought is casts
needed to preserve PRId64 output from bfd_vma values, or to preserve
one of the unsigned output formats from bfd_signed_vma values.
In aarch64_tls_transition_without_check and elfNN_aarch64_tls_relax we
choose whether to perform a relaxation to an IE access model or an LE
access model based on whether the symbol itself is marked as local (i.e.
`h == NULL`).
This is problematic in two ways. The first is that sometimes a global
dynamic access can be relaxed to an initial exec access when creating a
shared library, and if that happens on a local symbol then we currently
relax it to a local exec access instead. This usually does not happen
since we only relax an access if aarch64_can_relax_tls returns true and
aarch64_can_relax_tls does not have the same problem. However, it can
happen when we have seen both an IE and GD access on the same symbol.
This case is exercised in the newly added testcase tls-relax-gd-ie-2.
The second problem is that deciding based on whether the symbol is local
misses the case when the symbol is global but is still non-interposable
and known to be located in the executable. This happens on all global
symbols in executables.
This case is exercised in the newly added testcase tls-relax-ie-le-4.
Here we adjust the condition we base our relaxation on so that we relax
to local-exec if we are creating an executable and the relevant symbol
we're accessing is stored inside that executable.
-- Updating tests for new relaxation criteria
Many of the tests added to check our relaxation to IE were implemented
by taking advantage of the fact that we did not relax a global symbol
defined in an executable.
Since a global symbol defined in an executable is still not
interposable, we know that a TLS version of such a symbol will be in the
main TLS block. This means that we can perform a stronger relaxation on
such symbols and relax their accesses to a local-exec access.
Hence we have to update all tests that relied on the older suboptimal
decision making.
The two cases when we still would want to relax a general dynamic access
to an initial exec one are:
1) When in a shared library and accessing a symbol which we have already
seen accessed with an initial exec access sequence.
2) When in an executable and accessing a symbol defined in a shared
library.
Both of these require shared library support, which means that these
tests are now only available on targets with that.
I have chosen to switch the existing testcases from a plain executable
to one dynamically linked to a shared object as that doesn't require
changing the testcases quite so much (just requires accessing a
different variable rather than requiring adding another code sequence).
The tls-relax-all testcase was an outlier to the above approach, since
it included a general dynamic access to both a local and global symbol
and inspected for the difference accordingly.
The Linux kernel can dump memory tag segments to a core file, one segment
per mapped range. The format and documentation can be found in the Linux
kernel tree [1].
The following patch adjusts bfd and binutils so they can handle this new
segment type and display it accordingly. It also adds code required so GDB
can properly read/dump core file data containing memory tags.
Upon reading, each segment that contains memory tags gets mapped to a
section named "memtag". These sections will be used by GDB to lookup the tag
data. There can be multiple such sections with the same name, and they are not
numbered to simplify GDB's handling and lookup.
There is another patch for GDB that enables both reading
and dumping of memory tag segments.
Tested on aarch64-linux Ubuntu 20.04.
[1] Documentation/arm64/memory-tagging-extension.rst (Core Dump Support)
__attribute__((visibility("protected"))) void *foo() {
return (void *)foo;
}
gcc -fpic -shared -fuse-ld=bfd fails with the confusing diagnostic:
relocation R_AARCH64_ADR_PREL_PG_HI21 against symbol `foo' which may bind externally can not be used when making a shared object; recompile with -fPIC
Call _bfd_elf_symbol_refs_local_p with local_protected==true to suppress
the error. The new behavior matches gold and ld.lld.
Note: if some code tries to use direct access relocations to take the
address of foo (likely due to -fno-pic), the pointer equality will
break, but the error should be reported on the executable link, not on
the innocent shared object link. glibc 2.36 will give a warning at
relocation resolving time.
Follow-up to commit 90b7a5df15
("aarch64: Disallow copy relocations on protected data").
Commit 32f573bcb3 changed ld to produce
R_AARCH64_GLOB_DAT but that defeated the purpose of protected visibility
as an optimization. Restore the previous behavior (which matches
ld.lld) by defining elf_backend_extern_protected_data to 0.
If an executable has copy relocations for extern protected data, that
can only work if the shared object containing the definition is built
with assumptions (a) the compiler emits GOT-generating relocations (b)
the linker produces R_*_GLOB_DAT instead of R_*_RELATIVE. Otherwise the
shared object uses its own definition directly and the executable
accesses a stale copy. Note: the GOT relocations defeat the purpose of
protected visibility as an optimization, and it turns out this never
worked perfectly.
glibc 2.36 will warn on copy relocations on protected data. Let's
produce a warning at link time, matching ld.lld which has been used on
many aarch64 OSes.
Note: x86 requires GNU_PROPERTY_NO_COPY_ON_PROTECTED to have the error.
This is to largely due to GCC 5's "x86-64: Optimize access to globals in
PIE with copy reloc" which started to use direct access relocations for
external data symbols in -fpie mode.
GCC's aarch64 port does not have the change. Nowadays with most builds
switching to -fpie/-fpic, aarch64 mostly doesn't need to worry about
copy relocations. So for aarch64 we simply don't check
GNU_PROPERTY_NO_COPY_ON_PROTECTED.
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
The idea of this patch is to make it easy to see which targets (just
sparc) have ELF_MINPAGESIZE != ELF_COMMONPAGESIZE.
* elf32-arm.c (ELF_MINPAGESIZE): Don't define.
* elf32-metag.c: Likewise.
* elfnn-aarch64.c: Likewise.
* elf64-x86-64.c: Likewise. Also don't redefine a bunch of other
macros for l1om elf64-target.h use that are unchanged from default.
We can't assume .dynamic is a multiple of ElfNN_External_Dyn, at least
not when presented with fuzzed object files.
* elfnn-aarch64.c (get_plt_type): Don't access past end of
improperly sized .dynamic.
bfd * elf.c (_bfd_elf_maybe_function_sym): Do not accept annobin
symbols as potential function symbols.
* elfnn-aarch64.c (elfNN_aarch64_maybe_function_sym): Likewise.
* elf64-ppc.c (ppc64_elf_maybe_function_sym): Likewise.
* elf32-arm.c (elf32_arm_maybe_function_sym): Likewise.
ld * testsuite/ld-elf/anno-sym.s: New test source file.
* testsuite/ld-elf/anno-sym.d: New test driver.
* testsuite/ld-elf/anno-sym.l: New test error output.
* elf32-arm.c (elf32_arm_print_private_bfd_data): Prefix hex value
of private flags with 0x.
* elfnn-aarch64.c (elfNN_aarch64_print_private_bfd_data): Likewise.
The fix in 7e05773767 introduced a PLT
for conditional jumps when the target symbol is undefined. This is
incorrect because conditional branch relocations are not allowed to
clobber IP0/IP1 and hence, should not result in a dynamic relocation.
Revert that change and in its place, issue an error when the target
symbol is undefined.
bfd/
2020-09-10 Siddhesh Poyarekar <siddesh.poyarekar@arm.com>
* elfnn-aarch64.c (elfNN_aarch64_final_link_relocate): Revert
changes in 7e05773767. Set
error for undefined symbol in BFD_RELOC_AARCH64_BRANCH19 and
BFD_RELOC_AARCH64_TSTBR14 relocations.
ld/
2020-09-10 Siddhesh Poyarekar <siddesh.poyarekar@arm.com>
* testsuite/ld-aarch64/emit-relocs-560.d: Expect error instead
of valid output.