Commit Graph

4 Commits

Author SHA1 Message Date
Tamar Christina
b4e87f2c1e Arm: Fix performance issue with thumb-2 tailcalls
We currently use a padding NOP after a Thumb to Arm interworking veneer (BX pc).
The NOP is never executed but may result in a performance penalty on some cores.

For this reason this patch changes the NOPs after Thumb to Arm veneers into B .-2
and adds a note to this in the source code for future reference.

bfd/ChangeLog:

	* elf32-arm.c (elf32_thumb2_plt_entry, elf32_arm_plt_thumb_stub,
	elf32_arm_stub_long_branch_v4t_thumb_thumb,
	elf32_arm_stub_long_branch_v4t_thumb_arm,
	elf32_arm_stub_short_branch_v4t_thumb_arm,
	elf32_arm_stub_long_branch_v4t_thumb_arm_pic,
	elf32_arm_stub_long_branch_v4t_thumb_thumb_pic,
	elf32_arm_stub_long_branch_v4t_thumb_tls_pic): Change nop to branch to
	previous instruction.

ld/ChangeLog:

	* testsuite/ld-arm/cortex-a8-fix-b-plt.d: Update Testcase.
	* testsuite/ld-arm/cortex-a8-fix-b-rel-arm.d: Likewise.
	* testsuite/ld-arm/cortex-a8-fix-bcc-plt.d: Likewise.
	* testsuite/ld-arm/farcall-cond-thumb-arm.d: Likewise.
	* testsuite/ld-arm/farcall-mixed-app.d: Likewise.
	* testsuite/ld-arm/farcall-mixed-app2.d: Likewise.
	* testsuite/ld-arm/farcall-mixed-lib-v4t.d: Likewise.
	* testsuite/ld-arm/farcall-thumb-arm-pic-veneer.d: Likewise.
	* testsuite/ld-arm/farcall-thumb-arm-short.d: Likewise.
	* testsuite/ld-arm/farcall-thumb-arm.d: Likewise.
	* testsuite/ld-arm/farcall-thumb-thumb-pic-veneer.d: Likewise.
	* testsuite/ld-arm/farcall-thumb-thumb.d: Likewise.
	* testsuite/ld-arm/fix-arm1176-on.d: Likewise.
	* testsuite/ld-arm/ifunc-10.dd: Likewise.
	* testsuite/ld-arm/ifunc-2.dd: Likewise.
	* testsuite/ld-arm/ifunc-4.dd: Likewise.
	* testsuite/ld-arm/ifunc-6.dd: Likewise.
	* testsuite/ld-arm/ifunc-8.dd: Likewise.
	* testsuite/ld-arm/jump-reloc-veneers-long.d: Likewise.
	* testsuite/ld-arm/mixed-app.d: Likewise.
	* testsuite/ld-arm/thumb2-b-interwork.d: Likewise.
	* testsuite/ld-arm/tls-longplt.d: Likewise.
	* testsuite/ld-arm/tls-thumb1.d: Likewise.
2019-08-20 16:35:28 +01:00
Julian Brown
a415b1cd63 Jie Zhang <jie@codesourcery.com>
Julian Brown  <julian@codesourcery.com>

    gas/
    * config/tc-arm.c (parse_shifter_operand): Fix handling
    of explicit rotation.
    (encode_arm_shifter_operand): Likewise.

    gas/testsuite/
    * gas/arm/adrl.d: Adjust.
    * gas/arm/immed2.d: New test.
    * gas/arm/immed2.s: New test.

    ld/testsuite/
    * ld-arm/cortex-a8-fix-b-plt.d: Adjust.
    * ld-arm/cortex-a8-fix-bcc-plt.d: Adjust.
    * ld-arm/cortex-a8-fix-bl-plt.d: Adjust.
    * ld-arm/cortex-a8-fix-bl-rel-plt.d: Adjust.
    * ld-arm/cortex-a8-fix-blx-plt.d: Adjust.
    * ld-arm/ifunc-1.dd: Adjust.
    * ld-arm/ifunc-2.dd: Adjust.
    * ld-arm/ifunc-3.dd: Adjust.
    * ld-arm/ifunc-4.dd: Adjust.
    * ld-arm/ifunc-5.dd: Adjust.
    * ld-arm/ifunc-6.dd: Adjust.
    * ld-arm/ifunc-7.dd: Adjust.
    * ld-arm/ifunc-8.dd: Adjust.
    * ld-arm/ifunc-9.dd: Adjust.
    * ld-arm/ifunc-10.dd: Adjust.
    * ld-arm/ifunc-14.dd: Adjust.
    * ld-arm/ifunc-15.dd: Adjust.
    * ld-arm/ifunc-16.dd: Adjust.

    opcodes/
    * arm-dis.c (print_insn_arm): Explicitly specify rotation
    if needed.
2011-10-18 14:41:55 +00:00
Nathan Sidwell
26d97720ed gas/
* config/tc-arm.c (parse_address_main): Handle -0 offsets.
	(encode_arm_addr_mode_2): Set default sign of zero here ...
	(encode_arm_addr_mode_3): ... and here.
	(encode_arm_cp_address): ... and here.
	(md_apply_fix): Use default sign of zero here.

	gas/testsuite/
	* gas/arm/inst.d: Adjust for signed zero offsets.
	* gas/arm/ldst-offset0.d: New test.
	* gas/arm/ldst-offset0.s: New test.
	* gas/arm/offset-1.d: New test.
	* gas/arm/offset-1.s: New test.

	ld/testsuite/
	Adjust tests for zero offset formatting.
	* ld-arm/cortex-a8-fix-bcc-plt.d: Adjust.
	* ld-arm/farcall-arm-arm-pic-veneer.d: Adjust.
	* ld-arm/farcall-arm-thumb.d: Adjust.
	* ld-arm/farcall-group-size2.d: Adjust.
	* ld-arm/farcall-group.d: Adjust.
	* ld-arm/farcall-mix.d: Adjust.
	* ld-arm/farcall-mix2.d: Adjust.
	* ld-arm/farcall-mixed-lib-v4t.d: Adjust.
	* ld-arm/farcall-mixed-lib.d: Adjust.
	* ld-arm/farcall-thumb-arm-blx-pic-veneer.d: Adjust.
	* ld-arm/farcall-thumb-arm-pic-veneer.d: Adjust.
	* ld-arm/farcall-thumb-thumb.d: Adjust.
	* ld-arm/ifunc-10.dd: Adjust.
	* ld-arm/ifunc-3.dd: Adjust.
	* ld-arm/ifunc-4.dd: Adjust.
	* ld-arm/ifunc-5.dd: Adjust.
	* ld-arm/ifunc-6.dd: Adjust.
	* ld-arm/ifunc-7.dd: Adjust.
	* ld-arm/ifunc-8.dd: Adjust.
	* ld-arm/jump-reloc-veneers-long.d: Adjust.
	* ld-arm/tls-longplt-lib.d: Adjust.
	* ld-arm/tls-thumb1.d: Adjust.

	opcodes/
	* arm-dis.c (print_insn_coprocessor): Explicitly print #-0
	as address offset.
	(print_arm_address): Likewise. Elide positive #0 appropriately.
	(print_insn_arm): Likewise.
2011-06-02 15:32:10 +00:00
Richard Sandiford
34e77a920a include/elf/
* arm.h (R_ARM_IRELATIVE): New relocation.

bfd/
	* reloc.c (BFD_RELOC_ARM_IRELATIVE): New relocation.
	* bfd-in2.h: Regenerate.
	* elf32-arm.c (elf32_arm_howto_table_2): Rename existing definition
	to elf32_arm_howto_table_3 and replace with a single R_ARM_IRELATIVE
	entry.
	(elf32_arm_howto_from_type): Update accordingly.
	(elf32_arm_reloc_map): Map BFD_RELOC_ARM_IRELATIVE to R_ARM_IRELATIVE.
	(elf32_arm_reloc_name_lookup): Handle elf32_arm_howto_table_3.
	(arm_plt_info): New structure, split out from elf32_arm_link_hash_entry
	with an extra noncall_refcount field.
	(arm_local_iplt_info): New structure.
	(elf_arm_obj_tdata): Add local_iplt.
	(elf32_arm_local_iplt): New accessor macro.
	(elf32_arm_link_hash_entry): Replace plt_thumb_refcount,
	plt_maybe_thumb_refcount and plt_got_offset with an arm_plt_info.
	Change tls_type to a bitfield and add is_iplt.
	(elf32_arm_link_hash_newfunc): Update accordingly.
	(elf32_arm_allocate_local_sym_info): New function.
	(elf32_arm_create_local_iplt): Likewise.
	(elf32_arm_get_plt_info): Likewise.
	(elf32_arm_plt_needs_thumb_stub_p): Likewise.
	(elf32_arm_get_local_dynreloc_list): Likewise.
	(create_ifunc_sections): Likewise.
	(elf32_arm_copy_indirect_symbol): Update after the changes to
	elf32_arm_link_hash_entry.  Assert the is_iplt has not yet been set.
	(arm_type_of_stub): Add an st_type argument.  Use elf32_arm_get_plt_info
	to get PLT information.  Assert that all STT_GNU_IFUNC references
	are turned into PLT references.
	(arm_build_one_stub): Pass the symbol type to
	elf32_arm_final_link_relocate.
	(elf32_arm_size_stubs): Pass the symbol type to arm_type_of_stub.
	(elf32_arm_allocate_irelocs): New function.
	(elf32_arm_add_dynreloc): In static objects, use .rel.iplt for
	all R_ARM_IRELATIVE.
	(elf32_arm_allocate_plt_entry): New function.
	(elf32_arm_populate_plt_entry): Likewise.
	(elf32_arm_final_link_relocate): Add an st_type parameter.
	Set srelgot to null for static objects.  Use separate variables
	to record which st_value and st_type should be used when generating
	a dynamic relocation.  Use elf32_arm_get_plt_info to find the
	symbol's PLT information, setting has_iplt_entry, splt,
	plt_offset and gotplt_offset accordingly.  Check whether
	STT_GNU_IFUNC symbols should resolve to an .iplt entry, and change
	the relocation target accordingly.  Broaden assert to include
	.iplts.  Don't set sreloc for static relocations.  Assert that
	we only generate dynamic R_ARM_RELATIVE relocations for R_ARM_ABS32
	and R_ARM_ABS32_NOI.  Generate R_ARM_IRELATIVE relocations instead
	of R_ARM_RELATIVE relocations if the target is an STT_GNU_IFUNC
	symbol.  Pass the symbol type to arm_type_of_stub.  Conditionally
	resolve GOT references to the .igot.plt entry.
	(elf32_arm_relocate_section): Update the call to
	elf32_arm_final_link_relocate.
	(elf32_arm_gc_sweep_hook): Use elf32_arm_get_plt_info to get PLT
	information.  Treat R_ARM_REL32 and R_ARM_REL32_NOI as call
	relocations in shared libraries and relocatable executables.
	Count non-call PLT references.  Use elf32_arm_get_local_dynreloc_list
	to get the list of dynamic relocations for a local symbol.
	(elf32_arm_check_relocs): Always create ifunc sections.  Set isym
	at the same time as setting h.  Use elf32_arm_allocate_local_sym_info
	to allocate local symbol information.  Treat R_ARM_REL32 and
	R_ARM_REL32_NOI as call relocations in shared libraries and
	relocatable executables.  Record PLT information for local
	STT_GNU_IFUNC functions as well as global functions.   Count
	non-call PLT references.  Use elf32_arm_get_local_dynreloc_list
	to get the list of dynamic relocations for a local symbol.
	(elf32_arm_adjust_dynamic_symbol): Handle STT_GNU_IFUNC symbols.
	Don't remove STT_GNU_IFUNC PLTs unless all references have been
	removed.  Update after the changes to elf32_arm_link_hash_entry.
	(allocate_dynrelocs_for_symbol): Decide whether STT_GNU_IFUNC PLT
	entries should live in .plt or .iplt.  Check whether the .igot.plt
	and .got entries can be combined.  Use elf32_arm_allocate_plt_entry
	to allocate .plt and .(i)got.plt entries.  Detect which .got
	entries will need R_ARM_IRELATIVE relocations and use
	elf32_arm_allocate_irelocs to allocate them.  Likewise other
	non-.got dynamic relocations.
	(elf32_arm_size_dynamic_sections): Allocate .iplt, .igot.plt
	and dynamic relocations for local STT_GNU_IFUNC symbols.
	Check whether the .igot.plt and .got entries can be combined.
	Detect which .got entries will need R_ARM_IRELATIVE relocations
	and use elf32_arm_allocate_irelocs to allocate them.  Use stashed
	section pointers intead of strcmp checks.  Handle iplt and igotplt.
	(elf32_arm_finish_dynamic_symbol): Use elf32_arm_populate_plt_entry
	to fill in .plt, .got.plt and .rel(a).plt entries.  Point
	STT_GNU_IFUNC symbols at an .iplt entry if non-call relocations
	resolve to it.
	(elf32_arm_output_plt_map_1): New function, split out from
	elf32_arm_output_plt_map.  Handle .iplt entries.  Use
	elf32_arm_plt_needs_thumb_stub_p.
	(elf32_arm_output_plt_map): Call it.
	(elf32_arm_output_arch_local_syms): Add mapping symbols for
	local .iplt entries.
	(elf32_arm_swap_symbol_in): Handle Thumb STT_GNU_IFUNC symbols.
	(elf32_arm_swap_symbol_out): Likewise.
	(elf32_arm_add_symbol_hook): New function.
	(elf_backend_add_symbol_hook): Define for all targets.

opcodes/
	* arm-dis.c (get_sym_code_type): Treat STT_GNU_IFUNCs as code.

gas/
	* config/tc-arm.c (md_pcrel_from_section): Use S_FORCE_RELOC to
	determine whether a relocation is needed.
	(md_apply_fix, arm_apply_sym_value): Likewise.

ld/testsuite/
	* ld-arm/ifunc-1.s, ld-arm/ifunc-1.dd, ld-arm/ifunc-1.gd,
	ld-arm/ifunc-1.rd, ld-arm/ifunc-2.s, ld-arm/ifunc-2.dd,
	ld-arm/ifunc-2.gd, ld-arm/ifunc-2.rd, ld-arm/ifunc-3.s,
	ld-arm/ifunc-3.dd, ld-arm/ifunc-3.gd, ld-arm/ifunc-3.rd,
	ld-arm/ifunc-4.s, ld-arm/ifunc-4.dd, ld-arm/ifunc-4.gd,
	ld-arm/ifunc-4.rd, ld-arm/ifunc-5.s, ld-arm/ifunc-5.dd,
	ld-arm/ifunc-5.gd, ld-arm/ifunc-5.rd, ld-arm/ifunc-6.s,
	ld-arm/ifunc-6.dd, ld-arm/ifunc-6.gd, ld-arm/ifunc-6.rd,
	ld-arm/ifunc-7.s, ld-arm/ifunc-7.dd, ld-arm/ifunc-7.gd,
	ld-arm/ifunc-7.rd, ld-arm/ifunc-8.s, ld-arm/ifunc-8.dd,
	ld-arm/ifunc-8.gd, ld-arm/ifunc-8.rd, ld-arm/ifunc-9.s,
	ld-arm/ifunc-9.dd, ld-arm/ifunc-9.gd, ld-arm/ifunc-9.rd,
	ld-arm/ifunc-10.s, ld-arm/ifunc-10.dd, ld-arm/ifunc-10.gd,
	ld-arm/ifunc-10.rd, ld-arm/ifunc-11.s, ld-arm/ifunc-11.dd,
	ld-arm/ifunc-11.gd, ld-arm/ifunc-11.rd, ld-arm/ifunc-12.s,
	ld-arm/ifunc-12.dd, ld-arm/ifunc-12.gd, ld-arm/ifunc-12.rd,
	ld-arm/ifunc-13.s, ld-arm/ifunc-13.dd, ld-arm/ifunc-13.gd,
	ld-arm/ifunc-13.rd, ld-arm/ifunc-14.s, ld-arm/ifunc-14.dd,
	ld-arm/ifunc-14.gd, ld-arm/ifunc-14.rd, ld-arm/ifunc-15.s,
	ld-arm/ifunc-15.dd, ld-arm/ifunc-15.gd, ld-arm/ifunc-15.rd,
	ld-arm/ifunc-16.s, ld-arm/ifunc-16.dd, ld-arm/ifunc-16.gd,
	ld-arm/ifunc-16.rd, ld-arm/ifunc-dynamic.ld,
	ld-arm/ifunc-static.ld: New tests.
	* ld-arm/farcall-group.d, ld-arm/farcall-group-size2.d,
	ld-arm/farcall-mixed-lib-v4t.d, ld-arm/farcall-mixed-lib.d: Update
	for new stub hashes.
	* ld-arm/arm-elf.exp: Run them.
2011-03-14 16:04:16 +00:00