This patch adds readelf support for decoding the exception table
opcode for restoring the RA_AUTH_CODE pseudo register defined by the
EHABI
(https://github.com/ARM-software/abi-aa/releases/download/2021Q1/ehabi32.pdf
Section 10.3).
* readelf.c (decode_arm_unwind_bytecode): Add support to decode
restoring RA_AUTH_CODE pseudo register.
The --ctf-parent option has been present since the very early days of the
development of libctf as part of binutils, and it shows. Back in the
earliest days, I thought we might handle ambiguous types by introducing
new ELF sections on the fly named things like .ctf.foo.c for ambiguous
types found only in foo.c, etc. This turned out to be a terrible idea,
so we moved to using a CTF archive in the .ctf section which contained
all the CTF dictionaries -- but the --ctf-parent option in objdump and
readelf was never adjusted, and lingered as a mechanism to specify CTF
parent dictionaries in sections other than .ctf, even though the linker
has no way to produce parent dictionaries in different sections from
their children, libctf's ctf_open can't handle such split-up
parent/child dicts, and they are never found in the wild, emitted by GNU
ld or by any known third-party linking tool.
Meanwhile, the actually-useful ctf_link feature (albeit not used by ld)
which lets you remap the names of CTF archive members (so you can end up
with a parent archive member named something other than ".ctf", still
contained with all its children in a single .ctf section) had no support
in objdump or readelf: there was no way to tell them that these members
were parents, so all the types in the associated child dicts always
appeared corrupted, referencing nonexistent types from a parent objdump
couldn't find.
So adjust --ctf-parent so that rather than taking a section name it
takes a member name instead (if not specified, the name is ".ctf", which
is what GNU ld emits). Because the option was always useless before
now, this is expected to have no backward-compatibility implications.
As part of this, we have to slightly adjust the code which skips the
archive member name if redundant: right now it skips it if it's ".ctf",
on the assumption that this name will almost always be at the start
of the objdump output and thus we'll end up with a shared dump
and then smaller, headed dumps for the per-TU child dicts; but if
the parent name has been changed, that won't be true any more.
So change the rules to "members named .ctf which appear first in the
archive have their member name skipped". Since we now need to count
members, move from ctf_archive_iter (for which passing in extra
parameters requires defining a new struct and is clumsy) to
ctf_archive_next, allowing us to just *call* dump_ctf_archive_member and
maintain a member count in the obvious way. In the process we fix a
tiny difference between readelf and objdump: if a ctf_dump ever failed,
readelf skipped every later member, while objdump tried to keep going as
much as it could. For a dumping tool the latter is clearly preferable.
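The new header-skipping rule can be sketched in isolation. This is a minimal illustration, not the binutils code: skip_member_header and count_printed_headers are hypothetical names, and the member names are passed as a plain array rather than coming from ctf_archive_next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the real binutils code: decide whether the
   header line for an archive member should be omitted.  Only a member
   named ".ctf" that is the first one in the archive is skipped.  */
static bool
skip_member_header (const char *name, size_t member)
{
  assert (name != NULL);
  return member == 0 && strcmp (name, ".ctf") == 0;
}

/* Count how many headers would be printed for an array of member
   names, keeping the member count in a plain local variable as the
   iteration proceeds.  */
static size_t
count_printed_headers (const char *const *names, size_t n)
{
  size_t printed = 0;

  for (size_t member = 0; member < n; member++)
    if (!skip_member_header (names[member], member))
      printed++;
  return printed;
}
```

Because the count lives in the caller's loop, no extra callback structure is needed to carry it, which is the convenience the move away from ctf_archive_iter is after.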
binutils/ChangeLog
2021-10-25 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (usage): --ctf-parent now takes a name, not a section.
(dump_ctf): Don't open a separate section; use the parent_name in
ctf_dict_open instead. Use ctf_archive_next, not ctf_archive_iter,
so we can pass down a member count.
(dump_ctf_archive_member): Add the member count; don't return
anything. Import parents into children no matter what the
parent's name, while still avoiding displaying the header for the
common parent name of ".ctf".
* readelf.c (usage): Adjust similarly.
(dump_section_as_ctf): Likewise.
(dump_ctf_archive_member): Likewise. Never stop iterating over
archive members, even if ctf_dump of one member fails.
* doc/ctf.options.texi: Adjust.
Mainline gcc:
readelf.c: In function 'find_section':
readelf.c:349:8: error: the comparison will always evaluate as 'true' for the pointer operand in 'filedata->section_headers + (sizetype)((long unsigned int)i * 80)' must not be NULL [-Werror=address]
349 | ((X) != NULL \
| ^~
readelf.c:761:9: note: in expansion of macro 'SECTION_NAME_VALID'
761 | if (SECTION_NAME_VALID (filedata->section_headers + i)
| ^~~~~~~~~~~~~~~~~~
This will likely be fixed in gcc, but inline functions are nicer than
macros.
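As a rough illustration of the macro-to-inline conversion, assuming heavily simplified stand-ins for readelf's types (section_hdr and file_data are hypothetical and carry only the fields needed here):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for readelf's internal types; the real
   structures have many more fields.  */
typedef struct
{
  unsigned long sh_name;        /* Offset into the string table.  */
} section_hdr;

typedef struct
{
  const char *string_table;
  unsigned long string_table_length;
  section_hdr *section_headers;
} file_data;

/* Inline replacement for a SECTION_NAME_VALID-style macro.  HDR is an
   ordinary parameter here, so the NULL test is no longer applied
   directly to a "base + index" expression at each call site.  */
static inline bool
section_name_valid (const file_data *fd, const section_hdr *hdr)
{
  return (hdr != NULL
          && fd->string_table != NULL
          && hdr->sh_name < fd->string_table_length);
}

/* Inline replacement for a SECTION_NAME-style macro.  */
static inline const char *
section_name (const file_data *fd, const section_hdr *hdr)
{
  assert (section_name_valid (fd, hdr));
  return fd->string_table + hdr->sh_name;
}
```

The functions also get argument type checking, which the macros could not provide.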
* readelf.c (SECTION_NAME, SECTION_NAME_VALID),
(SECTION_NAME_PRINT, VALID_SYMBOL_NAME, VALID_DYNAMIC_NAME),
(GET_DYNAMIC_NAME): Delete. Replace with..
(section_name, section_name_valid, section_name_print),
(valid_symbol_name, valid_dynamic_name, get_dynamic_name): ..these
new inline functions. Update use throughout file.
Similar to Arm/AArch64, we add mapping symbols to the symbol table
to mark the start addresses of data and instructions: $d marks
data and $x marks instructions. The disassembler then uses these
symbols to decide whether to dump data or instructions.
Consider the mapping-04 test case,
$ cat tmp.s
.text
.option norelax
.option norvc
.fill 2, 4, 0x1001
.byte 1
.word 0
.balign 8
add a0, a0, a0
.fill 5, 2, 0x2002
add a1, a1, a1
.data
.word 0x1 # No need to add mapping symbols.
.word 0x2
$ riscv64-unknown-elf-as tmp.s -o tmp.o
$ riscv64-unknown-elf-objdump -d tmp.o
Disassembly of section .text:
0000000000000000 <.text>:
0: 00001001 .word 0x00001001 # Marked $d, .fill directive.
4: 00001001 .word 0x00001001
8: 00000001 .word 0x00000001 # .byte + part of .word.
c: 00 .byte 0x00 # remaining .word.
d: 00 .byte 0x00 # Marked $d, odd byte of alignment.
e: 0001 nop # Marked $x, nops for alignment.
10: 00a50533 add a0,a0,a0
14: 20022002 .word 0x20022002 # Marked $d, .fill directive.
18: 20022002 .word 0x20022002
1c: 2002 .short 0x2002
1e: 00b585b3 add a1,a1,a1 # Marked $x.
22: 0001 nop # Section tail alignment.
24: 00000013 nop
* Use $d and $x to mark the distribution of data and instructions.
Alignments of code are recognized as instructions, since we usually
fill nops for them.
* If the alignment has odd bytes, then we cannot just fill the space
with nops. We always place an odd byte 0x00 at the start of
the space. Therefore, add a $d mapping symbol for the odd byte,
to tell the disassembler that it isn't an instruction. The behavior
is the same as Arm and AArch64.
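How a disassembler might consume these symbols can be sketched as follows. This is an illustration only: struct sym, map_state_from_name and state_at are hypothetical, the enum merely mirrors the idea of riscv_seg_mstate, and defaulting to instructions when no mapping symbol has been seen is an assumption made so that objects without mapping symbols are still disassembled as code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the idea of enum riscv_seg_mstate; values are illustrative.  */
enum map_state { MAP_NONE, MAP_DATA, MAP_INSN };

/* Hypothetical minimal symbol record.  */
struct sym
{
  unsigned long addr;
  const char *name;
};

/* Classify a symbol name: "$d" is data, "$x" is instructions, and
   anything else is not a mapping symbol.  */
static enum map_state
map_state_from_name (const char *name)
{
  assert (name != NULL);
  if (strcmp (name, "$d") == 0)
    return MAP_DATA;
  if (strcmp (name, "$x") == 0)
    return MAP_INSN;
  return MAP_NONE;
}

/* Return the state in effect at ADDR, given symbols sorted by address.
   The most recent mapping symbol at or before ADDR wins.  With none,
   assume instructions (an assumption of this sketch).  */
static enum map_state
state_at (const struct sym *syms, size_t n, unsigned long addr)
{
  enum map_state state = MAP_INSN;

  for (size_t i = 0; i < n && syms[i].addr <= addr; i++)
    {
      enum map_state s = map_state_from_name (syms[i].name);
      if (s != MAP_NONE)
        state = s;
    }
  return state;
}
```

With symbols placed as in the dump above ($d at 0x0, $x at 0xe, $d at 0x14, $x at 0x1e), address 0x8 resolves to data and address 0x10 to instructions.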
The elf/linux toolchain regressions all passed. In addition, I
disabled the mapping symbols internally while using the new objdump, and
the regressions passed, too. Therefore, the new objdump should dump
objects correctly, even if they don't have any mapping symbols.
bfd/
pr 27916
* cpu-riscv.c (riscv_elf_is_mapping_symbols): Define mapping symbols.
* cpu-riscv.h: extern riscv_elf_is_mapping_symbols.
* elfnn-riscv.c (riscv_maybe_function_sym): Do not choose mapping
symbols as a function name.
(riscv_elf_is_target_special_symbol): Add mapping symbols.
binutils/
pr 27916
* testsuite/binutils-all/readelf.s: Updated.
* testsuite/binutils-all/readelf.s-64: Likewise.
* testsuite/binutils-all/readelf.s-64-unused: Likewise.
* testsuite/binutils-all/readelf.ss: Likewise.
* testsuite/binutils-all/readelf.ss-64: Likewise.
* testsuite/binutils-all/readelf.ss-64-unused: Likewise.
gas/
pr 27916
* config/tc-riscv.c (make_mapping_symbol): Create a new mapping symbol.
(riscv_mapping_state): Decide whether to create mapping symbol for
frag_now. Only add the mapping symbols to text sections.
(riscv_add_odd_padding_symbol): Add the mapping symbols for the
riscv_handle_align, which have odd bytes spaces.
(riscv_check_mapping_symbols): Remove any excess mapping symbols.
(md_assemble): Marked as MAP_INSN.
(riscv_frag_align_code): Marked as MAP_INSN.
(riscv_init_frag): Add mapping symbols for frag, it usually called
by frag_var. Marked as MAP_DATA for rs_align and rs_fill, and
marked as MAP_INSN for rs_align_code.
(s_riscv_insn): Marked as MAP_INSN.
(riscv_adjust_symtab): Call riscv_check_mapping_symbols.
* config/tc-riscv.h (md_cons_align): Defined to riscv_mapping_state
with MAP_DATA.
(TC_SEGMENT_INFO_TYPE): Record mapping state for each segment.
(TC_FRAG_TYPE): Record the first and last mapping symbols for the
fragments. The first mapping symbol must be placed at the start
of the fragment.
(TC_FRAG_INIT): Defined to riscv_init_frag.
* testsuite/gas/riscv/mapping-01.s: New testcase.
* testsuite/gas/riscv/mapping-01a.d: Likewise.
* testsuite/gas/riscv/mapping-01b.d: Likewise.
* testsuite/gas/riscv/mapping-02.s: Likewise.
* testsuite/gas/riscv/mapping-02a.d: Likewise.
* testsuite/gas/riscv/mapping-02b.d: Likewise.
* testsuite/gas/riscv/mapping-03.s: Likewise.
* testsuite/gas/riscv/mapping-03a.d: Likewise.
* testsuite/gas/riscv/mapping-03b.d: Likewise.
* testsuite/gas/riscv/mapping-04.s: Likewise.
* testsuite/gas/riscv/mapping-04a.d: Likewise.
* testsuite/gas/riscv/mapping-04b.d: Likewise.
* testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.
* testsuite/gas/riscv/mapping-norelax-04b.d: Likewise.
* testsuite/gas/riscv/no-relax-align.d: Updated.
* testsuite/gas/riscv/no-relax-align-2.d: Likewise.
include/
pr 27916
* opcode/riscv.h (enum riscv_seg_mstate): Added.
opcodes/
pr 27916
* riscv-dis.c (last_map_symbol, last_stop_offset, last_map_state):
Added to dump sections with mapping symbols.
(riscv_get_map_state): Get the mapping state from the symbol.
(riscv_search_mapping_symbol): Check the sorted symbol table, and
then find the suitable mapping symbol.
(riscv_data_length): Decide which data size we should print.
(riscv_disassemble_data): Dump the data contents.
(print_insn_riscv): Handle the mapping symbols.
(riscv_symbol_is_valid): Marked mapping symbols as invalid.
The following patch synchronizes includes/objdump/readelf with the Linux
Kernel in terms of ARM regset notes.
We're currently missing 3 of them:
NT_ARM_PACA_KEYS
NT_ARM_PACG_KEYS
NT_ARM_PAC_ENABLED_KEYS
We don't need GDB to bother with this at the moment, so this doesn't update
bfd/elf.c. If needed, we can do it in the future.
binutils/
* readelf.c (get_note_type): Handle new ARM PAC notes.
include/elf/
* common.h (NT_ARM_PACA_KEYS, NT_ARM_PACG_KEYS)
(NT_ARM_PAC_ENABLED_KEYS): New constants.
Fuzzers might put -1 in arhdr.ar_size. If the size is rounded up to
an even number of bytes we get zero.
* readelf.c (process_archive): Don't round up archive_file_size.
Do round up next_arhdr_offset calculation.
Add GNU_PROPERTY_1_NEEDED:
#define GNU_PROPERTY_1_NEEDED GNU_PROPERTY_UINT32_OR_LO
to indicate the needed properties by the object file.
Add GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS:
#define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)
to indicate that the object file requires canonical function pointers and
cannot be used with copy relocation.
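A decoder for this bitmask could look roughly like the sketch below. describe_1_needed is a hypothetical helper and the description string is an assumption; only the bit value comes from the definition above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS (1U << 0)

/* Hypothetical decoder: write a description of the known bits in
   BITMASK into BUF and return whatever unknown bits remain, so the
   caller can report them separately.  */
static uint32_t
describe_1_needed (uint32_t bitmask, char *buf, size_t len)
{
  assert (buf != NULL && len > 0);
  buf[0] = '\0';

  if (bitmask & GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS)
    {
      /* The wording of this string is an assumption of the sketch.  */
      snprintf (buf, len, "indirect external access");
      bitmask &= ~GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS;
    }
  return bitmask;
}
```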
binutils/
* readelf.c (decode_1_needed): New.
(print_gnu_property_note): Handle GNU_PROPERTY_1_NEEDED.
include/
* elf/common.h (GNU_PROPERTY_1_NEEDED): New.
(GNU_PROPERTY_1_NEEDED_INDIRECT_EXTERN_ACCESS): Likewise.
ld/
* testsuite/ld-elf/property-1_needed-1a.d: New file.
* testsuite/ld-elf/property-1_needed-1.s: Likewise.
* readelf.c (process_archive): Reset file position to the
beginning when calling process_object for thin archive members.
* testsuite/binutils-all/readelf.exp: Add test.
* testsuite/binutils-all/readelf.h.thin: New file.
Implement GNU_PROPERTY_UINT32_AND_XXX/GNU_PROPERTY_UINT32_OR_XXX:
https://sourceware.org/pipermail/gnu-gabi/2021q1/000467.html
1. GNU_PROPERTY_UINT32_AND_LO..GNU_PROPERTY_UINT32_AND_HI
#define GNU_PROPERTY_UINT32_AND_LO 0xb0000000
#define GNU_PROPERTY_UINT32_AND_HI 0xb0007fff
A bit in the output pr_data field is set only if it is set in all
relocatable input pr_data fields. If all bits in the output
pr_data field are zero, this property should be removed from output.
If the bit is 1, all input relocatables have the feature. If the
bit is 0 or the property is missing, the info is unknown.
2. GNU_PROPERTY_UINT32_OR_LO..GNU_PROPERTY_UINT32_OR_HI
#define GNU_PROPERTY_UINT32_OR_LO 0xb0008000
#define GNU_PROPERTY_UINT32_OR_HI 0xb000ffff
A bit in the output pr_data field is set if it is set in any
relocatable input pr_data fields. If all bits in the output
pr_data field are zero, this property should be removed from output.
If the bit is 1, some input relocatables have the feature. If the
bit is 0 or the property is missing, the info is unknown.
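The two merge rules can be sketched as below. This is not the elf-properties.c implementation: struct input_prop and merge_uint32_property are hypothetical, and treating an input that lacks the property as contributing no set bits is an assumption that follows from "the property is missing, the info is unknown".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum merge_kind { MERGE_AND, MERGE_OR };

/* Each relocatable input either lacks the property or carries a
   32-bit pr_data value.  */
struct input_prop
{
  bool present;
  uint32_t pr_data;
};

/* Merge the pr_data of N inputs.  Store the result in *OUT and return
   true if the property should be kept in the output, i.e. if at least
   one bit is set.  */
static bool
merge_uint32_property (enum merge_kind kind, const struct input_prop *in,
                       size_t n, uint32_t *out)
{
  assert (out != NULL);

  uint32_t acc = (kind == MERGE_AND && n > 0) ? UINT32_MAX : 0;

  for (size_t i = 0; i < n; i++)
    {
      /* Assumption: a missing property contributes no set bits.  */
      uint32_t bits = in[i].present ? in[i].pr_data : 0;

      if (kind == MERGE_AND)
        acc &= bits;
      else
        acc |= bits;
    }

  *out = acc;
  return acc != 0;
}
```

For inputs 0x3 and 0x6, the AND rule yields 0x2 and the OR rule yields 0x7. If one input lacks the property, the AND result is empty and the property is dropped.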
bfd/
* elf-properties.c (_bfd_elf_parse_gnu_properties): Handle
GNU_PROPERTY_UINT32_AND_LO, GNU_PROPERTY_UINT32_AND_HI,
GNU_PROPERTY_UINT32_OR_LO and GNU_PROPERTY_UINT32_OR_HI.
(elf_merge_gnu_properties): Likewise.
binutils/
* readelf.c (print_gnu_property_note): Handle
GNU_PROPERTY_UINT32_AND_LO, GNU_PROPERTY_UINT32_AND_HI,
GNU_PROPERTY_UINT32_OR_LO and GNU_PROPERTY_UINT32_OR_HI.
include/
* elf/common.h (GNU_PROPERTY_UINT32_AND_LO): New.
(GNU_PROPERTY_UINT32_AND_HI): Likewise.
(GNU_PROPERTY_UINT32_OR_LO): Likewise.
(GNU_PROPERTY_UINT32_OR_HI): Likewise.
ld/
* testsuite/ld-elf/property-and-1.d: New file.
* testsuite/ld-elf/property-and-1.s: Likewise.
* testsuite/ld-elf/property-and-2.d: Likewise.
* testsuite/ld-elf/property-and-2.s: Likewise.
* testsuite/ld-elf/property-and-3.d: Likewise.
* testsuite/ld-elf/property-and-3.s: Likewise.
* testsuite/ld-elf/property-and-4.d: Likewise.
* testsuite/ld-elf/property-and-empty.s: Likewise.
* testsuite/ld-elf/property-or-1.d: Likewise.
* testsuite/ld-elf/property-or-1.s: Likewise.
* testsuite/ld-elf/property-or-2.d: Likewise.
* testsuite/ld-elf/property-or-2.s: Likewise.
* testsuite/ld-elf/property-or-3.d: Likewise.
* testsuite/ld-elf/property-or-3.s: Likewise.
* testsuite/ld-elf/property-or-4.d: Likewise.
* testsuite/ld-elf/property-or-empty.s: Likewise.
I finally found time to teach readelf to identify PIEs in the file
header display and program header display. So in place of
"DYN (Shared object file)" which isn't completely true, show
"DYN (Position-Independent Executable file)".
It requires a little bit of untangling code in readelf due to
process_program_headers setting up dynamic_addr and dynamic_size,
needed to scan .dynamic for the DT_FLAGS_1 entry, and
process_program_headers itself wanting to display the file type in
some cases. At first I modified process_program_headers using a
"probe" parameter similar to get_section_headers in order to inhibit
output, but decided it was cleaner to separate out
locate_dynamic_section.
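The check itself amounts to scanning the dynamic entries for DT_FLAGS_1 and testing DF_1_PIE. A minimal sketch, assuming the entries have already been read into an array (struct dyn_entry, dynamic_has_pie_flag and file_type_name are hypothetical names, not the readelf functions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ET_DYN      3
#define DT_NULL     0
#define DT_FLAGS_1  0x6ffffffb
#define DF_1_PIE    0x08000000

/* Hypothetical in-memory form of one .dynamic entry.  */
struct dyn_entry
{
  uint64_t d_tag;
  uint64_t d_val;
};

/* Scan up to N entries, stopping at DT_NULL, for a DT_FLAGS_1 entry
   with DF_1_PIE set.  */
static bool
dynamic_has_pie_flag (const struct dyn_entry *dyn, size_t n)
{
  for (size_t i = 0; i < n && dyn[i].d_tag != DT_NULL; i++)
    if (dyn[i].d_tag == DT_FLAGS_1)
      return (dyn[i].d_val & DF_1_PIE) != 0;
  return false;
}

/* Pick the file type string for an ELF header type.  Only ET_DYN is
   modelled in this sketch.  */
static const char *
file_type_name (unsigned int e_type, const struct dyn_entry *dyn, size_t n)
{
  assert (dyn != NULL || n == 0);
  if (e_type != ET_DYN)
    return "other";
  return (dynamic_has_pie_flag (dyn, n)
          ? "DYN (Position-Independent Executable file)"
          : "DYN (Shared object file)");
}
```

The dynamic entries must be located before the file type can be printed, which is why the lookup had to be separated from the program header display.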
binutils/
* readelf.c (locate_dynamic_section, is_pie): New functions.
(get_file_type): Replace e_type parameter with filedata. Call
is_pie for ET_DYN. Update all callers.
(process_program_headers): Use local variables dynamic_addr and
dynamic_size, updating filedata on exit from function. Set
dynamic_size of 1 to indicate no dynamic section or segment.
Update tests of dynamic_size throughout.
* testsuite/binutils-all/x86-64/pr27708.dump: Update expected output.
ld/
* testsuite/ld-pie/vaddr-0.d: Update expected output.
gdb/
* testsuite/lib/gdb.exp (exec_is_pie): Match new PIE readelf output.
Fix commit 4de91c10cd, which cached the single section header read
to pick up file header extension fields. Also, testing e_shoff in
get_section_headers opened a hole for fuzzers where we'd end up with
segfaults due to non-zero e_shnum but NULL section_headers.
* readelf.c (get_section_headers): Don't test e_shoff here, leave
that to get_32bit_section_headers or get_64bit_section_headers.
(process_object): Throw away section header read to print file
header extension.
A number of filedata entries were not cleared. Make sure they are
all cleared out, except the ones needed for archive handling.
* readelf.c (struct filedata): Move archive_file_offset and
archive_file_size earlier.
(free_filedata): Clear using memset.
This is a followup to git commit 8ff66993e0, a patch aimed at
segfaults found invoking readelf multiple times with fuzzed objects.
In that patch I added code to clear more stashed data early in
process_section_headers, along with any stashed section headers. This
patch instead relies on clearing out the stash at the end of
process_object, making sure that process_object doesn't exit early.
The patch also introduces some new wrapper functions.
* readelf.c (GET_ELF_SYMBOLS): Delete. Replace with..
(get_elf_symbols): ..this new function throughout.
(get_32bit_section_headers): Don't free section_headers.
(get_64bit_section_headers): Likewise.
(get_section_headers): New function, use throughout in place of
32bit and 64bit variants.
(get_dynamic_section): Similarly.
(process_section_headers): Don't free filedata memory here.
(get_file_header): Don't get section headers here..
(process_object): ..Read them here instead. Don't exit without
freeing filedata memory.
Splitting up help strings makes it more likely that at least some of
the help translation survives adding new options.
* readelf.c (parse_args): Call dwarf_select_sections_all on
--debug-dump without optarg.
(usage): Associate -w and --debug-dump options closely.
Split up help message. Remove extraneous blank lines around
ctf help.
* objdump.c (usage): Similarly.
commit a7664973b2
Author: Jan Beulich <jbeulich@suse.com>
Date: Mon Apr 26 10:41:35 2021 +0200
x86: correct overflow checking for 16-bit PC-relative relocs
caused linker failure when building a 16-bit program in a 32-bit ELF
container. Update GNU_PROPERTY_X86_FEATURE_2_USED with
#define GNU_PROPERTY_X86_FEATURE_2_CODE16 (1U << 12)
to indicate that 16-bit mode instructions are used in the input object:
https://groups.google.com/g/x86-64-abi/c/UvvXWeHIGMA
This allows the linker to properly perform the relocation overflow check
for 16-bit PC-relative relocations in 16-bit mode instructions.
1. Update x86 assembler to always generate the GNU property note with
GNU_PROPERTY_X86_FEATURE_2_CODE16 for .code16 in ELF object.
2. Update i386 and x86-64 linkers to use 16-bit PC16 relocations if
input object is marked with GNU_PROPERTY_X86_FEATURE_2_CODE16.
bfd/
PR ld/27905
* elf32-i386.c: Include "libiberty.h".
(elf_howto_table): Add 16-bit R_386_PC16 entry.
(elf_i386_rtype_to_howto): Add a BFD argument. Use 16-bit
R_386_PC16 if input has 16-bit mode instructions.
(elf_i386_info_to_howto_rel): Update elf_i386_rtype_to_howto
call.
(elf_i386_tls_transition): Likewise.
(elf_i386_relocate_section): Likewise.
* elf64-x86-64.c (x86_64_elf_howto_table): Add 16-bit
R_X86_64_PC16 entry.
(elf_x86_64_rtype_to_howto): Use 16-bit R_X86_64_PC16 if input
has 16-bit mode instructions.
* elfxx-x86.c (_bfd_x86_elf_parse_gnu_properties): Set
elf_x86_has_code16 if relocatable input is marked with
GNU_PROPERTY_X86_FEATURE_2_CODE16.
* elfxx-x86.h (elf_x86_obj_tdata): Add has_code16.
(elf_x86_has_code16): New.
binutils/
PR ld/27905
* readelf.c (decode_x86_feature_2): Support
GNU_PROPERTY_X86_FEATURE_2_CODE16.
gas/
PR ld/27905
* config/tc-i386.c (set_code_flag): Update x86_feature_2_used
with GNU_PROPERTY_X86_FEATURE_2_CODE16 for .code16 in ELF
object.
(set_16bit_gcc_code_flag): Likewise.
(x86_cleanup): Always generate the GNU property note if
x86_feature_2_used isn't 0.
* testsuite/gas/i386/code16-2.d: New file.
* testsuite/gas/i386/code16-2.s: Likewise.
* testsuite/gas/i386/x86-64-code16-2.d: Likewise.
* testsuite/gas/i386/i386.exp: Run code16-2 and x86-64-code16-2.
include/
PR ld/27905
* elf/common.h (GNU_PROPERTY_X86_FEATURE_2_CODE16): New.
ld/
PR ld/27905
* testsuite/ld-i386/code16.d: New file.
* testsuite/ld-i386/code16.t: Likewise.
* testsuite/ld-x86-64/code16.d: Likewise.
* testsuite/ld-x86-64/code16.t: Likewise.
* testsuite/ld-i386/i386.exp: Run code16.
* testsuite/ld-x86-64/x86-64.exp: Likewise.
The official name for the Loongson Architecture is LoongArch. It is better
to use LoongArch instead of Loongson Loongarch for EM_LOONGARCH to avoid
confusion and stay consistent with other software in the future.
The official documentation in Chinese:
http://www.loongson.cn/uploadfile/cpu/LoongArch.pdf
The translated version in English:
https://loongson.github.io/LoongArch-Documentation/
binutils/
* readelf.c (get_machine_name): Change Loongson Loongarch to
LoongArch.
include/
* elf/common.h (EM_LOONGARCH): Change Loongson Loongarch to
LoongArch.
PR 27672
* readelf.c (sym_base): New variable.
(enum print_mode): Add more modes.
(print_vma): Add support for new modes.
(options): Add sym-base.
(usage): Add sym-base.
(parse_args): Add support for --sym-base.
(print_dynamic_symbol_size): New function.
(print_dynamic_symbol): Use new function.
* doc/binutils.texi: Document the new feature.
* NEWS: Mention the new feature.
We shouldn't be using arbitrary limits like PATH_MAX in GNU programs.
This patch also fixes some memory leaks in readelf when processing
separate debug info.
PR 27716
binutils/
* objdump.c (show_line): Don't limit paths to PATH_MAX.
* readelf.c (struct filedata): Change program_interpreter from
a char array to a char pointer.
(process_program_headers): Sanity check PT_INTERP p_filesz.
Malloc program_interpreter using p_filesz and read directly from
file.
(process_dynamic_section): Check program_interpreter is non-NULL.
(free_filedata): New function, split out from..
(process_object): ..here.
(close_debug_file): Call free_filedata.
* sysdep.h: Don't include sys/param.h.
(PATH_MAX): Don't define.
* configure.ac: Don't check for sys/param.h.
* configure: Regenerate.
gprof/
* gprof.h (PATH_MAX): Don't define.
* corefile.c (core_create_line_syms): Don't use PATH_MAX for initial
file name size.
* source.c (annotate_source): Malloc file name buffer. Always
trim off "-ann" when dos 8.3 annotate file matches original.
* utils.c (print_name_only): Malloc file name buffer.
NT_NETBSD_PAX was defined in commit be3b926d8d.
binutils/ChangeLog:
* readelf.c (process_netbsd_elf_note): Remove now unneeded #ifdef
check for NT_NETBSD_PAX.
* objdump.c (process_links): Use type int.
* readelf.c (request_dump): Don't increment do_dump, set it.
* windint.h (target_is_bigendian): Use type bfd_boolean.
* windmc.c (target_is_bigendian): Likewise.
* windres.c (target_is_bigendian): Likewise.
PR 27478
* readelf.c (dump_section_as_strings): Mention separate filename.
(dump_section_as_bytes): Likewise.
(dump_section_as_ctf): Likewise.
(initialise_dumps_byname): Only issue a warning for missing
sections if processing the main file.
(process_section_contents): Only issue a warning for undumped
section numbers in the main file.
(initialise_dump_sects): New function. Contains code extracted
from ...
(process_object): ... here. Also call initialise_dump_sects for
separate files.