libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
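
The string-table handoff and the lifetime rule above can be sketched in miniature. This is not libctf code: mini_strtab_t, mini_add_external, fake_elf_strtab_t and fake_feed are hypothetical names, and the callback shape (return the next string and store its offset, or return NULL when exhausted) is an assumption modelled on the description of ctf_link_add_strtab. The sketch marks every CTF string that also appears in the ELF strtab as external and records only its offset, so nothing passed in by the callback has to outlive the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback shape: return the next ELF strtab string and
   store its offset in *OFFSET, or return NULL when there are no more.  */
typedef const char *strtab_feed_f (uint32_t *offset, void *arg);

#define MINI_MAX_STRS 16

/* A toy CTF string table: each string is either internal (kept in the
   CTF section) or external (replaced by an offset into the ELF strtab).  */
typedef struct mini_strtab
{
  const char *strs[MINI_MAX_STRS];
  int is_external[MINI_MAX_STRS];
  uint32_t elf_offset[MINI_MAX_STRS];
  size_t nstrs;
} mini_strtab_t;

/* Walk every string the linker feeds us.  Any CTF string that also
   appears in the ELF strtab is marked external; only the offset is
   recorded, so the callback's string need not outlive this call.
   Returns the number of strings moved out of the internal strtab.  */
size_t
mini_add_external (mini_strtab_t *tab, strtab_feed_f *feed, void *arg)
{
  const char *str;
  uint32_t offset = 0;
  size_t moved = 0;

  assert (tab != NULL && feed != NULL);

  while ((str = feed (&offset, arg)) != NULL)
    {
      for (size_t i = 0; i < tab->nstrs; i++)
        {
          if (!tab->is_external[i] && strcmp (tab->strs[i], str) == 0)
            {
              tab->is_external[i] = 1;
              tab->elf_offset[i] = offset;
              moved++;
            }
        }
    }

  return moved;
}

/* A sample feeder iterating over a small, fake ELF strtab.  */
typedef struct fake_elf_strtab
{
  const char *strs[3];
  uint32_t offsets[3];
  size_t pos;
} fake_elf_strtab_t;

const char *
fake_feed (uint32_t *offset, void *arg)
{
  fake_elf_strtab_t *elf = arg;

  if (elf->pos >= 3)
    return NULL;
  *offset = elf->offsets[elf->pos];
  return elf->strs[elf->pos++];
}
```

Strings that never show up in the feed stay internal, which corresponds to the "strings that only appear in the input strtab" case in the description. A real implementation would use a hash table rather than the linear scan used here.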
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
/* CTF linking.
2024-01-04 19:52:08 +08:00
Copyright (C) 2019-2024 Free Software Foundation, Inc.
This file is part of libctf.

libctf is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; see the file COPYING. If not see
<http://www.gnu.org/licenses/>. */

#include <ctf-impl.h>
#include <string.h>

libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
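
The weak-reference pattern that lets one source file serve both library variants can be sketched as follows. This is an illustration only, not the libctf implementation: optional_open, lazy_open_input and the LAZY_* codes are hypothetical names, and the sketch assumes GCC or Clang on an ELF target, where an undefined weak symbol resolves to a null address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opener that lives in the "full" library only.  Declaring
   it weak lets this file link even when no definition is present; the
   symbol's address is then null.  (Assumes GCC or Clang on an ELF
   target.)  */
extern int optional_open (const char *name);
#pragma weak optional_open

/* Hypothetical status codes; LAZY_NEEDS_FULL_LIB plays the role that
   ECTF_NEEDSBFD plays in the commit message.  */
enum { LAZY_OK = 0, LAZY_NEEDS_FULL_LIB = -1 };

/* Try to lazily open NAME.  If the opener was not linked in, report
   that rather than crashing by calling through a null pointer.  */
int
lazy_open_input (const char *name)
{
  assert (name != NULL);

  if (optional_open == NULL)
    return LAZY_NEEDS_FULL_LIB;

  return optional_open (name);
}
```

When nothing in the link defines optional_open, lazy_open_input returns LAZY_NEEDS_FULL_LIB; linking in a definition switches the same code over to the real opener without recompilation.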
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
#if defined (PIC)
#pragma weak ctf_open
#endif

libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
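
The idea of answering input-to-output type queries straight from the deduplicator's own hash data, with no second table to populate, can be sketched like this. Everything here is a simplified assumption: in_type_t, out_type_t, mini_dedup_t and mini_dedup_type_mapping are hypothetical names, hashes are plain integers rather than real type hashes, and linear scans stand in for hash-table lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the deduplicator's state.  */
typedef struct in_type
{
  int input_num;		/* Which input dict the type came from.  */
  long type_id;			/* Its type ID in that input.  */
  uint64_t hash;		/* Hash of the type, valid across dicts.  */
} in_type_t;

typedef struct out_type
{
  uint64_t hash;		/* Hash of an emitted type.  */
  long type_id;			/* Its type ID in the output dict.  */
} out_type_t;

typedef struct mini_dedup
{
  const in_type_t *inputs;
  size_t ninputs;
  const out_type_t *outputs;
  size_t noutputs;
} mini_dedup_t;

/* Map (INPUT_NUM, TYPE_ID) to the output type ID by going through the
   type's hash, with no separately populated input->output table.
   Returns -1 if the type is unknown or was never emitted.  */
long
mini_dedup_type_mapping (const mini_dedup_t *d, int input_num, long type_id)
{
  assert (d != NULL);

  for (size_t i = 0; i < d->ninputs; i++)
    {
      if (d->inputs[i].input_num != input_num
          || d->inputs[i].type_id != type_id)
        continue;

      for (size_t j = 0; j < d->noutputs; j++)
        {
          if (d->outputs[j].hash == d->inputs[i].hash)
            return d->outputs[j].type_id;
        }
    }

  return -1;
}
```

Two input types with the same hash resolve to the same output ID, which is the deduplication property the variable and symbol emission code relies on. The repopulation phase the commit removes would correspond to copying every (input, output) pair into another table before any query could be made.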
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
/* CTF linking consists of adding CTF archives full of content to be merged into
this one to the current file (which must be writable) by calling
ctf_link_add_ctf. Once this is done, a call to ctf_link will merge the type
tables together, generating new CTF files as needed, with this one as a
parent, to contain types from the inputs which conflict. ctf_link_add_strtab
takes a callback which provides string/offset pairs to be added to the
external string table and deduplicated from all CTF string tables in the
output link; ctf_link_shuffle_syms takes a callback which provides symtab
entries in ascending order, and shuffles the function and data sections to
match; and ctf_link_write emits a CTF file (if there are no conflicts
requiring per-compilation-unit sub-CTF files) or CTF archives (otherwise) and
returns it, suitable for addition in the .ctf section of the output. */
|
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
|
|
|
|
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
/* Return the name of the compilation unit this CTF dict or its parent applies
   to, or a non-null string otherwise: prefer the parent.  Used in debugging
   output.  Sometimes used for outputs too.  */
const char *
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_input_name (ctf_dict_t *fp)
{
  if (fp->ctf_parent && fp->ctf_parent->ctf_cuname)
    return fp->ctf_parent->ctf_cuname;
  else if (fp->ctf_cuname)
    return fp->ctf_cuname;
  else
    return "(unnamed)";
}
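The lookup order used by ctf_link_input_name (the parent's CU name first, then the dict's own, then a fixed placeholder) can be exercised in isolation with a small stand-in. This is a sketch under stated assumptions: `toy_dict` and `toy_input_name` are hypothetical names that model only the two fields the function reads, not the real ctf_dict_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two ctf_dict_t fields consulted.  */
struct toy_dict
{
  const char *cuname;		/* CU name, or NULL if none was recorded.  */
  struct toy_dict *parent;	/* Parent dict, or NULL.  */
};

/* Same precedence as ctf_link_input_name: the parent's name if it has one,
   else this dict's own name, else a placeholder.  Never returns NULL.  */
static const char *
toy_input_name (const struct toy_dict *fp)
{
  assert (fp != NULL);

  if (fp->parent && fp->parent->cuname)
    return fp->parent->cuname;
  else if (fp->cuname)
    return fp->cuname;
  else
    return "(unnamed)";
}
```

Because the result is never NULL, callers can pass it straight to a printf-style "%s" in diagnostics without a separate check.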
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
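The idea of answering "which output type does this input type map to?" straight from the deduplicator's per-output table of hash-to-ID records, with no separately populated mapping table, can be sketched like this. It is a toy model, not the libctf implementation: all names are hypothetical, a linear scan stands in for the real hash table, and using 0 to mean "not emitted here" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one type emitted into an output dict.  */
struct toy_emission
{
  const char *hash;		/* Type hash, valid in inputs and outputs.  */
  unsigned long out_id;		/* ID of the emitted type in this output.  */
};

/* Hypothetical output dict: just its table of emitted types.  */
struct toy_output
{
  const struct toy_emission *emitted;
  size_t n_emitted;
};

/* Map a type hash to the ID of the corresponding type in OUT, consulting
   only the emission records the deduplicator already built.  Returns 0
   (an assumption of this sketch) if the type was not emitted into OUT.  */
static unsigned long
toy_dedup_type_mapping (const struct toy_output *out, const char *hash)
{
  size_t i;

  assert (out != NULL && hash != NULL);

  for (i = 0; i < out->n_emitted; i++)
    if (strcmp (out->emitted[i].hash, hash) == 0)
      return out->emitted[i].out_id;

  return 0;
}
```

Since the lookup reads state that already exists once deduplication has run, nothing has to be copied per input type beforehand, which is where the commit message locates the saving.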
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
/* Return the cuname of a dict, or the string "unnamed-CU" if none.  */

static const char *
ctf_unnamed_cuname (ctf_dict_t *fp)
{
  const char *cuname = ctf_cuname (fp);

  if (!cuname)
    cuname = "unnamed-CU";

  return cuname;
}
/* The linker inputs look like this.  clin_fp is used for short-circuited
   CU-mapped links that can entirely avoid the first link phase in some
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a 'CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldeuml.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict): ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
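The compatibility scheme recorded in this ChangeLog entry can be illustrated with a minimal, self-contained sketch. Every demo_* name below is a hypothetical stand-in, not a libctf declaration: the old type name survives as an alias, and the old function names become trivial wrappers around the new ones.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque, reference-counted dictionary.  */
typedef struct demo_dict
{
  struct demo_dict *parent;
  int refcnt;
} demo_dict_t;

/* Old type name kept as an alias so existing callers still compile.  */
typedef struct demo_dict demo_file_t;

/* New-style entry points.  */
static void
demo_dict_close (demo_dict_t *fp)
{
  if (fp == NULL)
    return;
  if (--fp->refcnt == 0)
    free (fp);
}

static demo_dict_t *
demo_parent_dict (demo_dict_t *fp)
{
  return fp->parent;
}

/* Old-style entry points: trivial wrappers, kept for compatibility.  */
static void
demo_file_close (demo_file_t *fp)
{
  demo_dict_close (fp);
}

static demo_file_t *
demo_parent_file (demo_file_t *fp)
{
  return demo_parent_dict (fp);
}
```

Because the wrappers only forward their argument, old and new callers share a single implementation, and no exported symbol disappears.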
2020-11-20 21:34:04 +08:00
situations in favour of just passing on the contained ctf_dict_t: it is
always the sole ctf_dict_t inside the corresponding clin_arc. If set, it
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
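The lazy-opening behaviour described above can be sketched in isolation. This is only an illustration under stated assumptions: the demo_* types and functions are hypothetical, demo_open stands in for whatever actually opens the file, and the rule that only lazily opened archives are released early is an assumption of this sketch, not a statement about libctf's ownership rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opened archive.  */
typedef struct demo_archive
{
  const char *name;
} demo_archive_t;

/* One link input: always a name, an archive only while it is open.  */
typedef struct demo_input
{
  const char *name;
  demo_archive_t *arc;		/* NULL until needed, if added by name.  */
  int opened_lazily;		/* Nonzero if we opened ARC ourselves.  */
} demo_input_t;

static int demo_open_count;	/* Number of real opens performed.  */

static demo_archive_t *
demo_open (const char *name)
{
  demo_archive_t *arc = malloc (sizeof (*arc));
  if (arc == NULL)
    return NULL;
  arc->name = name;
  demo_open_count++;
  return arc;
}

/* Register an input by NAME only (ARC == NULL) or with an open ARC.  */
static void
demo_input_init (demo_input_t *in, const char *name, demo_archive_t *arc)
{
  in->name = name;
  in->arc = arc;
  in->opened_lazily = 0;
}

/* Open the input if it is not open yet.  Returns 0 on success, -1 on
   failure.  */
static int
demo_input_lazy_open (demo_input_t *in)
{
  if (in->arc != NULL)
    return 0;
  in->arc = demo_open (in->name);
  if (in->arc == NULL)
    return -1;
  in->opened_lazily = 1;
  return 0;
}

/* Release the archive as soon as it is no longer needed, but only if
   this input opened it.  */
static void
demo_input_release (demo_input_t *in)
{
  if (!in->opened_lazily)
    return;
  free (in->arc);
  in->arc = NULL;
  in->opened_lazily = 0;
}
```

An input registered by name costs nothing until demo_input_lazy_open is called on it, and its memory is returned as soon as demo_input_release runs.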
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
gets assigned directly to the final link inputs and freed from there, so it
never gets explicitly freed in the ctf_link_input. */
typedef struct ctf_link_input
{
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
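The output-side naming rule (append an incrementing integer, starting at 0, when a name clashes) can be sketched as follows. This is an illustration only: the demo_* functions are hypothetical, the names are held in a plain array rather than a hash table, and the '#' separator between the CU name and the integer is an assumption, since the message above does not specify one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if NAME is already present in NAMES[0..N-1].  */
static int
demo_name_taken (const char *const *names, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (names[i], name) == 0)
      return 1;
  return 0;
}

/* Write into BUF a name that is not present in NAMES.  If BASE is
   free it is used unchanged; otherwise BASE#0, BASE#1, ... are tried
   in turn.  Returns 0 on success, -1 if BUF is too small.  */
static int
demo_unique_cu_name (const char *const *names, size_t n,
		     const char *base, char *buf, size_t buflen)
{
  int len = snprintf (buf, buflen, "%s", base);
  if (len < 0 || (size_t) len >= buflen)
    return -1;

  for (unsigned long suffix = 0; demo_name_taken (names, n, buf); suffix++)
    {
      len = snprintf (buf, buflen, "%s#%lu", base, suffix);
      if (len < 0 || (size_t) len >= buflen)
	return -1;
    }
  return 0;
}
```

The generated name carries no meaning beyond being distinct, which matches the message's point that distinctness is the only property that matters here.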
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
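The two safeguards in the preceding paragraph can be sketched with a toy link state. Everything here is hypothetical: the demo_* names, the -1 error return, and the simplification that a link emits exactly one output per input. The sketch only shows the shape of the behaviour: outputs from an earlier link are discarded before a new link runs, and CU mappings are refused once outputs exist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link state: counts only, no real CTF data.  */
typedef struct demo_link
{
  size_t ninputs;		/* Inputs added so far.  */
  size_t nmappings;		/* CU mappings set up so far.  */
  size_t noutputs;		/* Outputs of the most recent link.  */
} demo_link_t;

/* Set up a CU mapping.  Refused (-1) once outputs already exist.  */
static int
demo_add_cu_mapping (demo_link_t *l)
{
  if (l->noutputs > 0)
    return -1;
  l->nmappings++;
  return 0;
}

/* Link: discard the outputs of any earlier link, then emit one output
   per input (a simplification made for this sketch).  */
static void
demo_link (demo_link_t *l)
{
  l->noutputs = 0;
  for (size_t i = 0; i < l->ninputs; i++)
    l->noutputs++;
}
```

Calling demo_link twice leaves the same state as calling it once, which is the "same net effect as just doing the last one" property described above.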
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
  char *clin_filename;
  ctf_archive_t *clin_arc;
  ctf_dict_t *clin_fp;
  int n;
} ctf_link_input_t;

static void
ctf_link_input_close (void *input)
{
  ctf_link_input_t *i = (ctf_link_input_t *) input;
  if (i->clin_arc)
    ctf_arc_close (i->clin_arc);
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
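A minimal sketch of the output-time renaming described above, assuming a plain array of names in place of libctf's hashtables. The function names and the '#' separator are hypothetical and are not taken from ctf_new_per_cu_name:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lookup in the table of per-CU output names.  */
static int
name_in_use (const char *const *names, size_t n_names, const char *candidate)
{
  for (size_t i = 0; i < n_names; i++)
    if (strcmp (names[i], candidate) == 0)
      return 1;
  return 0;
}

/* Return a malloc'd name that clashes with nothing in NAMES: CU_NAME itself
   if it is unused, otherwise CU_NAME with "#0", "#1", ... appended until the
   result is unused.  The '#' separator is an assumption made for this
   sketch.  Returns NULL on allocation failure.  */
static char *
unique_cu_name (const char *const *names, size_t n_names, const char *cu_name)
{
  assert (cu_name != NULL);

  /* Room for the name, the separator, a 64-bit decimal number and a NUL.  */
  size_t len = strlen (cu_name) + 32;
  char *candidate = malloc (len);

  if (candidate == NULL)
    return NULL;

  snprintf (candidate, len, "%s", cu_name);
  for (unsigned long suffix = 0;
       name_in_use (names, n_names, candidate);
       suffix++)
    snprintf (candidate, len, "%s#%lu", cu_name, suffix);

  return candidate;
}
```

The caller owns the returned string and frees it. The only property the sketch guarantees is the one the commit message asks for: distinct dicts end up with distinct names.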
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc. referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
  free (i->clin_filename);
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
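The lazy-open behaviour can be illustrated with a small self-contained model. Everything here (archive_t, link_input_t, fake_open, input_get, input_release) is a hypothetical stand-in, not the libctf API; the sketch only shows the ownership rule: an input that arrived with just a NAME is opened on first use and closed as soon as it is finished with, while an archive supplied by the caller is never closed on the caller's behalf.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opened CTF archive.  */
typedef struct archive
{
  const char *name;
} archive_t;

/* Hypothetical stand-in for one link input.  */
typedef struct link_input
{
  const char *name;	/* Always set.  */
  archive_t *arc;	/* NULL until opened, for lazily-opened inputs.  */
  int lazily_opened;	/* Nonzero if this code opened ARC itself.  */
} link_input_t;

/* Number of archives currently opened by fake_open.  */
static int live_archives;

static archive_t *
fake_open (const char *name)
{
  archive_t *arc = malloc (sizeof (*arc));

  if (arc != NULL)
    {
      arc->name = name;
      live_archives++;
    }
  return arc;
}

static void
fake_close (archive_t *arc)
{
  free (arc);
  live_archives--;
}

/* Return the archive for IN, opening it by name if none was supplied.  */
static archive_t *
input_get (link_input_t *in)
{
  assert (in != NULL && in->name != NULL);

  if (in->arc == NULL)
    {
      in->arc = fake_open (in->name);
      in->lazily_opened = (in->arc != NULL);
    }
  return in->arc;
}

/* Finish with IN: close its archive only if it was opened lazily.  */
static void
input_release (link_input_t *in)
{
  if (in->lazily_opened)
    {
      fake_close (in->arc);
      in->arc = NULL;
      in->lazily_opened = 0;
    }
}
```

Tracking who opened the archive is what lets the closing path free everything it owns without touching what the caller still holds.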
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
  free (i);
}
/* Like ctf_link_add_ctf, below, but with no error-checking, so it can be called
   in the middle of an ongoing link. */
static int
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
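The "keeping compatibility aliases" approach can be sketched as follows. The demo_ names and the struct body are hypothetical stand-ins chosen so that the example compiles on its own; this is not the libctf source, only an illustration of a trivial wrapper that keeps an old entry point alive after a rename:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the renamed type.  */
typedef struct demo_dict
{
  int closed;
} demo_dict_t;

/* New name: does the real work.  A null pointer is ignored; closing the
   same dict twice is treated as a bug in this sketch.  */
static void
demo_dict_close (demo_dict_t *dp)
{
  if (dp == NULL)
    return;

  assert (!dp->closed);
  dp->closed = 1;
}

/* Old name: a trivial wrapper kept so that existing callers still compile
   and link.  It must do nothing except forward to the new function.  */
static void
demo_file_close (demo_dict_t *dp)
{
  demo_dict_close (dp);
}
```

Because the wrapper only forwards, old and new callers observe identical behaviour, so the rename changes spelling and nothing else.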
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_add_ctf_internal (ctf_dict_t *fp, ctf_archive_t *ctf,
                           ctf_dict_t *fp_input, const char *name)
{
  int existing = 0;
  ctf_link_input_t *input;
  char *filename, *keyname;
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
2019-07-14 04:06:55 +08:00
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
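
The input-side renaming described above can be sketched roughly as follows.
This is only an illustration under stated assumptions, not the libctf code:
the array-backed table, struct input, lookup_input and add_input are all
hypothetical stand-ins for the real hashtable of link inputs, and the '#'
separator is an assumption chosen for readability.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the table of link inputs: a small array of
   (name, dict) pairs.  Not the libctf data structure.  */
#define MAX_INPUTS 16

struct input
{
  char *name;		/* Key this input is stored under.  */
  const void *dict;	/* Identity of the associated dict.  */
};

static struct input inputs[MAX_INPUTS];
static size_t n_inputs;

/* Find an input by name, or return NULL.  */
static struct input *
lookup_input (const char *name)
{
  for (size_t i = 0; i < n_inputs; i++)
    if (strcmp (inputs[i].name, name) == 0)
      return &inputs[i];
  return NULL;
}

/* Add DICT under NAME.  Adding the same dict under the same name again is a
   no-op; a different dict under an existing name gets a made-up key with the
   current element count stuck on the end.  Returns the key actually used,
   or NULL on failure.  */
static const char *
add_input (const char *name, const void *dict)
{
  struct input *existing;
  size_t len;
  char *key;

  assert (name != NULL);
  existing = lookup_input (name);

  if (existing != NULL && existing->dict == dict)
    return existing->name;

  if (n_inputs == MAX_INPUTS)
    return NULL;

  len = strlen (name) + 32;	/* Room for the separator and the count.  */
  if ((key = malloc (len)) == NULL)
    return NULL;

  if (existing != NULL)
    snprintf (key, len, "%s#%zu", name, n_inputs);
  else
    snprintf (key, len, "%s", name);

  inputs[n_inputs].name = key;
  inputs[n_inputs].dict = dict;
  n_inputs++;
  return key;
}
```

Because the element count only ever grows, each made-up key is distinct
from every earlier one, which is all that matters at creation time.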
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
  /* Existing: return it, or (if a different dict with the same name
     is already there) make up a new unique name.  Always use the actual name
     for the filename, because that needs to be ctf_open()ed.  */

  if ((input = ctf_dynhash_lookup (fp->ctf_link_inputs, name)) != NULL)
    {
      if ((fp_input != NULL && (input->clin_fp == fp_input))
          || (ctf != NULL && (input->clin_arc == ctf)))
        return 0;
      existing = 1;
    }

  if ((filename = strdup (name)) == NULL)
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
    goto oom;

libctf: fix linking together multiple objects derived from the same source
2022-06-11 00:05:50 +08:00
  if ((input = calloc (1, sizeof (ctf_link_input_t))) == NULL)
2022-07-31 21:21:55 +08:00
    goto oom1;
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
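
The lazy-open idea above can be sketched as follows. This is a minimal
illustration under stated assumptions, not the libctf implementation: it
uses stdio's fopen in place of ctf_open, and struct lazy_input,
lazy_input_open and lazy_input_release are hypothetical names. The point it
shows is that an input given only by name is opened on first use and closed
again as soon as it is done with, while an input the caller opened is left
alone.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical link input: the caller either hands over an open stream, or
   only a name, in which case the stream is opened on demand.  */
struct lazy_input
{
  const char *name;	/* Always set.  */
  FILE *fp;		/* NULL until opened.  */
  int lazily_opened;	/* Nonzero if we opened it, so we must close it.  */
};

/* Return the stream for IN, opening it by name if nobody has yet.
   Returns NULL if the open fails.  */
static FILE *
lazy_input_open (struct lazy_input *in)
{
  assert (in != NULL && in->name != NULL);

  if (in->fp == NULL)
    {
      in->fp = fopen (in->name, "rb");
      in->lazily_opened = (in->fp != NULL);
    }
  return in->fp;
}

/* Close IN again, but only if we were the ones who opened it.  */
static void
lazy_input_release (struct lazy_input *in)
{
  if (in->lazily_opened && in->fp != NULL)
    {
      fclose (in->fp);
      in->fp = NULL;
      in->lazily_opened = 0;
    }
}
```

Tracking who opened the stream is what keeps the release path from closing
something the caller still owns, or from leaving something we opened
lying around.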
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
  input->clin_arc = ctf;
  input->clin_fp = fp_input;
libctf: fix linking together multiple objects derived from the same source
2022-06-11 00:05:50 +08:00
  input->clin_filename = filename;
libctf, link: add lazy linking: clean up input members: err/warn cleanup
2020-06-05 02:28:52 +08:00
  input->n = ctf_dynhash_elements (fp->ctf_link_inputs);

libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
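As a rough illustration of the input-side naming described above, the following C sketch builds a hash key from a name and the current number of inputs, using the same "%s#%li" format as the asprintf call in ctf_link_add_ctf_internal. The helper name make_input_key and its parameters are hypothetical and not part of libctf; snprintf stands in for asprintf only to keep the sketch portable.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libctf function): return a newly allocated
   key for an input called NAME.  If CLASHES is nonzero, an input with
   this name already exists, so append '#' and the current input count
   to make the key distinct.  Returns NULL on allocation failure; the
   caller frees the result.  */
char *
make_input_key (const char *name, int clashes, size_t n_inputs)
{
  char *key;
  int len;

  if (!clashes)
    {
      /* No clash: the key is just a copy of the name.  */
      size_t size = strlen (name) + 1;

      if ((key = malloc (size)) == NULL)
        return NULL;
      memcpy (key, name, size);
      return key;
    }

  /* First pass measures the formatted length, second pass writes it.  */
  len = snprintf (NULL, 0, "%s#%li", name, (long int) n_inputs);
  if (len < 0)
    return NULL;
  if ((key = malloc ((size_t) len + 1)) == NULL)
    return NULL;
  snprintf (key, (size_t) len + 1, "%s#%li", name, (long int) n_inputs);
  return key;
}
```

With three inputs already present, a clashing "foo.o" would be keyed as "foo.o#3" under this sketch, while a non-clashing name is stored unchanged.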
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
  if (existing)
    {
      if (asprintf (&keyname, "%s#%li", name, (long int)
                    ctf_dynhash_elements (fp->ctf_link_inputs)) < 0)
2022-07-31 21:21:55 +08:00
        goto oom2;
2022-06-11 00:05:50 +08:00
    }
  else if ((keyname = strdup (name)) == NULL)
2022-07-31 21:21:55 +08:00
    goto oom2;
2022-06-11 00:05:50 +08:00

  if (ctf_dynhash_insert (fp->ctf_link_inputs, keyname, input) < 0)
2022-07-31 21:21:55 +08:00
    goto oom3;
2019-07-14 04:06:55 +08:00

  return 0;
2022-07-31 21:21:55 +08:00

 oom3:
  free (keyname);
 oom2:
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name> Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
  free (input);
2022-07-31 21:21:55 +08:00
 oom1:
2022-06-11 00:05:50 +08:00
  free (filename);
2022-07-31 21:21:55 +08:00
 oom:
2020-06-05 02:28:52 +08:00
  return ctf_set_errno (fp, ENOMEM);
}
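The oom3/oom2/oom1/oom labels above form a staged cleanup ladder: each failure jumps to the label that frees only what has been allocated so far, then falls through the labels below it. A minimal standalone sketch of that pattern follows. The struct and function names are hypothetical, and plain malloc/calloc stand in for the libctf allocations, so this illustrates the idiom rather than reproducing the libctf function.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a link input: owns two strings.  */
struct input_sketch
{
  char *filename;
  char *keyname;
};

/* Allocate an input for NAME.  Returns 0 and stores the new input in
   *OUT on success; returns ENOMEM and leaks nothing on failure.  */
int
input_sketch_create (const char *name, struct input_sketch **out)
{
  struct input_sketch *input;
  char *filename, *keyname;
  size_t size = strlen (name) + 1;

  if ((filename = malloc (size)) == NULL)
    goto oom;                   /* Nothing allocated yet.  */
  memcpy (filename, name, size);

  if ((input = calloc (1, sizeof (*input))) == NULL)
    goto oom1;                  /* Only filename to free.  */

  if ((keyname = malloc (size)) == NULL)
    goto oom2;                  /* Free input, then filename.  */
  memcpy (keyname, name, size);

  input->filename = filename;
  input->keyname = keyname;
  *out = input;
  return 0;

 oom2:
  free (input);
 oom1:
  free (filename);
 oom:
  return ENOMEM;
}

/* Release an input created by input_sketch_create.  */
void
input_sketch_free (struct input_sketch *input)
{
  if (input == NULL)
    return;
  free (input->keyname);
  free (input->filename);
  free (input);
}
```

Ordering the labels in reverse allocation order means each new allocation needs only one extra label and one extra free, and no path frees a pointer twice.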

/* Add a file, memory buffer, or unopened file (by name) to a link.

   You can call this with:

    CTF and NAME: link the passed ctf_archive_t, with the given NAME.
    NAME alone: open NAME as a CTF file when needed.
    BUF and NAME: open the BUF (of length N) as CTF, with the given NAME.  (Not
    yet implemented.)

   Passed in CTF args are owned by the dictionary and will be freed by it.
   The BUF arg is *not* owned by the dictionary, and the user should not free
   its referent until the link is done.

   The order of calls to this function influences the order of types in the
   final link output, but otherwise is not important.

2022-06-11 00:05:50 +08:00
   Repeated additions of the same NAME have no effect; repeated additions of
   different dicts with the same NAME add all the dicts with unique NAMEs
   derived from NAME.
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
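The libctf versus libctf-nobfd distinction relies on checking at run time whether an optional function was linked in. A minimal sketch of that weak-symbol technique follows, assuming an ELF target and a GCC-compatible compiler. The name optional_open is hypothetical and stands in for ctf_open; this is not the libctf source.

```c
#include <stddef.h>

/* Hypothetical optional entry point.  Declared weak, so the program still
   links when nothing defines it; its address is then NULL.  */
extern int optional_open (const char *path) __attribute__ ((weak));

/* Return 1 if a definition of optional_open was linked in, else 0.  */
static int
have_optional_open (void)
{
  return optional_open != NULL;
}

/* Open PATH through the optional function if it is available; return -1
   (standing in for an ECTF_NEEDSBFD-style error) if it is not.  */
static int
open_lazily (const char *path)
{
  if (!have_optional_open ())
    return -1;
  return optional_open (path);
}
```

When no library provides optional_open, open_lazily reports failure instead of crashing through a null function pointer. Static libraries cannot rely on this, which is why the commit also sets an explicit NOBFD macro.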
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
   Private for now, but may in time become public once support for BUF is
   implemented. */

static int
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
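A rename that keeps old callers working can be sketched as follows. The demo_* names are hypothetical. The typedef alias is only one way to keep the old type name compiling and is not claimed to match the real ctf-api.h. The trivial wrapper function mirrors what the ChangeLog below describes for ctf_file_close.

```c
#include <stdlib.h>

/* New name for the type (hypothetical).  */
typedef struct demo_dict
{
  int refcnt;
} demo_dict_t;

/* Old name kept as an alias so existing source still compiles.  */
typedef demo_dict_t demo_file_t;

/* Counts freed dicts; exists only so the sketch can be checked.  */
static int demo_freed;

/* Allocate a dict holding one reference, or return NULL on failure.  */
static demo_dict_t *
demo_dict_new (void)
{
  demo_dict_t *d = calloc (1, sizeof (*d));

  if (d != NULL)
    d->refcnt = 1;
  return d;
}

/* New-style close: drop one reference, freeing on the last.  */
static void
demo_dict_close (demo_dict_t *d)
{
  if (d != NULL && --d->refcnt == 0)
    {
      free (d);
      demo_freed++;
    }
}

/* Old-style name: a trivial wrapper kept for compatibility.  */
static void
demo_file_close (demo_file_t *d)
{
  demo_dict_close (d);
}
```

Because the wrapper remains a real function rather than a macro, a binary built against the old name still finds its symbol, which is the ABI property the message relies on.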
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_add (ctf_dict_t *fp, ctf_archive_t *ctf, const char *name,
              void *buf _libctf_unused_, size_t n _libctf_unused_)
{
  if (buf)
    return (ctf_set_errno (fp, ECTF_NOTYET));

  if (!((ctf && name && !buf)
        || (name && !buf && !ctf)
        || (buf && name && !ctf)))
    return (ctf_set_errno (fp, EINVAL));

  /* We can only lazily open files if libctf.so is in use rather than
     libctf-nobfd.so. This is a little tricky: in shared libraries, we can use
     a weak symbol so that -lctf -lctf-nobfd works, but in static libraries we
     must distinguish between the two libraries explicitly. */

#if defined (PIC)
  if (!buf && !ctf && name && !ctf_open)
    return (ctf_set_errno (fp, ECTF_NEEDSBFD));
#elif NOBFD
  if (!buf && !ctf && name)
    return (ctf_set_errno (fp, ECTF_NEEDSBFD));
#endif

  if (fp->ctf_link_outputs)
    return (ctf_set_errno (fp, ECTF_LINKADDEDLATE));
  if (fp->ctf_link_inputs == NULL)
    fp->ctf_link_inputs = ctf_dynhash_create (ctf_hash_string,
                                              ctf_hash_eq_string, free,
                                              ctf_link_input_close);

  if (fp->ctf_link_inputs == NULL)
    return (ctf_set_errno (fp, ENOMEM));

  return ctf_link_add_ctf_internal (fp, ctf, NULL, name);
}
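
The argument check at the top of ctf_link_add accepts exactly three combinations of CTF, NAME and BUF. A standalone model of that predicate is sketched below. The function name link_add_args_ok is hypothetical, and the real function rejects any non-null BUF with ECTF_NOTYET before this check is reached.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the EINVAL check: true for the accepted
   combinations of arguments, false otherwise.  */
static bool
link_add_args_ok (const void *ctf, const char *name, const void *buf)
{
  return (ctf && name && !buf)      /* Already-opened archive plus name.  */
         || (name && !buf && !ctf)  /* Name only: open the file lazily.  */
         || (buf && name && !ctf);  /* Raw buffer plus name (reserved).  */
}
```

Every accepted form requires a NAME, and CTF and BUF are mutually exclusive.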

/* Add an opened CTF archive or unopened file (by name) to a link.
   If CTF is NULL and NAME is non-null, an unopened file is meant:
   otherwise, the specified archive is assumed to have the given NAME.

   Passed in CTF args are owned by the dictionary and will be freed by it.

   The order of calls to this function influences the order of types in the
   final link output, but otherwise is not important. */

int
ctf_link_add_ctf (ctf_dict_t *fp, ctf_archive_t *ctf, const char *name)
{
  return ctf_link_add (fp, ctf, name, NULL, 0);
}

2021-03-18 20:37:52 +08:00
/* Lazily open a CTF archive for linking, if not already open.

   Returns the number of files contained within the opened archive (0 for none),
   or -1 on error, as usual. */
static ssize_t
ctf_link_lazy_open (ctf_dict_t *fp, ctf_link_input_t *input)
{
  size_t count;
  int err;

  if (input->clin_arc)
    return ctf_archive_count (input->clin_arc);

  if (input->clin_fp)
    return 1;

  /* See ctf_link_add_ctf. */
#if defined (PIC) || !NOBFD
  input->clin_arc = ctf_open (input->clin_filename, NULL, &err);
#else
  ctf_err_warn (fp, 0, ECTF_NEEDSBFD, _("cannot open %s lazily"),
                input->clin_filename);
2023-09-13 17:02:36 +08:00
  return ctf_set_errno (fp, ECTF_NEEDSBFD);
2021-03-18 20:37:52 +08:00
#endif

  /* Having no CTF sections is not an error. We just don't need to do
     anything. */

  if (!input->clin_arc)
    {
      if (err == ECTF_NOCTFDATA)
        return 0;

      ctf_err_warn (fp, 0, err, _("opening CTF %s failed"),
                    input->clin_filename);
2023-09-13 17:02:36 +08:00
      return ctf_set_errno (fp, err);
2021-03-18 20:37:52 +08:00
    }

  if ((count = ctf_archive_count (input->clin_arc)) == 0)
    ctf_arc_close (input->clin_arc);

  return (ssize_t) count;
}

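The open-on-first-use, close-as-soon-as-possible pattern used by ctf_link_lazy_open can be modelled without libctf. The sketch below is only an illustration: lazy_input_t, lazy_get, lazy_release and the demo callbacks are hypothetical names, and the opener simply hands back the file name as its "handle".

```c
#include <stddef.h>

/* Hypothetical lazily opened input.  */
typedef struct lazy_input
{
  const char *filename;
  void *handle;                    /* NULL until first use.  */
  void *(*opener) (const char *);  /* Callback that really opens.  */
  int opens;                       /* Number of times opener was called.  */
} lazy_input_t;

/* Return the cached handle, opening the input first if necessary.  */
static void *
lazy_get (lazy_input_t *in)
{
  if (in->handle == NULL)
    {
      in->handle = in->opener (in->filename);
      in->opens++;
    }
  return in->handle;
}

/* Close the input early to save memory; a later lazy_get reopens it.  */
static void
lazy_release (lazy_input_t *in, void (*closer) (void *))
{
  if (in->handle != NULL)
    {
      closer (in->handle);
      in->handle = NULL;
    }
}

/* Demo callbacks: "opening" just returns the name as an opaque handle.  */
static void *
demo_open (const char *name)
{
  return (void *) name;
}

static void
demo_close (void *handle)
{
  (void) handle;
}
```

Two consecutive lazy_get calls run the opener once. After lazy_release, the next lazy_get opens the input again, which trades repeated opens for a lower peak memory footprint.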
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
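
The two-way link described in the commit message above (an input dict points at its per-CU output, and that output points back at its input) can be sketched as follows. This is a minimal illustration under stated assumptions: toy_dict_t, its link_in_out field and toy_per_cu_output are hypothetical stand-ins, not the libctf structures, and the caller supplies the fresh output dict rather than having it created internally.

```c
#include <stddef.h>

typedef struct toy_dict toy_dict_t;

struct toy_dict
{
  const char *cuname;
  /* For an input: its per-CU output, if any.  For an output: its input.  */
  toy_dict_t *link_in_out;
};

/* Return the per-CU output for INPUT.  The first call wires FRESH up in
   both directions; later calls return the existing output and leave
   FRESH untouched.  */
toy_dict_t *
toy_per_cu_output (toy_dict_t *input, toy_dict_t *fresh)
{
  if (input->link_in_out != NULL)
    return input->link_in_out;          /* Already mapped.  */

  fresh->cuname = input->cuname;
  fresh->link_in_out = input;
  input->link_in_out = fresh;
  return fresh;
}
```

Because the lookup goes through the pointer rather than the CU name, two inputs that share a cuname still end up with distinct outputs in this sketch.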
/* Find a non-clashing unique name for a per-CU output dict, to prevent distinct
   members corresponding to inputs with identical cunames from overwriting each
   other.  The name should be something like NAME.  */

static char *
ctf_new_per_cu_name (ctf_dict_t *fp, const char *name)
{
  char *dynname;
  long int i = 0;

  if ((dynname = strdup (name)) == NULL)
    return NULL;

  while ((ctf_dynhash_lookup (fp->ctf_link_outputs, dynname)) != NULL)
    {
      free (dynname);
      if (asprintf (&dynname, "%s#%li", name, i++) < 0)
        return NULL;
    }

  return dynname;
}
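
The naming scheme above (NAME if it is free, otherwise NAME#0, NAME#1 and so on) can be tried out in isolation. The sketch below is an assumption-laden stand-in, not the libctf code: name_set_t replaces the ctf_link_outputs hash with a small fixed array, unique_name is a hypothetical helper, and snprintf into one buffer is used in place of asprintf to stay within standard C.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the output hash: names already taken.  */
typedef struct name_set
{
  const char *names[8];
  size_t count;
} name_set_t;

int
name_set_contains (const name_set_t *set, const char *name)
{
  for (size_t i = 0; i < set->count; i++)
    if (strcmp (set->names[i], name) == 0)
      return 1;
  return 0;
}

/* Return a malloc'd name not present in SET: NAME itself if it is free,
   otherwise NAME#0, NAME#1, ...  Returns NULL on allocation failure.  */
char *
unique_name (const name_set_t *set, const char *name)
{
  size_t len = strlen (name) + 32;      /* Room for '#', a long, and NUL.  */
  char *candidate = malloc (len);
  long i = 0;

  if (candidate == NULL)
    return NULL;

  snprintf (candidate, len, "%s", name);
  while (name_set_contains (set, candidate))
    snprintf (candidate, len, "%s#%li", name, i++);

  return candidate;
}
```

With "foo.c" and "foo.c#0" already taken, asking for "foo.c" walks past both and settles on "foo.c#1"; the caller owns the returned string and frees it.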

/* Return a per-CU output CTF dictionary suitable for the given INPUT or CU,
   creating and interning it if need be.  */
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
static ctf_dict_t *
ctf_create_per_cu (ctf_dict_t *fp, ctf_dict_t *input, const char *cu_name)
{
  ctf_dict_t *cu_fp;
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
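
The member-name-changer contract described in the commit message above (the callback sees every member name in turn and returns either a replacement or NULL for "no change") can be illustrated with a small self-contained sketch. The callback type and helpers here are hypothetical and deliberately simpler than the real libctf signature, which is not reproduced: name_changer_f, rename_rule_t, rename_one and apply_name_changer are all invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback shape: return a replacement name, or NULL to
   keep NAME as it is.  */
typedef const char *name_changer_f (const char *name, void *arg);

/* Illustrative rule passed through ARG: rename FROM to TO.  */
typedef struct rename_rule
{
  const char *from;
  const char *to;
} rename_rule_t;

const char *
rename_one (const char *name, void *arg)
{
  const rename_rule_t *rule = arg;

  return strcmp (name, rule->from) == 0 ? rule->to : NULL;
}

/* Apply CHANGER to each of the N names in place, keeping any name for
   which it returns NULL.  Returns how many names changed.  */
size_t
apply_name_changer (const char **names, size_t n,
                    name_changer_f *changer, void *arg)
{
  size_t changed = 0;

  for (size_t i = 0; i < n; i++)
    {
      const char *new_name = changer (names[i], arg);

      if (new_name != NULL)
        {
          names[i] = new_name;
          changed++;
        }
    }
  return changed;
}
```

As the commit message warns, a changer should avoid mapping two input names to the same output name; this sketch does nothing to detect that case.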
  const char *ctf_name = NULL;
  char *dynname = NULL;

  /* Already has a per-CU mapping?  Just return it.  */

  if (input && input->ctf_link_in_out)
    return input->ctf_link_in_out;

  /* Check the mapping table and translate the per-CU name we use
libctf, include: remove the nondeduplicating CTF linker
The nondeduplicating CTF linker was kept around when the deduplicating
one was added so that people had something to fall back to in case the
deduplicating linker turned out to be buggy. It's now much more stable
than the nondeduplicating linker, in addition to much faster, using much
less memory and producing much better output. In addition, while
libctf has a linker flag to invoke the nondeduplicating linker, ld does
not expose it: the only way to turn it on within ld is an intentionally-
undocumented environment variable. So we can remove it without any ABI
or user-visibility concerns (the only thing we leave around is the
CTF_LINK_NONDEDUP flag, which can easily be interpreted as "deduplicate
less", though right now it does nothing).
This lets us remove a lot of complexity associated with tracking
filenames and CU names separately (something the deduplicating linker
never bothered with, since the cunames are always reliable and ld never
hands us useful filenames anyway).
The biggest lacuna left behind is the ctf_type_mapping machinery, which
slows down deduplicating links quite a lot. We can't just ditch it
because ctf_add_type uses it: removing the slowdown from the
deduplicating linker is a job for another commit.
include/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_SHARE_DUPLICATED): Note that this might
merely change how much deduplication is done.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_create_per_cu): Drop FILENAME now that it is
always identical to CUNAME.
(ctf_link_deduplicating_one_symtypetab): Adjust.
(ctf_link_one_type): Remove.
(ctf_link_one_input_archive_member): Likewise.
(ctf_link_close_one_input_archive): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link): No longer call it. Drop CTF_LINK_NONDEDUP path.
Improve header comment a bit (dicts, not files). Adjust
ctf_create_per_cu call.
(ctf_link_deduplicating_variables): Simplify.
(ctf_link_in_member_cb_arg_t) <cu_name>: Remove.
<in_input_cu_file>: Likewise.
<in_fp_parent>: Likewise.
<done_parent>: Likewise.
(ctf_link_one_variable): Turn uses of in_file_name to in_cuname.
2021-03-02 23:10:05 +08:00
     accordingly.  */
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mappping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
|
|
|
|
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
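The output-side naming can be sketched roughly as follows. This is an
illustration only, not the code in ctf_new_per_cu_name: the helper name
clash_name and the caller-owned counter are assumptions, and the "#" separator
is taken from the "#0" suffix mentioned in a later commit message.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated "NAME#N" string and bump
   the caller's counter, so successive clashes get #0, #1, ...  */
static char *
clash_name (const char *name, long *counter)
{
  /* First pass measures the formatted length, second pass writes it.  */
  int len = snprintf (NULL, 0, "%s#%ld", name, *counter);
  char *out;

  if (len < 0 || (out = malloc ((size_t) len + 1)) == NULL)
    return NULL;
  snprintf (out, (size_t) len + 1, "%s#%ld", name, *counter);
  (*counter)++;
  return out;
}
```

The only property that matters is that distinct dicts end up with distinct
names; the numbers themselves carry no meaning.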
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
  if (cu_name == NULL)
    cu_name = ctf_unnamed_cuname (input);
libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.
The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).
The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.
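The two directions of lookup can be illustrated with a small self-contained
sketch. libctf keeps these in hash tables; the version below deliberately
swaps in a fixed-size array with linear scans to stay short, and every name in
it (cu_mappings, add_mapping, lookup_out, lookup_ins) is hypothetical:

```c
#include <stddef.h>
#include <string.h>

#define MAX_MAPPINGS 16

/* One FROM (input CU name) -> TO (output dict name) pair.  */
struct cu_mapping
{
  const char *from;
  const char *to;
};

struct cu_mappings
{
  struct cu_mapping pairs[MAX_MAPPINGS];
  size_t n;
};

/* Record a mapping.  Returns 0 on success, -1 if the table is full.  */
static int
add_mapping (struct cu_mappings *m, const char *from, const char *to)
{
  if (m->n == MAX_MAPPINGS)
    return -1;
  m->pairs[m->n].from = from;
  m->pairs[m->n].to = to;
  m->n++;
  return 0;
}

/* Forward lookup: which output dict does this input CU go to?
   Returns NULL if the CU is unmapped.  */
static const char *
lookup_out (const struct cu_mappings *m, const char *from)
{
  for (size_t i = 0; i < m->n; i++)
    if (strcmp (m->pairs[i].from, from) == 0)
      return m->pairs[i].to;
  return NULL;
}

/* Inverse lookup: collect up to MAX input CUs mapped to this output
   dict into OUT, returning how many were stored.  */
static size_t
lookup_ins (const struct cu_mappings *m, const char *to,
	    const char **out, size_t max)
{
  size_t found = 0;

  for (size_t i = 0; i < m->n; i++)
    if (found < max && strcmp (m->pairs[i].to, to) == 0)
      out[found++] = m->pairs[i].from;
  return found;
}
```

An unmapped CU returns NULL from the forward lookup here; a caller would then
fall back to using the CU name itself as the output dict name.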
So rejig the ctf-link machinery to do that. Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link. This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it. (Nobody else should use it.)
Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.
include/
* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.
libctf/
* ctf-impl.h (ctf_file_t): Improve comments.
<ctf_link_cu_mapping>: Split into...
<ctf_link_in_cu_mapping>: ... this...
<ctf_link_out_cu_mapping>: ... and this.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Look things up in the
in_cu_mapping instead of the cu_mapping.
(ctf_link_add_cu_mapping): The deduplicating link will define
what happens if many FROMs share a TO.
(ctf_link_add_cu_mapping): Create in_cu_mapping and
out_cu_mapping. Do not create ctf_link_outputs here any more, or
create per-CU dicts here: they are already created when needed.
(ctf_link_one_variable): Log a debug message if we skip a
variable due to its type being concealed in a CU-mapped link.
(This is probably too common a case to make into a warning.)
(ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
  if (fp->ctf_link_in_cu_mapping)
    {
libctf, include: remove the nondeduplicating CTF linker
The nondeduplicating CTF linker was kept around when the deduplicating
one was added so that people had something to fall back to in case the
deduplicating linker turned out to be buggy. It's now much more stable
than the nondeduplicating linker, in addition to much faster, using much
less memory and producing much better output. In addition, while
libctf has a linker flag to invoke the nondeduplicating linker, ld does
not expose it: the only way to turn it on within ld is an intentionally-
undocumented environment variable. So we can remove it without any ABI
or user-visibility concerns (the only thing we leave around is the
CTF_LINK_NONDEDUP flag, which can easily be interpreted as "deduplicate
less", though right now it does nothing).
This lets us remove a lot of complexity associated with tracking
filenames and CU names separately (something the deduplicating linker
never bothered with, since the cunames are always reliable and ld never
hands us useful filenames anyway).
The biggest lacuna left behind is the ctf_type_mapping machinery, which
slows down deduplicating links quite a lot. We can't just ditch it
because ctf_add_type uses it: removing the slowdown from the
deduplicating linker is a job for another commit.
include/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_SHARE_DUPLICATED): Note that this might
merely change how much deduplication is done.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_create_per_cu): Drop FILENAME now that it is
always identical to CUNAME.
(ctf_link_deduplicating_one_symtypetab): Adjust.
(ctf_link_one_type): Remove.
(ctf_link_one_input_archive_member): Likewise.
(ctf_link_close_one_input_archive): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link): No longer call it. Drop CTF_LINK_NONDEDUP path.
Improve header comment a bit (dicts, not files). Adjust
ctf_create_per_cu call.
(ctf_link_deduplicating_variables): Simplify.
(ctf_link_in_member_cb_arg_t) <cu_name>: Remove.
<in_input_cu_file>: Likewise.
<in_fp_parent>: Likewise.
<done_parent>: Likewise.
(ctf_link_one_variable): Turn uses of in_file_name to in_cuname.
2021-03-02 23:10:05 +08:00
      if ((ctf_name = ctf_dynhash_lookup (fp->ctf_link_in_cu_mapping,
                                          cu_name)) == NULL)
        ctf_name = cu_name;
    }

  if (ctf_name == NULL)
    ctf_name = cu_name;
libctf, link: fix CU-mapped links with CTF_LINK_EMPTY_CU_MAPPINGS
This is a bug in the intersection of two obscure options that cannot
even be invoked from ld with a feature added to stop linking the same
input file repeatedly from crashing the linker.
The latter fix involved tracking input files (internally to libctf) not
just with their input CU name but with a version of their input CU name
that was augmented with a numeric suffix if their linker input file name
was changed, to prevent distinct CTF dicts with the same cuname from
overwriting each other. (We can't use just the linker input file name
because one linker input can contain many CU dicts, particularly under
ld -r). If these inputs then produced conflicting types, those types
were emitted into similarly-named output dicts, so we needed similar
machinery to detect clashing output dicts and add a numeric suffix to
them as well.
This works fine, except that if you used the cu-mapping feature to force
double-linking of CTF (so that your CTF can be grouped into output dicts
larger than a single translation unit) and then also used
CTF_LINK_EMPTY_CU_MAPPINGS to force every possible output dict in the
mapping to be created (even if empty), we did the creation of empty dicts
first, and then all the actual content got considered to be a clash. So
you ended up with a pile of useless empty dicts and then all the content
was in full dicts with the same names suffixed with a #0. This seems
likely to confuse consumers that use this facility.
Fixed by generating all the EMPTY_CU_MAPPINGS empty dicts after linking
is complete, not before it runs.
No impact on ld, which does not do cu-mapped links or pass
CTF_LINK_EMPTY_CU_MAPPINGS to ctf_link().
libctf/
* ctf-link.c (ctf_create_per_cu): Don't create new dicts iff one
already exists and we are making one for no input in particular.
(ctf_link): Emit empty CTF dicts corresponding to no input in
particular only after linking is complete.
2023-04-08 03:09:24 +08:00
  /* Look up the per-CU dict. If we don't know of one, or it is for a different input
     CU which just happens to have the same name, create a new one. If we are creating
     a dict with no input specified, anything will do. */
  if ((cu_fp = ctf_dynhash_lookup (fp->ctf_link_outputs, ctf_name)) == NULL
      || (input && cu_fp->ctf_link_in_out != fp))
2019-07-14 04:41:25 +08:00
    {
      int err;

      if ((cu_fp = ctf_create (&err)) == NULL)
        {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
|
|
|
ctf_err_warn (fp, 0, err, _("cannot create per-CU CTF archive for "
|
libctf, include: remove the nondeduplicating CTF linker
The nondeduplicating CTF linker was kept around when the deduplicating
one was added so that people had something to fall back to in case the
deduplicating linker turned out to be buggy. It's now much more stable
than the nondeduplicating linker, in addition to much faster, using much
less memory and producing much better output. In addition, while
libctf has a linker flag to invoke the nondeduplicating linker, ld does
not expose it: the only way to turn it on within ld is an intentionally-
undocumented environment variable. So we can remove it without any ABI
or user-visibility concerns (the only thing we leave around is the
CTF_LINK_NONDEDUP flag, which can easily be interpreted as "deduplicate
less", though right now it does nothing).
This lets us remove a lot of complexity associated with tracking
filenames and CU names separately (something the deduplicating linker
never bothered with, since the cunames are always reliable and ld never
hands us useful filenames anyway).
The biggest lacuna left behind is the ctf_type_mapping machinery, which
slows down deduplicating links quite a lot. We can't just ditch it
because ctf_add_type uses it: removing the slowdown from the
deduplicating linker is a job for another commit.
include/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_SHARE_DUPLICATED): Note that this might
merely change how much deduplication is done.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_create_per_cu): Drop FILENAME now that it is
always identical to CUNAME.
(ctf_link_deduplicating_one_symtypetab): Adjust.
(ctf_link_one_type): Remove.
(ctf_link_one_input_archive_member): Likewise.
(ctf_link_close_one_input_archive): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link): No longer call it. Drop CTF_LINK_NONDEDUP path.
Improve header comment a bit (dicts, not files). Adjust
ctf_create_per_cu call.
(ctf_link_deduplicating_variables): Simplify.
(ctf_link_in_member_cb_arg_t) <cu_name>: Remove.
<in_input_cu_file>: Likewise.
<in_fp_parent>: Likewise.
<done_parent>: Likewise.
(ctf_link_one_variable): Turn uses of in_file_name to in_cuname.
2021-03-02 23:10:05 +08:00
|
|
|
"input CU %s"), cu_name);
|
2019-07-14 04:41:25 +08:00
|
|
|
ctf_set_errno (fp, err);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
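The output-side naming rule (append an incrementing integer, starting at 0, until the name no longer clashes) might look roughly like the sketch below. Everything here is an assumption made for illustration: model_new_per_cu_name, the name_taken_f predicate and the '#' separator are hypothetical, and the real ctf_new_per_cu_name consults libctf's own hashtables rather than a caller-supplied list.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical predicate: returns nonzero if NAME is already in use.  */
typedef int name_taken_f (const char *name, void *arg);

/* Example predicate: ARG is a NULL-terminated array of names in use.  */
int
model_name_in_list (const char *name, void *arg)
{
  const char **names = arg;

  for (; *names != NULL; names++)
    if (strcmp (*names, name) == 0)
      return 1;
  return 0;
}

/* Return a malloc'd name based on CU_NAME that TAKEN does not report as
   in use, or NULL on allocation failure.  On a clash, try CU_NAME#0,
   CU_NAME#1, ... in turn.  The '#' separator is an assumption.  */
char *
model_new_per_cu_name (const char *cu_name, name_taken_f *taken, void *arg)
{
  /* Room for the name, the separator, the digits and the NUL.  */
  size_t len = strlen (cu_name) + 32;
  char *name = malloc (len);
  unsigned long suffix = 0;

  if (name == NULL)
    return NULL;

  snprintf (name, len, "%s", cu_name);
  while (taken (name, arg))
    snprintf (name, len, "%s#%lu", cu_name, suffix++);
  return name;
}
```

The result is arbitrary but stable for a given set of existing names, which is all the paragraph above asks of it: distinct dicts get distinct names.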
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
|
|
|
ctf_import_unref (cu_fp, fp);
|
|
|
|
|
|
|
|
if ((dynname = ctf_new_per_cu_name (fp, ctf_name)) == NULL)
|
2019-07-14 04:41:25 +08:00
|
|
|
goto oom;
|
|
|
|
|
2021-03-02 23:10:05 +08:00
|
|
|
ctf_cuname_set (cu_fp, cu_name);
|
2022-06-11 00:05:50 +08:00
|
|
|
|
2019-07-14 04:41:25 +08:00
|
|
|
ctf_parent_name_set (cu_fp, _CTF_SECTION);
|
2022-06-11 00:05:50 +08:00
|
|
|
cu_fp->ctf_link_in_out = fp;
|
|
|
|
fp->ctf_link_in_out = cu_fp;
|
|
|
|
|
|
|
|
if (ctf_dynhash_insert (fp->ctf_link_outputs, dynname, cu_fp) < 0)
|
|
|
|
goto oom;
|
2019-07-14 04:41:25 +08:00
|
|
|
}
|
|
|
|
return cu_fp;
|
|
|
|
|
|
|
|
oom:
|
|
|
|
free (dynname);
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
|
|
|
ctf_dict_close (cu_fp);
|
2019-07-14 04:41:25 +08:00
|
|
|
ctf_set_errno (fp, ENOMEM);
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
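A member-name changer of the kind described in the second item could be sketched as follows. This is a self-contained model under stated assumptions: the callback shape (dictionary, current name, caller argument, returning a new name or NULL) follows the description above, but struct model_dict, MODEL_SHARED_NAME and model_module_name_changer are hypothetical stand-ins, and the policy of renaming only the shared member is an illustrative choice rather than what any real consumer does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Opaque stand-in for the dictionary type; this callback ignores it.  */
struct model_dict;

/* Assumed name of the default (shared) archive member.  */
#define MODEL_SHARED_NAME ".ctf"

/* Assumed callback shape: returns a new member name, or NULL if no
   change is desired.  */
typedef const char *model_memb_name_changer_f (struct model_dict *fp,
					       const char *name, void *arg);

/* Rename the shared member to "<module>.ctf", where ARG is the module
   name; leave every other member name alone.  Returns a malloc'd
   string, or NULL for "no change" (and, in this model, also on
   allocation failure).  */
const char *
model_module_name_changer (struct model_dict *fp, const char *name,
			   void *arg)
{
  const char *module = arg;
  size_t len;
  char *new_name;

  (void) fp;
  if (strcmp (name, MODEL_SHARED_NAME) != 0)
    return NULL;

  len = strlen (module) + strlen (".ctf") + 1;
  if ((new_name = malloc (len)) == NULL)
    return NULL;
  snprintf (new_name, len, "%s.ctf", module);
  return new_name;
}
```

Because only one input name is ever rewritten, this changer cannot map two input names to the same output name, which is the pitfall the item above warns about. Returning allocated storage matches the ChangeLog's note that names produced by the changer are stored and later freed.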
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
|
|
|
/* Add a mapping directing that the CU named FROM should have its
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
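
The compatibility scheme described above can be sketched in miniature as
follows. This is an illustration only, not the libctf source: every example_
name is hypothetical, and the old type name is a plain alias here, so the
sketch does not reproduce the pointer-compatibility warnings mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature dictionary type.  The real ctf_dict_t is opaque;
   this one only carries a reference count.  */
typedef struct example_dict
{
  int refcnt;
} example_dict_t;

/* Old spelling, kept so that existing callers still compile.  In this
   sketch it is simply an alias for the new type.  */
typedef example_dict_t example_file_t;

/* New name: drop one reference.  A null pointer is ignored.  */
void
example_dict_close (example_dict_t *dict)
{
  if (dict == NULL)
    return;
  assert (dict->refcnt > 0);
  dict->refcnt--;
}

/* Old name: a trivial wrapper around the new function, so the old symbol
   stays available to callers that have not been updated.  */
void
example_file_close (example_file_t *dict)
{
  example_dict_close (dict);
}
```

Because the old entry point only forwards to the new one, callers of either
name get identical behaviour.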
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
|
|
|
conflicting/non-duplicate types (depending on link mode) go into a dict
|
libctf: adding CU mappings should be idempotent
When CTF finds conflicting types, it usually shoves each definition
into a CTF dictionary named after the compilation unit.
The intent of the obscure "cu-mapped link" feature is to allow you to
implement custom linkers that shove the definitions into other, more
coarse-grained units (say, one per kernel module, even if a module consists
of more than one compilation unit): conflicting types within one of these
larger components are hidden from name lookup so you can only look up (an
arbitrary one of) them by name, but can still be found by chasing type graph
links and are still fully deduplicated.
You do this by calling
ctf_link_add_cu_mapping (fp, "CU name", "bigger lump name"), repeatedly,
with different "CU name"s: the ctf_link() following that will put all
conflicting types found in "CU name"s sharing a "bigger lump name" into a
child dict in an archive member named "bigger lump name".
So it's clear enough what happens if you call it with the same
"bigger lump name" more than once, because that's the whole point of it: but
what if you call it with the same "CU name" repeatedly?
ctf_link_add_cu_mapping (fp, "CU name", "bigger lump name");
ctf_link_add_cu_mapping (fp, "CU name", "other name");
This is meant to be the same as just doing the second of these, as if the
first was never called. Alas, this isn't what happens, and what you get is
instead a bit of an inconsistent mess: more or less, the first takes
precedence, which is the exact opposite of what we wanted.
Fix this to work the right way round.
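
The intended "last call wins" semantics can be modelled with a small
standalone table. This is a sketch under stated assumptions, not the libctf
implementation: struct cu_map, cu_map_add and cu_map_lookup are hypothetical
names, and the fixed-size array stands in for whatever storage libctf really
uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the CU -> output dict mapping.  */
#define MAX_MAPPINGS 16

struct cu_mapping
{
  const char *from;   /* Input CU name.  */
  const char *to;     /* Output dict ("bigger lump") name.  */
};

struct cu_map
{
  struct cu_mapping entries[MAX_MAPPINGS];
  size_t n;
};

/* Add FROM -> TO.  If FROM is already mapped, replace its TO, so that the
   most recent call wins.  Returns 0 on success, -1 if the table is full.  */
static int
cu_map_add (struct cu_map *map, const char *from, const char *to)
{
  size_t i;

  assert (map != NULL && from != NULL && to != NULL);

  for (i = 0; i < map->n; i++)
    if (strcmp (map->entries[i].from, from) == 0)
      {
        map->entries[i].to = to;
        return 0;
      }

  if (map->n == MAX_MAPPINGS)
    return -1;

  map->entries[map->n].from = from;
  map->entries[map->n].to = to;
  map->n++;
  return 0;
}

/* Return the TO that FROM maps to, or NULL if FROM has no mapping.  */
static const char *
cu_map_lookup (const struct cu_map *map, const char *from)
{
  size_t i;

  assert (map != NULL && from != NULL);

  for (i = 0; i < map->n; i++)
    if (strcmp (map->entries[i].from, from) == 0)
      return map->entries[i].to;
  return NULL;
}
```

In this model many FROMs may share one TO, while re-adding an existing FROM
only overwrites its TO and never creates a second entry.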
(I plan to add support for CU-mapped links to GNU ld, mainly so that we can
properly *test* this machinery.)
libctf/ChangeLog:
* ctf-link.c (ctf_create_per_cu): Note the behaviour of
repeatedly adding FROMs.
(ctf_link_add_cu_mapping): Implement that behaviour.
2023-11-08 05:11:18 +08:00
|
|
|
named TO. Many FROMs can share a TO, but adding the same FROM with
|
|
|
|
a different TO will replace the old mapping.
|
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
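
A member name changer of the kind described above might look like the
following sketch. It is self-contained and does not include the libctf
headers: struct example_dict and example_name_changer are hypothetical, the
parameter order (dictionary, name, user argument) is an assumption about the
callback's shape, and the ".ctf" suffix convention is only an example. The
returned name is heap-allocated on the assumption that the link machinery
frees it, as the ChangeLog below suggests.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical opaque stand-in for libctf's dictionary type, so that this
   sketch compiles without libctf.  */
struct example_dict;

/* Return a newly allocated copy of NAME with the suffix in ARG appended, or
   NULL (meaning "no change") if NAME already ends in that suffix.  The
   caller is assumed to free the result.  */
static const char *
example_name_changer (struct example_dict *dict, const char *name, void *arg)
{
  const char *suffix = arg;
  size_t name_len, suffix_len;
  char *new_name;

  (void) dict;                  /* Unused in this sketch.  */
  assert (name != NULL && suffix != NULL);

  name_len = strlen (name);
  suffix_len = strlen (suffix);

  /* Already suffixed: ask for no change.  */
  if (name_len >= suffix_len
      && strcmp (name + name_len - suffix_len, suffix) == 0)
    return NULL;

  /* On allocation failure this sketch also returns NULL, which leaves the
     name unchanged rather than reporting the error.  */
  if ((new_name = malloc (name_len + suffix_len + 1)) == NULL)
    return NULL;

  memcpy (new_name, name, name_len);
  memcpy (new_name + name_len, suffix, suffix_len + 1);
  return new_name;
}
```

A real changer should also take care not to map two different input names to
the same output name, for the reason given in the list above.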
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
|
|
|
|
|
|
|
We forcibly add a dict named TO in every case, even though it may well
|
|
|
|
wind up empty, because clients that use this facility usually expect to find
|
|
|
|
every TO dict present, even if empty, and malfunction otherwise. */
|
|
|
|
|
|
|
|
int
|
|
|
|
ctf_link_add_cu_mapping (ctf_dict_t *fp, const char *from, const char *to)
|
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
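The member-name changer described in this commit message is a callback that sees each archive member name and returns either a replacement or NULL for "no change". The following is a minimal stand-alone sketch of that contract, not code from libctf: the names toy_name_changer_f, module_name_changer and apply_changer are hypothetical, and the ".ctf.<name>" to "<name>.ctf" rewrite is only an assumed example of a custom naming scheme.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a member-name changer: given a member name and
   a caller-supplied argument, return a replacement name, or NULL to keep
   the name unchanged.  */
typedef const char *toy_name_changer_f (const char *name, void *arg);

/* Rename ".ctf.<something>" to "<something>.ctf", writing the result into
   the buffer passed via ARG (assumed to hold at least 256 bytes).  Any
   other name, including plain ".ctf", is left alone.  */
static const char *
module_name_changer (const char *name, void *arg)
{
  char *buf = arg;
  const char *prefix = ".ctf.";
  size_t plen = strlen (prefix);

  assert (name != NULL && buf != NULL);
  if (strncmp (name, prefix, plen) != 0 || name[plen] == '\0')
    return NULL;                /* No change desired.  */

  snprintf (buf, 256, "%s.ctf", name + plen);
  return buf;
}

/* Apply CHANGER to NAME, falling back to NAME when it returns NULL.  */
static const char *
apply_changer (toy_name_changer_f *changer, const char *name, void *arg)
{
  const char *renamed = changer (name, arg);
  return renamed != NULL ? renamed : name;
}
```

In this sketch the caller owns the storage for the new name, which sidesteps the question of who frees it; the real machinery tracks names allocated by the caller's changer separately, as the ChangeLog entry for dynames notes.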
{
  int err;
libctf: adding CU mappings should be idempotent
When CTF finds conflicting types, it usually shoves each definition
into a CTF dictionary named after the compilation unit.
The intent of the obscure "cu-mapped link" feature is to allow you to
implement custom linkers that shove the definitions into other, more
coarse-grained units (say, one per kernel module, even if a module consists
of more than one compilation unit): conflicting types within one of these
larger components are hidden from name lookup so you can only look up (an
arbitrary one of) them by name, but can still be found by chasing type graph
links and are still fully deduplicated.
You do this by calling
ctf_link_add_cu_mapping (fp, "CU name", "bigger lump name"), repeatedly,
with different "CU name"s: the ctf_link() following that will put all
conflicting types found in "CU name"s sharing a "bigger lump name" into a
child dict in an archive member named "bigger lump name".
So it's clear enough what happens if you call it repeatedly with the same
"bigger lump name" more than once, because that's the whole point of it: but
what if you call it with the same "CU name" repeatedly?
ctf_link_add_cu_mapping (fp, "CU name", "bigger lump name");
ctf_link_add_cu_mapping (fp, "CU name", "other name");
This is meant to be the same as just doing the second of these, as if the
first was never called. Alas, this isn't what happens, and what you get is
instead a bit of an inconsistent mess: more or less, the first takes
precedence, which is the exact opposite of what we wanted.
Fix this to work the right way round.
(I plan to add support for CU-mapped links to GNU ld, mainly so that we can
properly *test* this machinery.)
libctf/ChangeLog:
* ctf-link.c (ctf_create_per_cu): Note the behaviour of
repeatedly adding FROMs.
(ctf_link_add_cu_mapping): Implement that behaviour.
2023-11-08 05:11:18 +08:00
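The intended semantics of repeated calls with the same "CU name" (the last call wins, as if the earlier ones had never been made) can be illustrated with a toy table. This is a sketch under stated assumptions, not the libctf implementation: struct toy_cu_map, toy_add_cu_mapping and toy_lookup are hypothetical names, and a fixed-size array stands in for the real hash tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 16

/* Hypothetical toy model: at most one FROM -> TO entry per input CU name.  */
struct toy_cu_map
{
  const char *from[TOY_MAX];
  const char *to[TOY_MAX];
  size_t n;
};

/* Add FROM -> TO.  If FROM is already mapped, the old mapping is replaced,
   so only the most recent call for a given FROM has any effect.
   Returns 0 on success, -1 if the table is full.  */
static int
toy_add_cu_mapping (struct toy_cu_map *m, const char *from, const char *to)
{
  size_t i;

  assert (m != NULL && from != NULL && to != NULL);
  for (i = 0; i < m->n; i++)
    if (strcmp (m->from[i], from) == 0)
      {
        m->to[i] = to;
        return 0;
      }

  if (m->n == TOY_MAX)
    return -1;
  m->from[m->n] = from;
  m->to[m->n] = to;
  m->n++;
  return 0;
}

/* Return the output name FROM maps to, or NULL if it is unmapped.  */
static const char *
toy_lookup (const struct toy_cu_map *m, const char *from)
{
  size_t i;

  for (i = 0; i < m->n; i++)
    if (strcmp (m->from[i], from) == 0)
      return m->to[i];
  return NULL;
}
```

Adding "a.c" first to "bigger lump name" and then to "other name" leaves a single entry for "a.c" that points at "other name".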
  char *f = NULL, *t = NULL, *existing;
libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.
The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).
The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.
So rejig the ctf-link machinery to do that. Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link. This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it. (Nobody else should use it.)
Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.
include/
* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.
libctf/
* ctf-impl.h (ctf_file_t): Improve comments.
<ctf_link_cu_mapping>: Split into...
<ctf_link_in_cu_mapping>: ... this...
<ctf_link_out_cu_mapping>: ... and this.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Look things up in the
in_cu_mapping instead of the cu_mapping.
(ctf_link_add_cu_mapping): The deduplicating link will define
what happens if many FROMs share a TO.
(ctf_link_add_cu_mapping): Create in_cu_mapping and
out_cu_mapping. Do not create ctf_link_outputs here any more, or
create per-CU dicts here: they are already created when needed.
(ctf_link_one_variable): Log a debug message if we skip a
variable due to its type being concealed in a CU-mapped link.
(This is probably too common a case to make into a warning.)
(ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
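The two ideas in this commit message, an inverse mapping from each output name to its input CUs and a flag that forces otherwise-empty outputs to exist, can be sketched with a small stand-alone model. Everything here is hypothetical (struct toy_pair, toy_inputs_for_output, toy_output_emitted), and the model simplifies "has conflicting types" to "is listed in PRESENT"; it is not how libctf stores or walks its mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model of one FROM -> TO mapping.  */
struct toy_pair
{
  const char *from;
  const char *to;
};

/* Inverse lookup: store in OUT (capacity MAX) every FROM that maps to TO,
   and return how many were stored.  */
static size_t
toy_inputs_for_output (const struct toy_pair *pairs, size_t npairs,
                       const char *to, const char **out, size_t max)
{
  size_t i, found = 0;

  assert (pairs != NULL && to != NULL && out != NULL);
  for (i = 0; i < npairs && found < max; i++)
    if (strcmp (pairs[i].to, to) == 0)
      out[found++] = pairs[i].from;
  return found;
}

/* Decide whether a child dict named TO would be emitted in this model.
   PRESENT lists the input CUs that contribute types to a child dict.  With
   EMPTY_MAPPINGS nonzero (modelling CTF_LINK_EMPTY_CU_MAPPINGS), every
   mapped-to name is emitted even when none of its inputs are present.  */
static int
toy_output_emitted (const struct toy_pair *pairs, size_t npairs,
                    const char *to, const char **present, size_t npresent,
                    int empty_mappings)
{
  size_t i, j;
  int mapped = 0;

  for (i = 0; i < npairs; i++)
    {
      if (strcmp (pairs[i].to, to) != 0)
        continue;
      mapped = 1;
      for (j = 0; j < npresent; j++)
        if (strcmp (pairs[i].from, present[j]) == 0)
          return 1;
    }
  return mapped && empty_mappings;
}
```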
  ctf_dynhash_t *one_out;
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
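The output-side naming rule described in this commit message (keep the plain CU name when it is free, otherwise append an incrementing integer starting at 0) can be sketched as follows. This is an illustration only: toy_name_taken and toy_unique_cu_name are hypothetical helpers, and the '#' separator between name and number is an assumption of this sketch, not something the message specifies.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if NAME is already one of the N names in TAKEN.  */
static int
toy_name_taken (const char **taken, size_t n, const char *name)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (strcmp (taken[i], name) == 0)
      return 1;
  return 0;
}

/* Write into BUF a name based on CUNAME that is not in TAKEN.  The plain
   name is used when it is free; otherwise an incrementing integer,
   starting at 0, is appended.  The '#' separator is an assumption made
   for this sketch.  Returns BUF, or NULL if a candidate did not fit.  */
static char *
toy_unique_cu_name (const char **taken, size_t n, const char *cuname,
                    char *buf, size_t bufsize)
{
  unsigned long suffix = 0;
  int len;

  assert (cuname != NULL && buf != NULL && bufsize > 0);
  len = snprintf (buf, bufsize, "%s", cuname);
  while (len >= 0 && (size_t) len < bufsize)
    {
      if (!toy_name_taken (taken, n, buf))
        return buf;
      len = snprintf (buf, bufsize, "%s#%lu", cuname, suffix++);
    }
  return NULL;
}
```

As the message says, such names carry no meaning beyond being distinct; the point is only that two different dicts never end up sharing one name.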
  /* Mappings cannot be set up if per-CU output dicts already exist. */
  if (fp->ctf_link_outputs && ctf_dynhash_elements (fp->ctf_link_outputs) != 0)
    return (ctf_set_errno (fp, ECTF_LINKADDEDLATE));

libctf, link: redo cu-mapping handling
2020-06-06 00:36:16 +08:00
  if (fp->ctf_link_in_cu_mapping == NULL)
    fp->ctf_link_in_cu_mapping = ctf_dynhash_create (ctf_hash_string,
                                                     ctf_hash_eq_string, free,
                                                     free);
  if (fp->ctf_link_in_cu_mapping == NULL)
    goto oom;
libctf: add CU-mapping machinery
2019-07-20 21:44:44 +08:00
libctf, link: redo cu-mapping handling
2020-06-06 00:36:16 +08:00
  if (fp->ctf_link_out_cu_mapping == NULL)
    fp->ctf_link_out_cu_mapping = ctf_dynhash_create (ctf_hash_string,
                                                      ctf_hash_eq_string, free,
                                                      (ctf_hash_free_fun)
                                                      ctf_dynhash_destroy);
  if (fp->ctf_link_out_cu_mapping == NULL)
    goto oom;
libctf: add CU-mapping machinery
2019-07-20 21:44:44 +08:00
libctf: adding CU mappings should be idempotent
2023-11-08 05:11:18 +08:00
  /* If this FROM already exists, remove the mapping from both the FROM->TO
     and the TO->FROM lists: the user wants to change it. */

  if ((existing = ctf_dynhash_lookup (fp->ctf_link_in_cu_mapping, from)) != NULL)
    {
      one_out = ctf_dynhash_lookup (fp->ctf_link_out_cu_mapping, existing);
      if (!ctf_assert (fp, one_out))
        return -1; /* errno is set for us. */

      ctf_dynhash_remove (one_out, from);
      ctf_dynhash_remove (fp->ctf_link_in_cu_mapping, from);
    }

libctf, link: redo cu-mapping handling
2020-06-06 00:36:16 +08:00
  f = strdup (from);
  t = strdup (to);
  if (!f || !t)
    goto oom;
libctf: add CU-mapping machinery
2019-07-20 21:44:44 +08:00
libctf, link: redo cu-mapping handling
2020-06-06 00:36:16 +08:00
  /* Track both in a list from FROM to TO and in a list from TO to a list of
     FROM. The former is used to create TUs with the mapped-to name at need:
     the latter is used in deduplicating links to pull in all input CUs
     corresponding to a single output CU. */

  if ((err = ctf_dynhash_insert (fp->ctf_link_in_cu_mapping, f, t)) < 0)
    {
      ctf_set_errno (fp, err);
      goto oom_noerrno;
    }
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
2020-06-06 00:36:16 +08:00
  /* f and t are now owned by the in_cu_mapping: reallocate them. */
2019-07-20 21:44:44 +08:00
  f = strdup (from);
  t = strdup (to);
  if (!f || !t)
    goto oom;
2020-06-06 00:36:16 +08:00
  if ((one_out = ctf_dynhash_lookup (fp->ctf_link_out_cu_mapping, t)) == NULL)
    {
      if ((one_out = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
                                         free, NULL)) == NULL)
        goto oom;
      if ((err = ctf_dynhash_insert (fp->ctf_link_out_cu_mapping,
                                     t, one_out)) < 0)
        {
          ctf_dynhash_destroy (one_out);
          ctf_set_errno (fp, err);
          goto oom_noerrno;
        }
    }
  else
2022-12-08 09:15:12 +08:00
    {
      free (t);
      t = NULL;
    }
2019-07-20 21:44:44 +08:00
2020-06-06 00:36:16 +08:00
  if (ctf_dynhash_insert (one_out, f, NULL) < 0)
2019-07-20 21:44:44 +08:00
    {
      ctf_set_errno (fp, err);
      goto oom_noerrno;
    }

  return 0;

 oom:
  ctf_set_errno (fp, errno);
 oom_noerrno:
  free (f);
  free (t);
  return -1;
}

/* Set a function which is called to transform the names of archive members.
   This is useful for applying regular transformations to many names, where
   ctf_link_add_cu_mapping applies arbitrarily irregular changes to single
   names. The member name changer is applied at ctf_link_write time, so it
   cannot conflate multiple CUs into one the way ctf_link_add_cu_mapping can.
   The changer function accepts a name and should return a new
   dynamically-allocated name, or NULL if the name should be left unchanged. */
void
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_set_memb_name_changer (ctf_dict_t *fp,
2019-07-20 21:44:44 +08:00
                                ctf_link_memb_name_changer_f *changer,
                                void *arg)
{
  fp->ctf_link_memb_name_changer = changer;
  fp->ctf_link_memb_name_changer_arg = arg;
}
2020-06-06 01:15:26 +08:00
/* Set a function which is used to filter out unwanted variables from the link. */
int
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
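
The compatibility-wrapper pattern described above can be sketched in miniature as follows. This is a toy under stated assumptions, not libctf's real definitions: toy_dict_t, toy_dict_close and toy_file_close are hypothetical names, and the reference counting is only there to make the example observable.

```c
#include <stdlib.h>

/* Toy stand-ins: NOT libctf's real definitions.  */
typedef struct toy_dict
{
  int refcnt;
} toy_dict_t;

static int toy_dicts_freed;

/* New name: drop one reference, freeing the dict when none remain.  */
static void
toy_dict_close (toy_dict_t *dp)
{
  if (dp == NULL)
    return;
  if (--dp->refcnt == 0)
    {
      free (dp);
      toy_dicts_freed++;
    }
}

/* Old name: a trivial wrapper, so callers written against the old API
   keep working without source changes.  */
static void
toy_file_close (toy_dict_t *fp)
{
  toy_dict_close (fp);
}
```

Because the wrapper only forwards its argument, both names share one implementation and cannot drift apart.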
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_set_variable_filter (ctf_dict_t *fp, ctf_link_variable_filter_f *filter,
2020-06-06 01:15:26 +08:00
			      void *arg)
{
  fp->ctf_link_variable_filter = filter;
  fp->ctf_link_variable_filter_arg = arg;
  return 0;
}

2020-11-20 21:34:04 +08:00
/* Check if we can safely add a variable with the given type to this dict. */
2019-07-14 04:41:25 +08:00

static int
2020-11-20 21:34:04 +08:00
check_variable (const char *name, ctf_dict_t *fp, ctf_id_t type,
2019-07-14 04:41:25 +08:00
		ctf_dvdef_t **out_dvd)
{
  ctf_dvdef_t *dvd;

  dvd = ctf_dynhash_lookup (fp->ctf_dvhash, name);
  *out_dvd = dvd;
  if (!dvd)
    return 1;

  if (dvd->dvd_type != type)
    {
      /* Variable here. Wrong type: cannot add. Just skip it, because there is
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
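
The lazy-open behaviour described above can be modelled with a small toy. Everything here is a hypothetical sketch: toy_input_t, its fields and the two helpers are illustrative names, and "opening" is faked by pointing at a static object. It also assumes that an archive passed in by the caller is left open for the caller to close.

```c
#include <stddef.h>

/* Toy stand-in for one link input; field names are hypothetical.  */
typedef struct toy_input
{
  const char *name;		/* Always set.  */
  int *arc;			/* Open archive, or NULL if not open yet.  */
  int lazy;			/* Nonzero if we opened ARC ourselves.  */
} toy_input_t;

static int toy_archive;		/* Pretend archive contents.  */
static int toy_opens, toy_closes;

/* Open INPUT on first use if the caller supplied only a name.  */
static int *
toy_lazy_open (toy_input_t *input)
{
  if (input->arc == NULL)
    {
      input->arc = &toy_archive;	/* Stands in for opening NAME.  */
      input->lazy = 1;
      toy_opens++;
    }
  return input->arc;
}

/* Close INPUT again, but only if we were the ones who opened it.  */
static void
toy_close_input (toy_input_t *input)
{
  if (input->arc != NULL && input->lazy)
    {
      input->arc = NULL;
      input->lazy = 0;
      toy_closes++;
    }
}
```

Opening on demand and closing straight after use means only a few inputs need to be held in memory at once, which is the memory saving the commit message is after.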
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
	 no way to express this in CTF. Don't even warn: this case is too
	 common. (This might be the parent, in which case we'll try adding in
	 the child first, and only then give up.) */
2019-07-14 04:41:25 +08:00
      ctf_dprintf ("Inexpressible duplicate variable %s skipped.\n", name);
    }

  return 0; /* Already exists. */
}

libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
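
To illustrate the idea of mapping through hashes rather than through a separately populated table, here is a toy sketch. It is not the deduplicator's code: the struct, the function name and the string hashes are all hypothetical, and a linear scan stands in for the real hash-table lookup.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names are libctf's.  */

/* One type emitted into the output: its hash and its ID there.  */
struct toy_emitted
{
  const char *hash;
  long out_id;
};

/* Map input type TYPE_IDX to its output ID by way of its hash, or return
   -1 if no type with that hash was emitted.  */
static long
toy_dedup_type_mapping (const char *const *input_hashes, size_t type_idx,
			const struct toy_emitted *emitted, size_t nemitted)
{
  const char *hash = input_hashes[type_idx];
  size_t i;

  for (i = 0; i < nemitted; i++)
    if (strcmp (emitted[i].hash, hash) == 0)
      return emitted[i].out_id;

  return -1;
}
```

Since the hash is valid on both the input and output side, identical types in different inputs resolve to the same output ID without any per-input mapping table having to be filled in first.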
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
/* Link one variable named NAME of type TYPE found in IN_FP into FP. */
2019-07-14 04:41:25 +08:00

static int
2021-03-02 23:10:05 +08:00
ctf_link_one_variable (ctf_dict_t *fp, ctf_dict_t *in_fp, const char *name,
		       ctf_id_t type, int cu_mapped)
2019-07-14 04:41:25 +08:00
{
2020-11-20 21:34:04 +08:00
  ctf_dict_t *per_cu_out_fp;
2019-07-14 04:41:25 +08:00
  ctf_id_t dst_type = 0;
  ctf_dvdef_t *dvd;

2020-06-06 01:15:26 +08:00
  /* See if this variable is filtered out. */

libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
  if (fp->ctf_link_variable_filter)
2020-06-06 01:15:26 +08:00
    {
2021-03-02 23:10:05 +08:00
      void *farg = fp->ctf_link_variable_filter_arg;
      if (fp->ctf_link_variable_filter (in_fp, name, type, farg))
2020-06-06 01:15:26 +08:00
        return 0;
    }

2021-03-02 23:10:05 +08:00
  /* If this type is mapped to a type in the parent dict, we want to try to add
     to that first: if it reports a duplicate, or if the type is in a child
     already, add straight to the child. */
2019-07-14 04:41:25 +08:00

2021-03-02 23:10:05 +08:00
  if ((dst_type = ctf_dedup_type_mapping (fp, in_fp, type)) == CTF_ERR)
    return -1; /* errno is set for us. */
2019-07-14 04:41:25 +08:00

  if (dst_type != 0)
    {
2021-03-02 23:10:05 +08:00
      if (!ctf_assert (fp, ctf_type_isparent (fp, dst_type)))
        return -1; /* errno is set for us. */
2019-07-14 04:41:25 +08:00

2021-03-02 23:10:05 +08:00
      if (check_variable (name, fp, dst_type, &dvd))
        {
          /* No variable here: we can add it. */
          if (ctf_add_variable (fp, name, dst_type) < 0)
            return -1; /* errno is set for us. */
          return 0;
2019-07-14 04:41:25 +08:00
        }
2021-03-02 23:10:05 +08:00

      /* Already present? Nothing to do. */
      if (dvd && dvd->dvd_type == dst_type)
        return 0;
2019-07-14 04:41:25 +08:00
    }

  /* Can't add to the parent due to a name clash, or because it references a
     type only present in the child. Try adding to the child, creating if need
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
     be. If we can't do that, skip it. Don't add to a child if we're doing a
     CU-mapped link, since that has only one output. */

2021-03-02 23:10:05 +08:00
  if (cu_mapped)
2020-06-05 02:28:52 +08:00
    {
      ctf_dprintf ("Variable %s in input file %s depends on a type %lx hidden "
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
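The direct lookup described above can be pictured as a two-step mapping: input type to type hash, then type hash to the ID that hash was emitted under in a given output. The following is only a rough sketch under that assumption, not the libctf implementation; `input_type_hash`, `output_emission` and `dedup_map_type` are hypothetical names, and linear searches stand in for the real hash tables.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: the hash computed for one type in one input.  */
struct input_type_hash
{
  int input_num;
  unsigned long type;
  const char *hash;
};

/* Hypothetical record: the ID a hashed type was given in one output.  */
struct output_emission
{
  const char *hash;
  unsigned long out_type;
};

/* Map (INPUT_NUM, TYPE) to an output type ID in two steps:
   input type -> hash, then hash -> output ID.  Returns 0 if the type
   has no mapping in this output.  */
static unsigned long
dedup_map_type (const struct input_type_hash *in, size_t n_in,
                const struct output_emission *out, size_t n_out,
                int input_num, unsigned long type)
{
  const char *hash = NULL;

  for (size_t i = 0; i < n_in; i++)
    if (in[i].input_num == input_num && in[i].type == type)
      {
        hash = in[i].hash;
        break;
      }
  if (hash == NULL)
    return 0;

  for (size_t i = 0; i < n_out; i++)
    if (strcmp (out[i].hash, hash) == 0)
      return out[i].out_type;
  return 0;
}
```

Because both steps consult tables the deduplicator already holds, nothing has to be copied into a separate input->output table before variables and symbols are emitted.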
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
                   "due to conflicts: skipped.\n", name,
                   ctf_unnamed_cuname (in_fp), type);
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
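The open-when-needed, close-as-soon-as-possible pattern can be sketched as below. This is an illustration only and does not use the libctf API: `link_input`, `input_get`, `input_release` and the `demo_*` helpers are hypothetical, and the rule that only lazily opened data is closed is an assumption of the sketch.

```c
#include <stddef.h>

/* Hypothetical link input: NAME is always set; DATA may be NULL, meaning
   "open the file named NAME when it is first needed".  */
struct link_input
{
  const char *name;
  void *data;
  int opened_lazily;
};

typedef void *(*open_fn) (const char *name);
typedef void (*close_fn) (void *data);

/* Return the input's data, opening it on demand via OPENER.  */
static void *
input_get (struct link_input *in, open_fn opener)
{
  if (in->data == NULL)
    {
      in->data = opener (in->name);
      if (in->data != NULL)
        in->opened_lazily = 1;
    }
  return in->data;
}

/* Release data we opened ourselves; caller-provided data is left alone.  */
static void
input_release (struct link_input *in, close_fn closer)
{
  if (in->opened_lazily && in->data != NULL)
    {
      closer (in->data);
      in->data = NULL;
      in->opened_lazily = 0;
    }
}

/* Demo opener and closer that only count how often they are called.  */
static int open_calls, close_calls;
static int fake_archive;

static void *
demo_open (const char *name)
{
  (void) name;
  open_calls++;
  return &fake_archive;
}

static void
demo_close (void *data)
{
  (void) data;
  close_calls++;
}
```

Calling `input_get` twice on an input that was added by name only opens it once; `input_release` then drops it so that memory is held for no longer than one input needs.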
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
      return 0;
    }
2019-07-14 04:41:25 +08:00
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
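The output-side renaming (append an incrementing integer, starting at 0, when a name clashes) might look roughly like this. It is a sketch, not the ctf_new_per_cu_name code: `name_table`, `unique_cu_name`, the fixed table size and the `#` separator before the integer are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAMES 16
#define NAME_LEN 64

/* Hypothetical table of CU names already used for per-CU outputs.  */
struct name_table
{
  char names[MAX_NAMES][NAME_LEN];
  size_t count;
};

static int
name_in_use (const struct name_table *t, const char *name)
{
  for (size_t i = 0; i < t->count; i++)
    if (strcmp (t->names[i], name) == 0)
      return 1;
  return 0;
}

/* Record NAME in T, appending "#0", "#1", ... while it clashes with a
   name already present.  Returns a pointer to the stored name, or NULL
   if the table is full.  */
static const char *
unique_cu_name (struct name_table *t, const char *name)
{
  char candidate[NAME_LEN];
  unsigned long suffix = 0;

  if (t->count >= MAX_NAMES)
    return NULL;

  snprintf (candidate, sizeof (candidate), "%s", name);
  while (name_in_use (t, candidate))
    snprintf (candidate, sizeof (candidate), "%s#%lu", name, suffix++);

  strcpy (t->names[t->count], candidate);
  return t->names[t->count++];
}
```

The only property that matters here is the one the commit message states: distinct dicts end up with distinct names, even when every input carries the same CU name.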
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
  if ((per_cu_out_fp = ctf_create_per_cu (fp, in_fp, NULL)) == NULL)
libctf: add a deduplicator-specific type mapping table
2021-03-02 23:10:05 +08:00
    return -1; /* errno is set for us. */
2019-07-14 04:41:25 +08:00
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
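A toy model of the new layout, assuming a pad value of 0 for symbols with no type: the function info section is a flat array of type IDs, one per function symbol, and each non-pad entry must name a type of function kind. All names here (`type_kinds`, `func_info`, `func_sym_type`) are hypothetical and the arrays are made-up data, not real CTF.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kinds recorded in the type section.  */
enum type_kind { KIND_UNKNOWN = 0, KIND_INTEGER, KIND_FUNCTION };

/* Toy "type section": type ID N has kind type_kinds[N]; ID 0 is unused.  */
static const enum type_kind type_kinds[] =
  { KIND_UNKNOWN, KIND_INTEGER, KIND_FUNCTION, KIND_FUNCTION };

/* Toy function info section: one uint32_t type ID per function symbol,
   in symbol order; 0 is assumed to be a pad (no type known).  */
static const uint32_t func_info[] = { 2, 0, 3 };

/* Return the type ID of the SYMIDXth function symbol, or 0 if it is out
   of range, padded, or does not refer to a type of function kind.  */
static uint32_t
func_sym_type (size_t symidx)
{
  size_t nfuncs = sizeof (func_info) / sizeof (func_info[0]);
  size_t ntypes = sizeof (type_kinds) / sizeof (type_kinds[0]);
  uint32_t id;

  if (symidx >= nfuncs)
    return 0;
  id = func_info[symidx];
  if (id == 0 || id >= ntypes || type_kinds[id] != KIND_FUNCTION)
    return 0;
  return id;
}
```

Since each entry is just a reference into the type section, two functions with the same prototype can share one deduplicated type.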
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
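The indexed lookup can be sketched as a sort-if-needed step followed by a binary search on symbol names. This is a simplification, not the libctf code: it pairs each name with its type in one hypothetical `sym_index_entry` struct instead of keeping a separate index section, and the `already_sorted` argument stands in for the CTF_F_IDXSORTED header flag.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: a symbol name paired with its type ID.  */
struct sym_index_entry
{
  const char *name;
  uint32_t type;
};

static int
cmp_entry (const void *a, const void *b)
{
  const struct sym_index_entry *ea = a;
  const struct sym_index_entry *eb = b;
  return strcmp (ea->name, eb->name);
}

/* Sort the index by name unless the producer marked it as sorted.  */
static void
index_prepare (struct sym_index_entry *idx, size_t n, int already_sorted)
{
  if (!already_sorted)
    qsort (idx, n, sizeof (*idx), cmp_entry);
}

/* Binary-search the sorted index for NAME; return its type ID, or 0 if
   the symbol has no entry.  */
static uint32_t
index_lookup (const struct sym_index_entry *idx, size_t n, const char *name)
{
  struct sym_index_entry key = { name, 0 };
  const struct sym_index_entry *hit
    = bsearch (&key, idx, n, sizeof (*idx), cmp_entry);
  return hit ? hit->type : 0;
}
```

A producer that cannot cheaply sort, such as the compiler, can emit entries in any order and leave the flag unset; the consumer then pays for one sort before its first lookup.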
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
  /* If the type was not found, check for it in the child too. */
2019-07-14 04:41:25 +08:00
  if (dst_type == 0)
    {
libctf: add a deduplicator-specific type mapping table
2021-03-02 23:10:05 +08:00
      if ((dst_type = ctf_dedup_type_mapping (per_cu_out_fp,
                                              in_fp, type)) == CTF_ERR)
        return -1; /* errno is set for us. */
2019-07-14 04:41:25 +08:00
      if (dst_type == 0)
        {
libctf: add a deduplicator-specific type mapping table
2021-03-02 23:10:05 +08:00
          ctf_err_warn (fp, 1, 0, _("type %lx for variable %s in input file %s "
                                    "not found: skipped"), type, name,
                        ctf_unnamed_cuname (in_fp));
2019-07-14 04:41:25 +08:00
          /* Do not terminate the link: just skip the variable. */
          return 0;
        }
    }

  if (check_variable (name, per_cu_out_fp, dst_type, &dvd))
    if (ctf_add_variable (per_cu_out_fp, name, dst_type) < 0)
2021-03-02 23:10:05 +08:00
      return (ctf_set_errno (fp, ctf_errno (per_cu_out_fp)));
2019-07-14 04:41:25 +08:00
  return 0;
}
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, what this does is traverse each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
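The flattening step described above can be pictured with a toy model. This is only a sketch under stated assumptions: toy_dict_t, toy_archive_t and toy_flatten are hypothetical names that do not exist in libctf, the real code works on opened dicts rather than name strings, and recording a parent dict as its own parent is a convention chosen for this sketch alone. It shows one way to build an inputs array plus a parallel parents array of indexes into it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for an opened dict and an archive: these are
   not libctf types.  */
typedef struct toy_dict
{
  const char *name;
} toy_dict_t;

typedef struct toy_archive
{
  toy_dict_t parent;            /* The shared parent dict. */
  toy_dict_t children[4];       /* Per-CU child dicts, if any. */
  size_t nchildren;
} toy_archive_t;

/* Flatten NARCS archives into INPUTS, recording in PARENTS[i] the index
   in INPUTS of the parent of INPUTS[i].  A parent dict is recorded as its
   own parent (an assumption of this sketch).  Returns the number of
   inputs written, or 0 if INPUTS is too small.  */
static size_t
toy_flatten (toy_archive_t *arcs, size_t narcs, toy_dict_t **inputs,
             size_t *parents, size_t max_inputs)
{
  size_t n = 0;

  assert (inputs != NULL && parents != NULL);

  for (size_t a = 0; a < narcs; a++)
    {
      size_t parent_idx = n;

      if (n + 1 + arcs[a].nchildren > max_inputs)
        return 0;

      inputs[n] = &arcs[a].parent;
      parents[n++] = parent_idx;

      for (size_t c = 0; c < arcs[a].nchildren; c++)
        {
          inputs[n] = &arcs[a].children[c];
          parents[n++] = parent_idx;
        }
    }
  return n;
}
```

A consumer of these arrays can then find the parent of input i as inputs[parents[i]] without knowing which archive it came from.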
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than rename the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
typedef struct link_sort_inputs_cb_arg
{
  int is_cu_mapped;
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
  ctf_dict_t *fp;
2020-06-06 05:57:06 +08:00
} link_sort_inputs_cb_arg_t;

/* Sort the inputs by N (the link order). For CU-mapped links, this is a
   mapping of input to output name, not a mapping of input name to input
   ctf_link_input_t: compensate accordingly. */
static int
ctf_link_sort_inputs (const ctf_next_hkv_t *one, const ctf_next_hkv_t *two,
                      void *arg)
{
  ctf_link_input_t *input_1;
  ctf_link_input_t *input_2;
  link_sort_inputs_cb_arg_t *cu_mapped = (link_sort_inputs_cb_arg_t *) arg;

  if (!cu_mapped || !cu_mapped->is_cu_mapped)
    {
      input_1 = (ctf_link_input_t *) one->hkv_value;
      input_2 = (ctf_link_input_t *) two->hkv_value;
    }
  else
    {
      const char *name_1 = (const char *) one->hkv_key;
      const char *name_2 = (const char *) two->hkv_key;

      input_1 = ctf_dynhash_lookup (cu_mapped->fp->ctf_link_inputs, name_1);
      input_2 = ctf_dynhash_lookup (cu_mapped->fp->ctf_link_inputs, name_2);

      /* There is no guarantee that CU-mappings actually have corresponding
         inputs: the relative ordering in that case is unimportant. */
      if (!input_1)
        return -1;
      if (!input_2)
        return 1;
    }

  if (input_1->n < input_2->n)
    return -1;
  else if (input_1->n > input_2->n)
    return 1;
  else
    return 0;
}
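
The ordering rule used by the comparator above can be tried in isolation with the standard qsort. This is a minimal sketch, not the libctf implementation: toy_input_t, toy_kv_t, toy_sort_inputs and toy_sort are hypothetical names, and the sketch assumes that two entries with no corresponding input compare equal (the original leaves that ordering unspecified).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins: these are not the libctf types.  */
typedef struct toy_input
{
  int n;                        /* Link order. */
} toy_input_t;

typedef struct toy_kv
{
  const char *key;
  toy_input_t *value;           /* May be NULL: no corresponding input. */
} toy_kv_t;

/* Order by link order N.  Entries with no input sort first; two such
   entries compare equal (an assumption of this sketch).  */
static int
toy_sort_inputs (const void *one_, const void *two_)
{
  const toy_kv_t *one = one_;
  const toy_kv_t *two = two_;

  if (!one->value && !two->value)
    return 0;
  if (!one->value)
    return -1;
  if (!two->value)
    return 1;

  if (one->value->n < two->value->n)
    return -1;
  else if (one->value->n > two->value->n)
    return 1;
  return 0;
}

/* Sort NKVS key/value pairs in place into link order.  */
static void
toy_sort (toy_kv_t *kvs, size_t nkvs)
{
  assert (kvs != NULL || nkvs == 0);
  qsort (kvs, nkvs, sizeof (toy_kv_t), toy_sort_inputs);
}
```

Comparing with explicit less-than and greater-than tests, rather than returning a subtraction, avoids integer overflow when the link-order values are far apart.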

/* Count the number of input dicts in the ctf_link_inputs, or that subset of the
   ctf_link_inputs given by CU_NAMES if set. Return the number of input dicts,
   and optionally the name and ctf_link_input_t of the single input archive if
   only one exists (no matter how many dicts it contains). */
static ssize_t
2020-11-20 21:34:04 +08:00
ctf_link_deduplicating_count_inputs (ctf_dict_t *fp, ctf_dynhash_t *cu_names,
2020-06-06 05:57:06 +08:00
                                     ctf_link_input_t **only_one_input)
{
  ctf_dynhash_t *inputs = fp->ctf_link_inputs;
  ctf_next_t *i = NULL;
  void *name, *input;
  ctf_link_input_t *one_input = NULL;
  const char *one_name = NULL;
  ssize_t count = 0, narcs = 0;
  int err;

  if (cu_names)
    inputs = cu_names;

  while ((err = ctf_dynhash_next (inputs, &i, &name, &input)) == 0)
    {
      ssize_t one_count;

      one_name = (const char *) name;
      /* If we are processing CU names, get the real input. */
      if (cu_names)
        one_input = ctf_dynhash_lookup (fp->ctf_link_inputs, one_name);
      else
        one_input = (ctf_link_input_t *) input;

      if (!one_input)
        continue;

      one_count = ctf_link_lazy_open (fp, one_input);

      if (one_count < 0)
        {
          ctf_next_destroy (i);
          return -1; /* errno is set for us. */
        }

      count += one_count;
      narcs++;
    }
  if (err != ECTF_NEXT_END)
    {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
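The error-reporting rule described in the parenthesis above can be modelled with a toy iterator. This is a sketch under stated assumptions, not the libctf code: toy_dict_t, toy_open_errors, toy_errwarning_next and TOY_NEXT_END are hypothetical names, the iterator state is a plain index rather than a ctf_next_t, and messages are borrowed string constants. It shows the end-of-iteration code going to the err pointer when one is passed in, and to the dict only when it is not.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_NEXT_END = 1000 };   /* Hypothetical end-of-iteration code. */

/* Hypothetical stand-in for a dict carrying its own error list.  */
typedef struct toy_dict
{
  const char *msgs[4];          /* Errors and warnings on this dict. */
  size_t nmsgs;
  int errnum;                   /* Last error recorded on this dict. */
} toy_dict_t;

/* Messages emitted at open time, when no dict exists to hold them.  */
static const char *toy_open_errors[4];
static size_t toy_nopen_errors;

/* Return the next message for FP, or from the open-time list if FP is
   NULL.  *IT is the iteration position and must start at zero.  At end
   of iteration, return NULL and report TOY_NEXT_END via ERRP if it is
   provided; only if it is not provided is FP's own error code touched.  */
static const char *
toy_errwarning_next (toy_dict_t *fp, size_t *it, int *errp)
{
  const char **msgs = fp ? fp->msgs : toy_open_errors;
  size_t nmsgs = fp ? fp->nmsgs : toy_nopen_errors;

  assert (it != NULL);

  if (*it < nmsgs)
    return msgs[(*it)++];

  if (errp)
    *errp = TOY_NEXT_END;
  else if (fp)
    fp->errnum = TOY_NEXT_END;
  return NULL;
}
```

With this shape, a caller that is reporting a failure can pass an err pointer and keep the dict's original error code intact while it drains the message list.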
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
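The open_errors handling described in the ctf-subr.c entries above can be pictured with a small stand-alone model. This is only a sketch under assumptions: every toy_* name, the fixed-size message buffer and the singly linked list are invented for illustration and are not libctf's types or API. Only the behaviour is taken from the ChangeLog: messages raised with no dict go onto a global list, a failed open splices a dict's messages onto that list, and iteration with no dict reads from it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy types: not libctf's.  */
typedef struct toy_msg
{
  struct toy_msg *next;
  int is_warning;
  char text[64];
} toy_msg_t;

typedef struct toy_list
{
  toy_msg_t *head;
  toy_msg_t *tail;
} toy_list_t;

typedef struct toy_dict
{
  toy_list_t errs;
} toy_dict_t;

/* Messages raised when no dict exists yet (the NULL-dict case).  */
static toy_list_t toy_open_errors;

/* Record a message on DICT's list, or on the open-errors list if DICT is
   NULL.  Returns 0 on success, -1 on allocation failure.  */
static int
toy_err_warn (toy_dict_t *dict, int is_warning, const char *text)
{
  toy_list_t *list = dict ? &dict->errs : &toy_open_errors;
  toy_msg_t *msg = calloc (1, sizeof (*msg));

  if (msg == NULL)
    return -1;
  msg->is_warning = is_warning;
  /* The buffer is zeroed, so copying at most size - 1 bytes keeps it
     NUL-terminated.  */
  strncpy (msg->text, text, sizeof (msg->text) - 1);

  if (list->tail)
    list->tail->next = msg;
  else
    list->head = msg;
  list->tail = msg;
  return 0;
}

/* Move every message from DICT onto the end of the open-errors list,
   preserving order: what a failed open would do before discarding DICT.  */
static void
toy_err_warn_to_open (toy_dict_t *dict)
{
  if (dict->errs.head == NULL)
    return;
  if (toy_open_errors.tail)
    toy_open_errors.tail->next = dict->errs.head;
  else
    toy_open_errors.head = dict->errs.head;
  toy_open_errors.tail = dict->errs.tail;
  dict->errs.head = dict->errs.tail = NULL;
}

/* Pop the oldest message from DICT (or from the open-errors list if DICT
   is NULL) into BUF.  Returns 1 if a message was copied, 0 at the end.  */
static int
toy_errwarning_next (toy_dict_t *dict, char *buf, size_t len)
{
  toy_list_t *list = dict ? &dict->errs : &toy_open_errors;
  toy_msg_t *msg = list->head;

  if (msg == NULL)
    return 0;
  list->head = msg->next;
  if (list->head == NULL)
    list->tail = NULL;
  snprintf (buf, len, "%s", msg->text);
  free (msg);
  return 1;
}
```

A caller that cannot obtain a dict at all still has somewhere to read diagnostics from: it iterates with a NULL dict until the iterator reports the end.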
2020-07-27 23:45:15 +08:00
      ctf_err_warn (fp, 0, err, _("iteration error counting deduplicating "
                                  "CTF link inputs"));
2023-09-13 17:02:36 +08:00
      return ctf_set_errno (fp, err);
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in ctf_link_inputs, opening
every member (if not already open) and tying child dicts to their
parents, shoving them into an array and constructing a corresponding
parents array that tells the deduplicator which dict is the parent of
which child. We then call ctf_dedup and ctf_dedup_emit with that array
of inputs, take the outputs that result and put them into
ctf_link_outputs where the rest of the CTF linker expects to find them,
then link in the variables just as the nondeduplicating linker does.
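The flattening step described above can be sketched in isolation. This is a toy model under stated assumptions, not the libctf code: the toy_* names are hypothetical, dicts are reduced to integer identifiers, each archive is assumed to hold exactly one shared parent plus its children, and a dict with no parent is assumed to record its own index in the parents array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy model: an archive is a shared parent plus N children.  */
typedef struct toy_archive
{
  int parent_id;            /* Identifier of the shared parent dict.  */
  const int *child_ids;     /* Identifiers of the child dicts.  */
  size_t nchildren;
} toy_archive_t;

/* Count the dicts in all archives: one parent plus the children of each.  */
static size_t
toy_count_inputs (const toy_archive_t *arcs, size_t narcs)
{
  size_t count = 0;

  for (size_t a = 0; a < narcs; a++)
    count += 1 + arcs[a].nchildren;
  return count;
}

/* Explode ARCS into a flat array of dict identifiers (returned) and a
   parallel array of parent indexes (stored in *PARENTS).  The number of
   entries is stored in *NINPUTS.  A dict with no parent records its own
   index: an assumption of this sketch.  Returns NULL if there are no
   inputs or if allocation fails.  */
static int *
toy_open_inputs (const toy_archive_t *arcs, size_t narcs,
                 size_t *ninputs, uint32_t **parents)
{
  size_t count = toy_count_inputs (arcs, narcs);
  int *inputs = NULL;
  uint32_t *parents_ = NULL;
  size_t walk = 0;

  if (count == 0)
    return NULL;
  if ((inputs = calloc (count, sizeof (int))) == NULL)
    goto oom;
  if ((parents_ = calloc (count, sizeof (uint32_t))) == NULL)
    goto oom;

  for (size_t a = 0; a < narcs; a++)
    {
      uint32_t parent_i = (uint32_t) walk;

      /* The parent goes in first, so its index is known to the children.  */
      inputs[walk] = arcs[a].parent_id;
      parents_[walk] = parent_i;
      walk++;

      for (size_t c = 0; c < arcs[a].nchildren; c++)
        {
          inputs[walk] = arcs[a].child_ids[c];
          parents_[walk] = parent_i;
          walk++;
        }
    }

  assert (walk == count);
  *ninputs = count;
  *parents = parents_;
  return inputs;

 oom:
  free (inputs);
  free (parents_);
  return NULL;
}
```

Keeping the parent links as indexes into the same flat array, rather than as pointers, means the two arrays can be handed to a deduplicator together and freed independently.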
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
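The second optimization amounts to a per-output-CU decision, which can be sketched as follows. This is an illustrative model only: the toy_* names, the flat mapping table and the has_children flag are assumptions made for the example and do not reflect how libctf stores its CU mappings.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical toy mapping entry: one input CU mapped to one output CU.  */
typedef struct toy_cu_map
{
  const char *in_cu;        /* Input CU name.  */
  const char *out_cu;       /* Output CU it is mapped into.  */
  int has_children;         /* Nonzero if the input has child dicts.  */
} toy_cu_map_t;

typedef enum
{
  TOY_NO_INPUTS,            /* Nothing maps to this output CU.  */
  TOY_RENAME_ONLY,          /* One childless input: just rename it.  */
  TOY_PER_CU_LINK           /* Several inputs: deduplicate them first.  */
} toy_action_t;

/* Decide what the CU-mapped phase must do for OUT_CU.  */
static toy_action_t
toy_cu_mapped_action (const toy_cu_map_t *map, size_t nmap,
                      const char *out_cu)
{
  size_t count = 0;
  int children = 0;

  for (size_t i = 0; i < nmap; i++)
    if (strcmp (map[i].out_cu, out_cu) == 0)
      {
        count++;
        children |= map[i].has_children;
      }

  if (count == 0)
    return TOY_NO_INPUTS;
  if (count == 1 && !children)
    return TOY_RENAME_ONLY;
  return TOY_PER_CU_LINK;
}
```

Only the TOY_PER_CU_LINK case pays for an intermediate deduplicating link; the rename-only case feeds its single input straight into the final link under the output CU's name.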
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
    }

  if (!count)
    return 0;

  if (narcs == 1)
    {
      if (only_one_input)
        *only_one_input = one_input;
    }
  else if (only_one_input)
    *only_one_input = NULL;

  return count;
}

/* Allocate and populate an inputs array big enough for a given set of inputs:
   either a specific set of CU names (those from that set found in the
   ctf_link_inputs), or the entire ctf_link_inputs (if cu_names is not set).
   The number of inputs (from ctf_link_deduplicating_count_inputs, above) is
   passed in NINPUTS: an array of uint32_t containing parent pointers
   (corresponding to those members of the inputs that have parents) is allocated
   and returned in PARENTS.

   The inputs are *archives*, not files: the archive can have multiple members
   if it is the result of a previous incremental link. We want to add every one
   in turn, including the shared parent. (The dedup machinery knows that a type
   used by a single dictionary and its parent should not be shared in
   CTF_LINK_SHARE_DUPLICATED mode.)

   If no inputs exist that correspond to these CUs, return NULL with the errno
   set to ECTF_NOCTFDATA. */
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
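The compatibility approach described above (new names, with the old names kept as thin aliases) can be sketched in miniature. All toy_* names here are hypothetical and this is not the real ctf-api.h. In particular, the sketch spells the old type name as a plain typedef of the same struct, which produces no warnings; how the real header declares the compatibility type is not shown in this message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy API, not the real ctf-api.h.  The type stays opaque to
   users, so renaming it does not change the ABI.  */
typedef struct toy_dict toy_dict_t;
typedef struct toy_dict toy_file_t;   /* Old name, kept for compatibility.  */

struct toy_dict
{
  int refcnt;
};

/* Number of dicts currently open: only here so the example is checkable.  */
static int toy_live_dicts;

static toy_dict_t *
toy_dict_open (void)
{
  toy_dict_t *dict = calloc (1, sizeof (*dict));

  if (dict == NULL)
    return NULL;
  dict->refcnt = 1;
  toy_live_dicts++;
  return dict;
}

/* New entry point.  */
static void
toy_dict_close (toy_dict_t *dict)
{
  if (dict == NULL)
    return;
  if (--dict->refcnt == 0)
    {
      toy_live_dicts--;
      free (dict);
    }
}

/* Old entry point: a trivial wrapper, so existing callers keep working
   and both symbols remain available.  */
static void
toy_file_close (toy_file_t *file)
{
  toy_dict_close (file);
}
```

Because C symbols carry no type information, keeping the old function as a wrapper is enough to preserve link-time compatibility for existing callers.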
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
static ctf_dict_t **
ctf_link_deduplicating_open_inputs (ctf_dict_t *fp, ctf_dynhash_t *cu_names,
2020-06-06 05:57:06 +08:00
                                    ssize_t ninputs, uint32_t **parents)
{
  ctf_dynhash_t *inputs = fp->ctf_link_inputs;
  ctf_next_t *i = NULL;
  void *name, *input;
  link_sort_inputs_cb_arg_t sort_arg;
2020-11-20 21:34:04 +08:00
  ctf_dict_t **dedup_inputs = NULL;
  ctf_dict_t **walk;
2020-06-06 05:57:06 +08:00
  uint32_t *parents_ = NULL;
  int err;

  if (cu_names)
    inputs = cu_names;

2020-11-20 21:34:04 +08:00
  if ((dedup_inputs = calloc (ninputs, sizeof (ctf_dict_t *))) == NULL)
2020-06-06 05:57:06 +08:00
    goto oom;

  if ((parents_ = calloc (ninputs, sizeof (uint32_t))) == NULL)
    goto oom;

  walk = dedup_inputs;

  /* Counting done: push every input into the array, in the order they were
     passed to ctf_link_add_ctf (and ultimately ld). */

  sort_arg.is_cu_mapped = (cu_names != NULL);
  sort_arg.fp = fp;

  while ((err = ctf_dynhash_next_sorted (inputs, &i, &name, &input,
                                         ctf_link_sort_inputs, &sort_arg)) == 0)
    {
      const char *one_name = (const char *) name;
      ctf_link_input_t *one_input;
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
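A minimal sketch of the compatibility-wrapper pattern described above, assuming a toy reference-counted type. The names toy_dict_t, toy_dict_new, toy_dict_close and toy_file_close are hypothetical and only mirror the shape of the rename; they are not the libctf declarations.

```c
#include <stdlib.h>

/* Hypothetical toy type standing in for the renamed dictionary type.  */
typedef struct toy_dict
{
  int refcnt;
} toy_dict_t;

/* Create a dict holding one reference, or return NULL on allocation
   failure.  */
toy_dict_t *
toy_dict_new (void)
{
  toy_dict_t *d = calloc (1, sizeof (toy_dict_t));

  if (d != NULL)
    d->refcnt = 1;
  return d;
}

/* New-style name: drop one reference and free the dict when none are
   left.  Returns the number of references remaining.  */
int
toy_dict_close (toy_dict_t *d)
{
  if (d == NULL)
    return 0;
  if (--d->refcnt > 0)
    return d->refcnt;
  free (d);
  return 0;
}

/* Old-style name, kept only so that existing callers still link: a
   trivial wrapper that forwards to the new function.  */
int
toy_file_close (toy_dict_t *d)
{
  return toy_dict_close (d);
}
```

Because the old symbol stays exported and simply forwards, binaries built against the old name keep working, and callers can migrate to the new name at their own pace.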
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
      ctf_dict_t *one_fp;
      ctf_dict_t *parent_fp = NULL;
2024-04-09 07:23:35 +08:00
      uint32_t parent_i = 0;
      ctf_next_t *j = NULL;

      /* If we are processing CU names, get the real input. All the inputs
         will have been opened, if they contained any CTF at all. */
      if (cu_names)
        one_input = ctf_dynhash_lookup (fp->ctf_link_inputs, one_name);
      else
        one_input = (ctf_link_input_t *) input;

      if (!one_input || (!one_input->clin_arc && !one_input->clin_fp))
        continue;

      /* Short-circuit: if clin_fp is set, just use it. */
      if (one_input->clin_fp)
        {
          parents_[walk - dedup_inputs] = walk - dedup_inputs;
          *walk = one_input->clin_fp;
          walk++;
          continue;
        }

      /* Get and insert the parent archive (if any), if this archive has
         multiple members. We assume, as elsewhere, that the parent is named
         _CTF_SECTION. */

libctf, include, binutils, gdb: rename CTF-opening functions
The functions that return ctf_dict_t's given a ctf_archive_t and a name
are very clumsily named. It sounds like they return *archives*, not
dictionaries, and the names are very long and clunky. Why do we
have a ctf_arc_open_by_name when it opens a dictionary, not an archive,
and when there is no way to open a dictionary in any other way? The
answer is purely internal: the function is located in ctf-archive.c,
and everything in there was called ctf_arc_*, and there is another
way to open a dict (by offset in the archive), that is internal to
ctf-archive.c and that nothing else can call.
This is clearly bad naming. The internal organization of the source tree
should not dictate public API names!
So rename things (keeping the old, bad names for compatibility), and
adjust all users. You now open a dict using ctf_dict_open, and
open it giving ELF sections via ctf_dict_open_sections.
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf): Use ctf_dict_open, not
ctf_arc_open_by_name.
* readelf.c (dump_section_as_ctf): Likewise.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c (elfctf_build_psymtabs): Use ctf_dict_open, not
ctf_arc_open_by_name.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_arc_open_by_name): Rename to...
(ctf_dict_open): ... this, keeping compatibility function.
(ctf_arc_open_by_name_sections): Rename to...
(ctf_dict_open_sections): ... this, keeping compatibility function.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-archive.c (ctf_arc_open_by_offset): Rename to...
(ctf_dict_open_by_offset): ... this. Adjust callers.
(ctf_arc_open_by_name_internal): Rename to...
(ctf_dict_open_internal): ... this. Adjust callers.
(ctf_arc_open_by_name_sections): Rename to...
(ctf_dict_open_sections): ... this, keeping compatibility function.
(ctf_arc_open_by_name): Rename to...
(ctf_dict_open): ... this, keeping compatibility function.
* libctf.ver: New functions added.
* ctf-link.c (ctf_link_one_input_archive): Adjusted accordingly.
(ctf_link_deduplicating_open_inputs): Likewise.
2020-11-20 21:34:04 +08:00
      if ((parent_fp = ctf_dict_open (one_input->clin_arc, _CTF_SECTION,
                                      &err)) == NULL)
        {
          if (err != ECTF_NOMEMBNAM)
            {
              ctf_next_destroy (i);
              ctf_set_errno (fp, err);
              goto err;
            }
        }
      else
        {
          *walk = parent_fp;
          parent_i = walk - dedup_inputs;
          walk++;
        }

      /* We disregard the input archive name: either it is the parent (which we
         already have), or we want to put everything into one TU sharing the
         cuname anyway (if this is a CU-mapped link), or this is the final phase
         of a relink with CU-mapping off (i.e. ld -r) in which case the cuname
         is correctly set regardless. */
      while ((one_fp = ctf_archive_next (one_input->clin_arc, &j, NULL,
                                         1, &err)) != NULL)
        {
          if (one_fp->ctf_flags & LCTF_CHILD)
            {
              /* The contents of the parents array for elements not
                 corresponding to children is undefined. If there is no parent
                 (itself a sign of a likely linker bug or corrupt input), we set
                 it to itself. */

              ctf_import (one_fp, parent_fp);
              if (parent_fp)
                parents_[walk - dedup_inputs] = parent_i;
              else
                parents_[walk - dedup_inputs] = walk - dedup_inputs;
            }
          *walk = one_fp;
          walk++;
        }
      if (err != ECTF_NEXT_END)
        {
          ctf_next_destroy (i);
          goto iterr;
        }
    }
  if (err != ECTF_NEXT_END)
    goto iterr;

  *parents = parents_;

  return dedup_inputs;

 oom:
  err = ENOMEM;

 iterr:
  ctf_set_errno (fp, err);

 err:
  free (dedup_inputs);
  free (parents_);
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
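The iteration rule described above can be sketched in miniature as follows. This is a toy model under stated assumptions, not the libctf implementation: toy_dict_t, toy_open_errors, toy_errwarning_next and the TOY_NEXT_END value are all hypothetical, and a plain index stands in for the real iterator state.

```c
#include <stddef.h>

#define TOY_NEXT_END 1052       /* Hypothetical end-of-iteration code.  */

/* Hypothetical stand-in for a dict carrying an error code and a
   NULL-terminated list of error/warning messages.  */
typedef struct toy_dict
{
  int err;                      /* Last error recorded on this dict.  */
  const char **msgs;
} toy_dict_t;

/* Messages recorded when no dict was available (e.g. a failed open).  */
static const char *toy_open_errors[] = { "bad magic number", NULL };

/* Return the next message, or NULL at the end of the list.  *POS is the
   caller-held cursor and must start at 0.  A NULL FP reads the open-error
   list.  The end-of-iteration code is stored in *ERRP if ERRP is
   non-NULL; only if ERRP is NULL is it stored in FP->err.  */
const char *
toy_errwarning_next (toy_dict_t *fp, size_t *pos, int *errp)
{
  const char **list = fp ? fp->msgs : toy_open_errors;
  const char *msg = list[*pos];

  if (msg != NULL)
    {
      (*pos)++;
      return msg;
    }

  if (errp != NULL)
    *errp = TOY_NEXT_END;       /* FP's own error code is left alone.  */
  else if (fp != NULL)
    fp->err = TOY_NEXT_END;
  return NULL;
}
```

A caller that has just hit a real error on a dict can therefore drain its messages with a non-NULL errp and still find the original error code on the dict afterwards.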
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
  ctf_err_warn (fp, 0, 0, _("error in deduplicating CTF link "
                            "input allocation"));
  return NULL;
}

/* Close INPUTS that have already been linked, first the passed array, and then
   that subset of the ctf_link_inputs archives they came from cited by the
   CU_NAMES. If CU_NAMES is not specified, close all the ctf_link_inputs in one
   go, leaving it empty. */
static int
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_deduplicating_close_inputs (ctf_dict_t *fp, ctf_dynhash_t *cu_names,
                                     ctf_dict_t **inputs, ssize_t ninputs)
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, what this does is traverse each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
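The flattening step described above can be sketched as follows. This is only an illustration under stated assumptions: demo_dict_t, demo_archive_t and demo_flatten_inputs are hypothetical stand-ins, not libctf types or functions, and the convention that a dict without a parent records its own index is an assumption made for the sketch, not something the commit specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a CTF dict and a CTF archive: not libctf types.  */
typedef struct demo_dict
{
  const char *name;
  struct demo_dict *parent;     /* NULL if this dict has no parent.  */
} demo_dict_t;

typedef struct demo_archive
{
  demo_dict_t *members;
  size_t nmembers;
} demo_archive_t;

/* Explode every archive member into INPUTS, then record in PARENTS[i] the
   index within INPUTS of input i's parent.  An input with no parent (or whose
   parent is not among the inputs) records its own index: this convention is
   an assumption of the sketch.  Returns the number of inputs, or -1 if more
   than MAX inputs would be needed.  */
static long
demo_flatten_inputs (demo_archive_t *arcs, size_t narcs,
                     demo_dict_t **inputs, size_t *parents, size_t max)
{
  size_t n = 0, a, m, i, j;

  assert (arcs != NULL && inputs != NULL && parents != NULL);

  for (a = 0; a < narcs; a++)
    for (m = 0; m < arcs[a].nmembers; m++)
      {
        if (n == max)
          return -1;
        inputs[n++] = &arcs[a].members[m];
      }

  for (i = 0; i < n; i++)
    {
      parents[i] = i;
      for (j = 0; j < n; j++)
        if (inputs[j] == inputs[i]->parent)
          parents[i] = j;
    }

  return (long) n;
}
```

The two parallel arrays are what would then be handed to a deduplicator: one entry per exploded dict, plus the index needed to find its parent without walking the archives again.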
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
{
  ctf_next_t *it = NULL;
  void *name;
  int err;
  ssize_t i;

  /* This is the inverse of ctf_link_deduplicating_open_inputs: so first, close
     all the individual input dicts, opened by the archive iterator. */
  for (i = 0; i < ninputs; i++)
    ctf_dict_close (inputs[i]);
  /* Now close the archives they are part of. */
  if (cu_names)
    {
      while ((err = ctf_dynhash_next (cu_names, &it, &name, NULL)) == 0)
        {
          /* Remove the input from the linker inputs, if it exists, which also
             closes it. */

          ctf_dynhash_remove (fp->ctf_link_inputs, (const char *) name);
        }
      if (err != ECTF_NEXT_END)
        {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
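The rule just described (report end-of-iteration through errp when one is passed, and fall back to the dict's error code only when it is not) can be sketched as below. Everything here is a hypothetical stand-in: demo_dict_t, DEMO_NEXT_END, demo_errwarning_next and the fixed message list are invented for illustration and are not the libctf API or its values. For brevity the sketch serves the same message list whether or not a dict is passed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical end-of-iteration code: not the real ECTF_NEXT_END value.  */
enum { DEMO_NEXT_END = 1 };

/* Hypothetical dict carrying only an error code.  */
typedef struct demo_dict
{
  int err;
} demo_dict_t;

/* Stand-in for the list of errors recorded at open time.  */
static const char *demo_open_errors[] = { "bad magic", "truncated header" };
#define DEMO_N_OPEN_ERRORS \
  (sizeof (demo_open_errors) / sizeof (demo_open_errors[0]))

/* Return the next message, or NULL at the end of iteration.  The
   end-of-iteration code goes into *ERRP if ERRP is non-NULL, leaving FP
   untouched; only when ERRP is NULL is it stored in FP (if there is one).  */
static const char *
demo_errwarning_next (demo_dict_t *fp, size_t *it, int *errp)
{
  assert (it != NULL);

  if (*it < DEMO_N_OPEN_ERRORS)
    return demo_open_errors[(*it)++];

  if (errp != NULL)
    *errp = DEMO_NEXT_END;
  else if (fp != NULL)
    fp->err = DEMO_NEXT_END;
  return NULL;
}

/* Drain the iterator, returning how many messages were seen.  */
static size_t
demo_count_messages (demo_dict_t *fp, int *errp)
{
  size_t it = 0, n = 0;

  while (demo_errwarning_next (fp, &it, errp) != NULL)
    n++;
  return n;
}
```

A caller that is already handling a failure would pass an errp, so that the error code it is about to report is not overwritten by the end-of-iteration marker.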
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
          ctf_err_warn (fp, 0, err, _("iteration error in deduplicating link "
                                      "input freeing"));
          ctf_set_errno (fp, err);
        }
    }
  else
    ctf_dynhash_empty (fp->ctf_link_inputs);

  return 0;
}

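The loop in the function above relies on an iterator idiom: keep calling the "next" function while it returns 0, then treat any result other than the end-of-iteration sentinel as a real error. A minimal, self-contained sketch of that idiom follows. The names demo_set_t, demo_next, demo_drain, DEMO_NEXT_END and DEMO_EIO are hypothetical and only mimic the shape of the loop; they are not libctf functions or error codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes: 0 means "an item was returned".  */
enum { DEMO_NEXT_END = -1, DEMO_EIO = -2 };

/* A hypothetical set of names; iteration fails when it reaches index
   FAIL_AT (use (size_t) -1 for a set that never fails).  */
typedef struct demo_set
{
  const char **names;
  size_t n;
  size_t fail_at;
} demo_set_t;

/* Yield the next name.  Returns 0 on success, DEMO_NEXT_END when the set is
   exhausted, or DEMO_EIO on the simulated failure.  */
static int
demo_next (const demo_set_t *s, size_t *it, const char **name)
{
  assert (s != NULL && it != NULL && name != NULL);

  if (*it == s->fail_at)
    return DEMO_EIO;
  if (*it >= s->n)
    return DEMO_NEXT_END;
  *name = s->names[(*it)++];
  return 0;
}

/* Visit every name, counting them in *VISITED.  Returns 0 if iteration ended
   cleanly, otherwise the error that stopped it.  */
static int
demo_drain (const demo_set_t *s, size_t *visited)
{
  size_t it = 0;
  const char *name;
  int err;

  *visited = 0;
  while ((err = demo_next (s, &it, &name)) == 0)
    (*visited)++;

  return err == DEMO_NEXT_END ? 0 : err;
}
```

Distinguishing the sentinel from other nonzero results is what lets the caller report an iteration error only when something actually went wrong.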
include, libctf, ld: extend variable section to contain functions too
The CTF variable section is an optional (usually-not-present) section in
the CTF dict which contains name -> type mappings corresponding to data
symbols that are present in the linker input but not in the output
symbol table: the idea is that programs that use their own symbol-
resolution mechanisms can use this section to look up the types of
symbols they have found using their own mechanism.
Because these removed symbols (mostly static variables, functions, etc)
all have names that are unlikely to appear in the ELF symtab and because
very few programs have their own symbol-resolution mechanisms, a special
linker flag (--ctf-variables) is needed to emit this section.
Historically, we emitted only removed data symbols into the variable
section. This seemed to make sense at the time, but in hindsight it
really doesn't: functions are symbols too, and a C program can look them
up just like any other type. So extend the variable section so that it
contains all static function symbols too (if it is emitted at all), with
types of kind CTF_K_FUNCTION.
This is a little fiddly. We relied on compiler assistance for data
symbols: the compiler simply emits all data symbols twice, once into the
symtypetab as an indexed symbol and once into the variable section.
Rather than wait for a suitably adjusted compiler that does the same for
function symbols, we can pluck unreported function symbols out of the
symtab and add them to the variable section ourselves. While we're at
it, we do the same with data symbols: this is redundant right now
because the compiler does it, but it costs very little time and lets the
compiler drop this kludge and save a little space in .o files.
include/
* ctf.h: Mention the new things we can see in the variable
section.
ld/
* testsuite/ld-ctf/data-func-conflicted-vars.d: New test.
libctf/
* ctf-link.c (ctf_link_deduplicating_variables): Duplicate
symbols into the variable section too.
* ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename
to...
(symtypetab_delete_nonstatics): ... this. Check the funchash
when pruning redundant variables.
(ctf_symtypetab_sect_sizes): Adjust accordingly.
* NEWS: Describe this change.
2022-03-16 23:29:25 +08:00
/* Do a deduplicating link of all variables in the inputs.

   Also, if we are not omitting the variable section, integrate all symbols from
   the symtypetabs into the variable section too. (Duplication with the
   symtypetab section in the output will be eliminated at serialization time.) */

static int
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
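The compatibility scheme described above (new name, old name kept as a thin alias) can be sketched in miniature. This is an illustration only, not the libctf source: `demo_dict_t`, `demo_file_t`, `demo_dict_create`, `demo_dict_close`, `demo_file_close` and `demo_closes` are hypothetical names, and the refcounting is assumed purely so the example has something to free.

```c
#include <assert.h>
#include <stdlib.h>

/* New name for the opaque type (hypothetical stand-in for the renamed type).  */
typedef struct demo_dict demo_dict_t;
/* Old name, kept as a distinct opaque type so old source still compiles.  */
typedef struct demo_file demo_file_t;

struct demo_dict
{
  int refcnt;
};

/* Counts how many dicts have really been freed, for demonstration.  */
static int demo_closes;

demo_dict_t *
demo_dict_create (void)
{
  demo_dict_t *d = calloc (1, sizeof (*d));

  if (d != NULL)
    d->refcnt = 1;
  return d;
}

/* The renamed function: drop one reference, freeing on the last one.  */
void
demo_dict_close (demo_dict_t *d)
{
  if (d == NULL)
    return;

  assert (d->refcnt > 0);
  if (--d->refcnt == 0)
    {
      free (d);
      demo_closes++;
    }
}

/* Trivial wrapper under the old name.  The exported symbol stays available,
   so already-built callers keep linking.  */
void
demo_file_close (demo_file_t *f)
{
  demo_dict_close ((demo_dict_t *) f);
}
```

Because the old function is a real exported wrapper rather than a macro, binaries built against the old name keep working without recompilation, which is the property the commit message relies on when it says the ABI is unchanged.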
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_deduplicating_variables (ctf_dict_t *fp, ctf_dict_t **inputs,
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, what this does is traverse each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
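The flattening step described above can be sketched with stand-in types. This is a simplified model, not the code in ctf-link.c: `demo_dict_t`, `demo_archive_t`, `demo_count_inputs` and `demo_flatten_inputs` are hypothetical, every archive is assumed to hold exactly one parent, and a parent is assumed to be recorded as its own parent in the `parents` array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a dict and an archive of dicts.  */
typedef struct demo_dict
{
  const char *name;
} demo_dict_t;

typedef struct demo_archive
{
  demo_dict_t *parent;       /* Member holding the shared types.  */
  demo_dict_t **children;    /* Per-CU members, possibly none.  */
  size_t nchildren;
} demo_archive_t;

/* Count how many dicts the flattened input array will need.  */
size_t
demo_count_inputs (const demo_archive_t *archives, size_t narchives)
{
  size_t n = 0;
  size_t a;

  for (a = 0; a < narchives; a++)
    n += 1 + archives[a].nchildren;
  return n;
}

/* Explode all archives into inputs[], and record in parents[k] the index
   in inputs[] of the parent of inputs[k].  Returns the number of entries
   written, or 0 if CAPACITY is too small.  */
size_t
demo_flatten_inputs (const demo_archive_t *archives, size_t narchives,
                     demo_dict_t **inputs, uint32_t *parents,
                     size_t capacity)
{
  size_t n = 0;
  size_t a, c;

  if (demo_count_inputs (archives, narchives) > capacity)
    return 0;

  for (a = 0; a < narchives; a++)
    {
      size_t parent_idx = n;

      assert (archives[a].parent != NULL);
      inputs[n] = archives[a].parent;
      parents[n] = (uint32_t) parent_idx;   /* Assumed: parent maps to itself.  */
      n++;

      for (c = 0; c < archives[a].nchildren; c++)
        {
          inputs[n] = archives[a].children[c];
          parents[n] = (uint32_t) parent_idx;
          n++;
        }
    }
  return n;
}
```

Counting first and filling second mirrors the split between a count phase and an open phase named in the ChangeLog, and lets the caller allocate both arrays exactly once.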
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than rename the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
                                  size_t ninputs, int cu_mapped)
{
  size_t i;

  for (i = 0; i < ninputs; i++)
    {
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
      ctf_next_t *it = NULL;
      ctf_id_t type;
      const char *name;

include, libctf, ld: extend variable section to contain functions too
The CTF variable section is an optional (usually-not-present) section in
the CTF dict which contains name -> type mappings corresponding to data
symbols that are present in the linker input but not in the output
symbol table: the idea is that programs that use their own symbol-
resolution mechanisms can use this section to look up the types of
symbols they have found using their own mechanism.
Because these removed symbols (mostly static variables, functions, etc)
all have names that are unlikely to appear in the ELF symtab and because
very few programs have their own symbol-resolution mechanisms, a special
linker flag (--ctf-variables) is needed to emit this section.
Historically, we emitted only removed data symbols into the variable
section. This seemed to make sense at the time, but in hindsight it
really doesn't: functions are symbols too, and a C program can look them
up just like any other type. So extend the variable section so that it
contains all static function symbols too (if it is emitted at all), with
types of kind CTF_K_FUNCTION.
This is a little fiddly. We relied on compiler assistance for data
symbols: the compiler simply emits all data symbols twice, once into the
symtypetab as an indexed symbol and once into the variable section.
Rather than wait for a suitably adjusted compiler that does the same for
function symbols, we can pluck unreported function symbols out of the
symtab and add them to the variable section ourselves. While we're at
it, we do the same with data symbols: this is redundant right now
because the compiler does it, but it costs very little time and lets the
compiler drop this kludge and save a little space in .o files.
include/
* ctf.h: Mention the new things we can see in the variable
section.
ld/
* testsuite/ld-ctf/data-func-conflicted-vars.d: New test.
libctf/
* ctf-link.c (ctf_link_deduplicating_variables): Duplicate
symbols into the variable section too.
* ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename
to...
(symtypetab_delete_nonstatics): ... this. Check the funchash
when pruning redundant variables.
(ctf_symtypetab_sect_sizes): Adjust accordingly.
* NEWS: Describe this change.
2022-03-16 23:29:25 +08:00
      /* First the variables on the inputs. */

      while ((type = ctf_variable_next (inputs[i], &it, &name)) != CTF_ERR)
        {
          if (ctf_link_one_variable (fp, inputs[i], name, type, cu_mapped) < 0)
            {
              ctf_next_destroy (it);
              return -1; /* errno is set for us. */
            }
        }
      if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
        return ctf_set_errno (fp, ctf_errno (inputs[i]));

      /* Next the symbols. We integrate data symbols even though the compiler
         is currently doing the same, to allow the compiler to stop in
         future. */

      while ((type = ctf_symbol_next (inputs[i], &it, &name, 0)) != CTF_ERR)
        {
          if (ctf_link_one_variable (fp, inputs[i], name, type, 1) < 0)
            {
              ctf_next_destroy (it);
              return -1; /* errno is set for us. */
            }
        }
      if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
        return ctf_set_errno (fp, ctf_errno (inputs[i]));

      /* Finally the function symbols. */

      while ((type = ctf_symbol_next (inputs[i], &it, &name, 1)) != CTF_ERR)
        {
          if (ctf_link_one_variable (fp, inputs[i], name, type, 1) < 0)
            {
              ctf_next_destroy (it);
              return -1; /* errno is set for us. */
            }
        }
      if (ctf_errno (inputs[i]) != ECTF_NEXT_END)
        return ctf_set_errno (fp, ctf_errno (inputs[i]));
    }
  return 0;
}
|
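The loops in ctf_link_deduplicating_variables all follow the same iterator protocol: call a `*_next` function until it returns the error value, destroy the iterator on early exit, and afterwards check that the error is the end-of-iteration code rather than a real failure. The following self-contained model shows that protocol without needing libctf. Everything in it (`demo_dict_t`, `demo_next_t`, `demo_variable_next`, `demo_next_destroy`, `demo_visit_all`, `demo_count_cb`, and the `DEMO_*` codes) is hypothetical and only imitates the shape of the real calls.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_ERR (-1L)
enum { DEMO_OK = 0, DEMO_NEXT_END, DEMO_NOMEM };

typedef struct demo_var { const char *name; long type; } demo_var_t;
typedef struct demo_dict
{
  const demo_var_t *vars;
  size_t nvars;
  int err;                    /* Last error, like a per-dict errno.  */
} demo_dict_t;
typedef struct demo_next { size_t pos; } demo_next_t;

/* Return the next variable's type and name, or DEMO_ERR at the end (with
   err set to DEMO_NEXT_END and the iterator freed) or on failure.  */
long
demo_variable_next (demo_dict_t *d, demo_next_t **it, const char **name)
{
  demo_next_t *i = *it;

  if (i == NULL)
    {
      if ((i = calloc (1, sizeof (*i))) == NULL)
        {
          d->err = DEMO_NOMEM;
          return DEMO_ERR;
        }
      *it = i;
    }

  if (i->pos >= d->nvars)
    {
      free (i);
      *it = NULL;
      d->err = DEMO_NEXT_END;
      return DEMO_ERR;
    }

  *name = d->vars[i->pos].name;
  return d->vars[i->pos++].type;
}

void
demo_next_destroy (demo_next_t *it)
{
  free (it);
}

/* Example callback: count the entries visited.  */
int
demo_count_cb (const char *name, long type, void *arg)
{
  (void) name;
  (void) type;
  ++*(size_t *) arg;
  return 0;
}

/* Visit every variable; 0 on success, -1 if VISIT or the iterator fails.  */
int
demo_visit_all (demo_dict_t *d,
                int (*visit) (const char *, long, void *), void *arg)
{
  demo_next_t *it = NULL;
  const char *name;
  long type;

  while ((type = demo_variable_next (d, &it, &name)) != DEMO_ERR)
    {
      if (visit (name, type, arg) < 0)
        {
          demo_next_destroy (it);   /* Early exit: iterator is still live.  */
          return -1;
        }
    }
  return d->err == DEMO_NEXT_END ? 0 : -1;
}
```

The early-exit branch is the only place the caller frees the iterator; in this model, running to the end frees it automatically, which is why the loop needs no cleanup after it finishes.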
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
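As a rough sketch of what this layout implies for a reader: looking up a function symbol becomes a plain array index followed by a kind check. The names below (`demo_type_t`, `demo_func_type`, `DEMO_K_FUNCTION`) are hypothetical, type IDs are assumed to be 1-based indexes into a flat type table, and an entry of 0 is assumed to mean "no type recorded"; the real section is consulted through libctf rather than directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_K_FUNCTION 5     /* Illustrative kind value for this sketch.  */

typedef struct demo_type
{
  int kind;
} demo_type_t;

/* Return the type ID recorded for function symbol SYMIDX, or 0 if there is
   no entry, the ID is out of range, or the type is not a function.  */
uint32_t
demo_func_type (const uint32_t *funcinfo, size_t nfuncs,
                const demo_type_t *types, size_t ntypes, size_t symidx)
{
  uint32_t id;

  if (symidx >= nfuncs)
    return 0;

  id = funcinfo[symidx];
  if (id == 0 || id > ntypes)
    return 0;

  /* Each entry must name a function type in the type section.  */
  if (types[id - 1].kind != DEMO_K_FUNCTION)
    return 0;

  return id;
}
```

Because each entry is only an ID, two functions with the same prototype can point at the same type, which is what makes deduplication of function types possible.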
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
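The flag handling described above amounts to three small decisions, sketched here. The `DEMO_F_*` bit values and all function names are assumptions made for illustration, and the "old reader" is modelled as rejecting any flag bit it does not know about, which is one plausible way to get the refusal the text describes.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative flag bits; the real values live in ctf.h.  */
#define DEMO_F_COMPRESS    0x1u
#define DEMO_F_NEWFUNCINFO 0x2u

/* A new writer always marks its output as using the new format.  */
uint8_t
demo_writer_flags (uint8_t flags)
{
  return (uint8_t) (flags | DEMO_F_NEWFUNCINFO);
}

/* A new reader uses the function info section only if the flag is set;
   otherwise the section is disregarded.  */
int
demo_use_funcinfo (uint8_t flags)
{
  return (flags & DEMO_F_NEWFUNCINFO) != 0;
}

/* Modelled old reader: knows only DEMO_F_COMPRESS and refuses anything
   carrying a flag it does not recognise.  */
int
demo_old_reader_accepts (uint8_t flags)
{
  return (flags & ~DEMO_F_COMPRESS) == 0;
}
```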
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
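A name-keyed index of this kind can be modelled as follows. This is a simplification under stated assumptions: the real sections store string-table offsets and type IDs in separate parallel arrays, whereas this sketch keeps hypothetical `demo_idx_entry_t` pairs, and the `sorted` field stands in for the CTF_F_IDXSORTED header flag. `demo_symtypetab_t` and `demo_lookup_by_name` are likewise invented names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct demo_idx_entry
{
  const char *name;           /* Symbol name this entry describes.  */
  uint32_t type;              /* Type ID for that symbol.  */
} demo_idx_entry_t;

typedef struct demo_symtypetab
{
  demo_idx_entry_t *entries;
  size_t nentries;
  int sorted;                 /* Stand-in for the sorted-index flag.  */
} demo_symtypetab_t;

static int
demo_cmp (const void *a, const void *b)
{
  const demo_idx_entry_t *ea = a;
  const demo_idx_entry_t *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Look up NAME, sorting the index first if it is not marked sorted.
   Returns the type ID, or 0 if the name is not present.  */
uint32_t
demo_lookup_by_name (demo_symtypetab_t *tab, const char *name)
{
  demo_idx_entry_t key = { name, 0 };
  const demo_idx_entry_t *hit;

  assert (name != NULL);
  if (tab->nentries == 0)
    return 0;

  if (!tab->sorted)
    {
      /* Compiler-provided indexes may arrive in any order.  */
      qsort (tab->entries, tab->nentries, sizeof (*tab->entries), demo_cmp);
      tab->sorted = 1;
    }

  hit = bsearch (&key, tab->entries, tab->nentries,
                 sizeof (*tab->entries), demo_cmp);
  return hit != NULL ? hit->type : 0;
}
```

Sorting lazily on first lookup means a producer never pays for ordering, and a consumer pays for it at most once per table.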
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
/* Check for symbol conflicts during linking. Three possibilities: already
   exists, conflicting, or nonexistent. We don't have a dvd structure we can
   use as a flag like check_variable does, so we use a tristate return
   value instead: -1: conflicting; 1: nonexistent; 0: already exists. */

static int
check_sym (ctf_dict_t *fp, const char *name, ctf_id_t type, int functions)
{
  ctf_dynhash_t *thishash = functions ? fp->ctf_funchash : fp->ctf_objthash;
  ctf_dynhash_t *thathash = functions ? fp->ctf_objthash : fp->ctf_funchash;
  void *value;

  /* Wrong type (function when object is wanted, etc). */
  if (ctf_dynhash_lookup_kv (thathash, name, NULL, NULL))
    return -1;

  /* Not present at all yet. */
  if (!ctf_dynhash_lookup_kv (thishash, name, NULL, &value))
    return 1;

  /* Already present. */
  if ((ctf_id_t) (uintptr_t) value == type)
    return 0;

  /* Wrong type. */
  return -1;
}

/* Do a deduplicating link of one symtypetab (function info or data object) in
   one input dict. */

static int
ctf_link_deduplicating_one_symtypetab (ctf_dict_t *fp, ctf_dict_t *input,
                                       int cu_mapped, int functions)
{
  ctf_next_t *it = NULL;
  const char *name;
  ctf_id_t type;

  while ((type = ctf_symbol_next (input, &it, &name, functions)) != CTF_ERR)
    {
      ctf_id_t dst_type;
      ctf_dict_t *per_cu_out_fp;
      int sym;

      /* Look in the parent first. */
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now that the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
      if ((dst_type = ctf_dedup_type_mapping (fp, input, type)) == CTF_ERR)
        return -1; /* errno is set for us. */
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
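As a rough illustration of the format described above, the following sketch models a function info section as a flat array of uint32_t type IDs guarded by a header flag. It is not libctf code: every name prefixed with sketch_ or SKETCH_ is hypothetical, the flag's bit value and the use of 0 as a "no type" pad are assumptions, and real lookups go through a symbol-index translation step that is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not libctf's definitions.  */
#define SKETCH_F_NEWFUNCINFO 0x2u  /* assumed bit value for the new flag */
#define SKETCH_NO_TYPE 0u          /* assumed: a zero entry means "no type" */

typedef struct sketch_symtypetab
{
  uint32_t flags;            /* header flags */
  const uint32_t *funcinfo;  /* one type ID per function symbol */
  size_t nfuncs;             /* number of entries in funcinfo */
} sketch_symtypetab_t;

/* Return the type ID recorded for function symbol SYMIDX, or
   SKETCH_NO_TYPE if the flag is unset (the section is disregarded),
   the index is out of range, or the entry is a pad.  */
static uint32_t
sketch_func_type (const sketch_symtypetab_t *tab, size_t symidx)
{
  assert (tab != NULL);

  if (!(tab->flags & SKETCH_F_NEWFUNCINFO))
    return SKETCH_NO_TYPE;
  if (symidx >= tab->nfuncs)
    return SKETCH_NO_TYPE;
  return tab->funcinfo[symidx];
}
```

Because each entry is just a type ID, the function's prototype lives in the type section and can be deduplicated like any other type, which is the point of the format change.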
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
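The trade-off between a padded section and an indexed one can be sketched as a size comparison. This is only an illustration under stated assumptions: the actual rule is governed by CTF_INDEX_PAD_THRESHOLD, whose definition is not shown here, so the sketch substitutes a plain "which layout is smaller" test. All sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed by an unindexed section: one uint32_t slot per symbol
   index, including pads for symbols that have no type.  */
static size_t
sketch_padded_size (size_t nsyms)
{
  return nsyms * sizeof (uint32_t);
}

/* Bytes needed by an indexed section: one type ID plus one name
   reference per symbol that actually has a type (assumed layout).  */
static size_t
sketch_indexed_size (size_t ntyped)
{
  return ntyped * 2 * sizeof (uint32_t);
}

/* Return nonzero if emitting a name-sorted section plus index is
   strictly smaller than emitting a padded, symidx-ordered section.
   This stands in for the real threshold test, which is not shown.  */
static int
sketch_should_index (size_t nsyms, size_t ntyped)
{
  assert (ntyped <= nsyms);
  return sketch_indexed_size (ntyped) < sketch_padded_size (nsyms);
}
```

Under this model a dict that knows the types of only 10 symbols out of 1000 would be indexed, while one that knows all of them would stay padded.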
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
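The sort-if-unsorted-then-search behaviour described above can be sketched with the standard qsort and bsearch. This is a simplified model, not the libctf implementation: the real index stores strtab offsets in a section parallel to the type IDs, whereas this sketch pairs a name pointer with a type ID in one hypothetical struct, and an int stands in for the CTF_F_IDXSORTED header flag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical paired entry: symbol name and the ID of its type.  */
typedef struct sketch_idx_entry
{
  const char *name;
  uint32_t type;
} sketch_idx_entry_t;

/* Order entries by symbol name.  */
static int
sketch_cmp_entry (const void *a, const void *b)
{
  const sketch_idx_entry_t *ea = a;
  const sketch_idx_entry_t *eb = b;
  return strcmp (ea->name, eb->name);
}

/* Sort the index by name unless *SORTED says it already is, then
   record that it is now sorted.  */
static void
sketch_sort_index (sketch_idx_entry_t *idx, size_t n, int *sorted)
{
  assert (sorted != NULL);
  if (!*sorted)
    {
      qsort (idx, n, sizeof (*idx), sketch_cmp_entry);
      *sorted = 1;
    }
}

/* Binary-search a name-sorted index; return the type ID, or 0 if the
   name is absent (0 as "not found" is an assumption of this sketch).  */
static uint32_t
sketch_lookup (const sketch_idx_entry_t *idx, size_t n, const char *name)
{
  sketch_idx_entry_t key = { name, 0 };
  const sketch_idx_entry_t *hit
    = bsearch (&key, idx, n, sizeof (*idx), sketch_cmp_entry);
  return hit ? hit->type : 0;
}
```

Sorting once on first use keeps the compiler's job simple (it can emit entries in any order) while still giving logarithmic lookups afterwards.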
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
      if (dst_type != 0)
        {
          if (!ctf_assert (fp, ctf_type_isparent (fp, dst_type)))
            return -1; /* errno is set for us. */
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
      sym = check_sym (fp, name, dst_type, functions);

      /* Already present: next symbol. */
      if (sym == 0)
        continue;
      /* Not present: add it. */
      else if (sym > 0)
        {
          if (ctf_add_funcobjt_sym (fp, functions,
                                    name, dst_type) < 0)
            return -1; /* errno is set for us. */
          continue;
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
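
The layout described above can be sketched as a small self-contained model. This is not libctf code: the model_* names and the use of 0 as a pad value are assumptions made for illustration only. It shows how one lookup routine could serve both the function info and data object sections once they share the same shape.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for "this symbol has no recorded type" (a pad).  */
#define MODEL_PAD 0u

/* Model of an unindexed symbol type section: entry N holds the type ID
   of the Nth symbol.  In this model the function info and data object
   sections both have this shape.  */
typedef struct model_symtypetab
{
  const uint32_t *types;	/* One type ID per entry.  */
  size_t ntypes;		/* Number of entries.  */
} model_symtypetab_t;

/* Return the type ID recorded for entry IDX, or MODEL_PAD if IDX is out
   of range or the entry is a pad.  */
static uint32_t
model_symtypetab_lookup (const model_symtypetab_t *tab, size_t idx)
{
  assert (tab != NULL);
  if (idx >= tab->ntypes)
    return MODEL_PAD;
  return tab->types[idx];
}
```

For a function section, the returned ID would then be resolved in the type section and expected to be of kind CTF_K_FUNCTION.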
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
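
A minimal sketch of the kind of density check this implies, assuming the threshold is a fraction of pads among all symbols. The threshold value, the strict comparison and the model_* names are all hypothetical; the real CTF_INDEX_PAD_THRESHOLD and its use in symtypetab_density may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical value: not taken from the libctf sources.  */
#define MODEL_INDEX_PAD_THRESHOLD 0.75

/* Return 1 if a section describing NSYMS symbols, of which only NTYPED
   have a known type, would be mostly pads and so should be emitted in
   indexed form; return 0 if the unindexed form is preferable.  */
static int
model_should_index (size_t nsyms, size_t ntyped)
{
  assert (ntyped <= nsyms);
  if (nsyms == 0)
    return 0;

  size_t npads = nsyms - ntyped;
  return ((double) npads / (double) nsyms) > MODEL_INDEX_PAD_THRESHOLD;
}
```

The trade-off is space: an unindexed section spends one entry on every symbol, typed or not, while an indexed one spends two entries (name and type) only on the symbols that have types.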
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
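
The sort-if-needed and lookup steps can be sketched as follows. This is a simplified model rather than the libctf implementation: it pairs each name with its type in one struct, whereas the commit describes a separate index section, and the model_* names and the plain int flag standing in for CTF_F_IDXSORTED are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One entry of a model index: a symbol name and the type ID it maps to.  */
typedef struct model_idx_entry
{
  const char *name;
  uint32_t type;
} model_idx_entry_t;

/* Order entries by name, for both qsort and bsearch.  */
static int
model_idx_cmp (const void *a, const void *b)
{
  const model_idx_entry_t *ea = a;
  const model_idx_entry_t *eb = b;
  return strcmp (ea->name, eb->name);
}

/* Sort the index unless *SORTED says it already is, then record that.  */
static void
model_idx_sort (model_idx_entry_t *idx, size_t n, int *sorted)
{
  assert (idx != NULL && sorted != NULL);
  if (!*sorted)
    {
      qsort (idx, n, sizeof (*idx), model_idx_cmp);
      *sorted = 1;
    }
}

/* Binary-search a sorted index for NAME: return its type ID, or 0 if
   the name is not present.  */
static uint32_t
model_idx_lookup (const model_idx_entry_t *idx, size_t n, const char *name)
{
  model_idx_entry_t key = { name, 0 };
  const model_idx_entry_t *hit
    = bsearch (&key, idx, n, sizeof (*idx), model_idx_cmp);
  return hit != NULL ? hit->type : 0;
}
```

Sorting once on first use lets every later lookup be a binary search, which is why a producer that does not sort (such as the compiler) costs the reader only a one-off sort.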
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
        }
    }

  /* Can't add to the parent due to a name clash (most unlikely), or because
     it references a type only present in the child. Try adding to the
     child, creating if need be. If we can't do that, skip it. Don't add
     to a child if we're doing a CU-mapped link, since that has only one
     output. */
  if (cu_mapped)
    {
      ctf_dprintf ("Symbol %s in input file %s depends on a type %lx "
                   "hidden due to conflicts: skipped.\n", name,
ctf_unnamed_cuname (input), type);
continue;
}
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
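
The output-side renaming described above can be sketched roughly as follows. This is a minimal illustration only, not the code in ctf_new_per_cu_name: the struct name_set container, the helper names, and the '#' separator are all assumptions made for the example (libctf keeps its outputs in a hashtable).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the set of per-CU output names already in
   use.  libctf itself tracks these in a hashtable.  */
struct name_set
{
  const char *names[16];
  size_t count;
};

/* Return 1 if NAME is already present in SET, 0 otherwise.  */
static int
name_in_use (const struct name_set *set, const char *name)
{
  for (size_t i = 0; i < set->count; i++)
    if (strcmp (set->names[i], name) == 0)
      return 1;
  return 0;
}

/* Write into BUF a name derived from CUNAME that clashes with nothing in
   SET.  The bare name is used if it is free; otherwise an incrementing
   integer, starting at 0, is appended.  The '#' separator is an
   assumption.  Returns 0 on success, -1 if BUF is too small.  */
static int
unique_cu_name (const struct name_set *set, const char *cuname,
                char *buf, size_t buflen)
{
  int n;

  assert (buf != NULL && buflen > 0);

  n = snprintf (buf, buflen, "%s", cuname);
  if (n < 0 || (size_t) n >= buflen)
    return -1;

  for (unsigned int suffix = 0; name_in_use (set, buf); suffix++)
    {
      n = snprintf (buf, buflen, "%s#%u", cuname, suffix);
      if (n < 0 || (size_t) n >= buflen)
        return -1;
    }
  return 0;
}
```

The only property that matters is the one stated above: distinct dicts end up with distinct names, whatever the suffix looks like.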
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
if ((per_cu_out_fp = ctf_create_per_cu (fp, input, NULL)) == NULL)
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
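
As a rough picture of the new layout, an unindexed function info (or data object) section can be thought of as a flat array of type IDs, one per symbol. The sketch below is illustrative only: struct symtypetab and symtypetab_lookup are hypothetical names, and treating type ID 0 as "no type recorded" (a pad) is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory view of an unindexed symbol type section:
   one uint32_t type ID per symbol, in symbol order.  */
struct symtypetab
{
  const uint32_t *types;
  size_t ntypes;
};

/* Return the type ID recorded at position IDX in the section, or 0 if
   IDX is out of range.  A stored 0 is assumed to be a pad, i.e. a
   symbol with no known type.  */
static uint32_t
symtypetab_lookup (const struct symtypetab *tab, size_t idx)
{
  assert (tab != NULL);

  if (idx >= tab->ntypes)
    return 0;
  return tab->types[idx];
}
```

Because each entry is just a type ID, two functions with the same prototype can share one CTF_K_FUNCTION type in the type section.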
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
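
The sort-then-search idea behind indexed sections can be sketched as below. This is a simplification under stated assumptions: the real format keeps the index (names) and the type entries in separate parallel sections, whereas this example pairs them in one hypothetical struct sym_entry, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pairing of a symbol name with its type ID.  */
struct sym_entry
{
  const char *name;
  uint32_t type;
};

/* qsort/bsearch comparator: order entries by symbol name.  */
static int
cmp_entry (const void *a, const void *b)
{
  const struct sym_entry *ea = a;
  const struct sym_entry *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Sort an index that arrived unsorted (for instance, straight from the
   compiler), so that later lookups can use a binary search.  */
static void
index_sort (struct sym_entry *entries, size_t n)
{
  qsort (entries, n, sizeof (*entries), cmp_entry);
}

/* Look NAME up in a sorted index.  Returns its type ID, or 0 if the
   name is absent (0 is assumed to mean "no type").  */
static uint32_t
index_lookup (const struct sym_entry *entries, size_t n, const char *name)
{
  struct sym_entry key = { name, 0 };
  const struct sym_entry *found;

  assert (name != NULL);

  found = bsearch (&key, entries, n, sizeof (*entries), cmp_entry);
  return found != NULL ? found->type : 0;
}
```

Sorting once on first use keeps the producer simple: it can emit entries in any order and leave the CTF_F_IDXSORTED flag unset.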
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
return -1; /* errno is set for us. */
/* If the type was not found, check for it in the child too. */
if (dst_type == 0)
{
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now that the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
if ((dst_type = ctf_dedup_type_mapping (per_cu_out_fp,
input, type)) == CTF_ERR)
return -1; /* errno is set for us. */
if (dst_type == 0)
{
ctf_err_warn (fp, 1, 0,
_("type %lx for symbol %s in input file %s "
"not found: skipped"), type, name,
ctf_unnamed_cuname (input));
|
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
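As a rough illustration of the layout described above, the section can be modelled as a flat array with one type ID per symbol slot. This is only a sketch: the `sketch_` names are hypothetical and not part of libctf, and treating 0 as the "no type recorded" pad value is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a function info or data object section: one
   uint32_t type ID per slot.  Using 0 as the pad value ("no type
   recorded for this symbol") is an assumption of this sketch.  */
typedef struct sketch_symtypetab
{
  const uint32_t *entries;
  size_t nentries;
} sketch_symtypetab_t;

/* Return the type ID stored in slot SLOT, or 0 if the slot is out of
   range or holds a pad.  */
static uint32_t
sketch_symtypetab_lookup (const sketch_symtypetab_t *tab, size_t slot)
{
  assert (tab != NULL);
  if (slot >= tab->nentries)
    return 0;
  return tab->entries[slot];
}
```

Because both sections share this shape, one lookup routine can serve function and data symbols alike; only the kind of the type it yields differs.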
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
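The reader-side rule above (use the function info section only when the flag is set, otherwise disregard it) might be sketched as follows. The `SKETCH_` names and the bit value are placeholders chosen for illustration; the real definition of CTF_F_NEWFUNCINFO lives in ctf.h.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit value for illustration only: see ctf.h for the real
   CTF_F_NEWFUNCINFO definition.  */
#define SKETCH_F_NEWFUNCINFO 0x2u

typedef enum sketch_funcinfo_action
{
  SKETCH_FUNCINFO_DISREGARD,    /* Flag absent: ignore the section.  */
  SKETCH_FUNCINFO_USE           /* Flag present: array of type IDs.  */
} sketch_funcinfo_action_t;

/* Decide what a reader should do with the function info section given
   the flags word from the CTF header.  */
static sketch_funcinfo_action_t
sketch_funcinfo_action (uint32_t header_flags)
{
  if (header_flags & SKETCH_F_NEWFUNCINFO)
    return SKETCH_FUNCINFO_USE;
  return SKETCH_FUNCINFO_DISREGARD;
}
```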
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
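The indexed form described above can be sketched as a table of (name, type) pairs that is sorted by name on first use if it is not already sorted, then searched with bsearch. This is a simplified model and not the libctf implementation: the `sketch_` names are hypothetical, names are held as C strings rather than strtab offsets, the pad threshold is passed as a percentage parameter instead of being taken from CTF_INDEX_PAD_THRESHOLD, and 0 stands for "no type found".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One hypothetical index entry: a symbol name and its type ID.  */
typedef struct sketch_idx_entry
{
  const char *name;
  uint32_t type;
} sketch_idx_entry_t;

static int
sketch_idx_cmp (const void *a, const void *b)
{
  const sketch_idx_entry_t *ea = a;
  const sketch_idx_entry_t *eb = b;
  return strcmp (ea->name, eb->name);
}

/* Return nonzero if a section with NSYMBOLS slots, of which NTYPED
   carry a type, has more than MAX_PAD_PERCENT percent pads and so is
   worth emitting in indexed form.  */
static int
sketch_want_index (size_t nsymbols, size_t ntyped, size_t max_pad_percent)
{
  size_t npads;

  assert (ntyped <= nsymbols);
  npads = nsymbols - ntyped;
  return npads * 100 > nsymbols * max_pad_percent;
}

/* Look NAME up in the index, sorting it first if the producer did not
   mark it as sorted.  Returns the type ID, or 0 if NAME is absent.  */
static uint32_t
sketch_idx_lookup (sketch_idx_entry_t *idx, size_t n, int sorted,
                   const char *name)
{
  sketch_idx_entry_t key = { name, 0 };
  sketch_idx_entry_t *hit;

  if (!sorted)
    qsort (idx, n, sizeof (*idx), sketch_idx_cmp);
  hit = bsearch (&key, idx, n, sizeof (*idx), sketch_idx_cmp);
  return hit != NULL ? hit->type : 0;
}
```

Sorting once on the reader side is what lets a producer such as the compiler emit entries in whatever order is convenient.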
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
              continue;
            }
        }

      sym = check_sym (per_cu_out_fp, name, dst_type, functions);

      /* Already present: next symbol. */
      if (sym == 0)
        continue;
      /* Not present: add it. */
      else if (sym > 0)
        {
          if (ctf_add_funcobjt_sym (per_cu_out_fp, functions,
                                    name, dst_type) < 0)
            return -1; /* errno is set for us. */
        }
      else
        {
          /* Perhaps this should be an assertion failure. */
          ctf_err_warn (fp, 0, ECTF_DUPLICATE,
                        _("symbol %s in input file %s found conflicting "
                          "even when trying in per-CU dict."), name,
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
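The idea of answering input-to-output queries directly from the deduplicator's own tables, rather than from a separately populated hashtab, can be sketched like this. It is a toy model under stated assumptions: the `sketch_` names are hypothetical, type hashes are plain strings, linear scans stand in for the real hash tables, and 0 means "no mapping".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of the hash computed for one input type.  */
typedef struct sketch_input_hash
{
  uint32_t input_num;           /* Which input dict the type is in.  */
  uint32_t in_type;             /* Type ID within that input.  */
  const char *hash;             /* Hash of the type's content.  */
} sketch_input_hash_t;

/* Hypothetical record of one type emitted into an output dict.  */
typedef struct sketch_emission
{
  const char *hash;
  uint32_t out_type;            /* Type ID within the output.  */
} sketch_emission_t;

/* Map (INPUT_NUM, IN_TYPE) to an output type ID by composing the two
   tables: input type -> hash -> output type.  No third table has to be
   populated.  Returns 0 if either step finds nothing.  */
static uint32_t
sketch_dedup_type_mapping (const sketch_input_hash_t *hashes, size_t nhashes,
                           const sketch_emission_t *emitted, size_t nemitted,
                           uint32_t input_num, uint32_t in_type)
{
  const char *hash = NULL;
  size_t i;

  assert (hashes != NULL && emitted != NULL);

  for (i = 0; i < nhashes; i++)
    if (hashes[i].input_num == input_num && hashes[i].in_type == in_type)
      {
        hash = hashes[i].hash;
        break;
      }
  if (hash == NULL)
    return 0;

  for (i = 0; i < nemitted; i++)
    if (strcmp (emitted[i].hash, hash) == 0)
      return emitted[i].out_type;
  return 0;
}
```

Two input types with the same hash resolve to the same output ID, which is the property variable and symbol linking rely on.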
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
                        ctf_unnamed_cuname (input));
          return (ctf_set_errno (fp, ECTF_DUPLICATE));
        }
    }
  if (ctf_errno (input) != ECTF_NEXT_END)
    {
      ctf_set_errno (fp, ctf_errno (input));
      ctf_err_warn (fp, 0, ctf_errno (input),
                    functions ? _("iterating over function symbols") :
                    _("iterating over data symbols"));
      return -1;
    }

  return 0;
}

/* Do a deduplicating link of the function info and data objects
   in the inputs. */
static int
ctf_link_deduplicating_syms (ctf_dict_t *fp, ctf_dict_t **inputs,
                             size_t ninputs, int cu_mapped)
{
  size_t i;

  for (i = 0; i < ninputs; i++)
    {
      if (ctf_link_deduplicating_one_symtypetab (fp, inputs[i],
                                                 cu_mapped, 0) < 0)
        return -1; /* errno is set for us. */

      if (ctf_link_deduplicating_one_symtypetab (fp, inputs[i],
                                                 cu_mapped, 1) < 0)
        return -1; /* errno is set for us. */
    }

  return 0;
}

libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
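The flattening step described above, which turns a set of archives into one inputs array plus a parallel parents array, might look roughly like the sketch below. Everything here is hypothetical scaffolding: dicts are stood in for by name strings, the `sketch_` names are not libctf's, and recording a parent's own index as its "parent" is a convention chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical archive: one shared parent member plus its children,
   each represented here by a name string only.  */
typedef struct sketch_archive
{
  const char *parent;
  const char *const *children;
  size_t nchildren;
} sketch_archive_t;

/* Flatten ARCHIVES into INPUTS, filling PARENTS so that parents[i] is
   the index in INPUTS of the parent of inputs[i].  A parent dict is
   recorded as its own parent (an assumption of this sketch).  Returns
   the number of inputs written, or 0 if MAX slots are not enough.  */
static size_t
sketch_flatten_inputs (const sketch_archive_t *archives, size_t narchives,
                       const char **inputs, uint32_t *parents, size_t max)
{
  size_t n = 0, a, c;

  assert (archives != NULL && inputs != NULL && parents != NULL);

  for (a = 0; a < narchives; a++)
    {
      size_t parent_idx = n;

      if (n + 1 + archives[a].nchildren > max)
        return 0;

      inputs[n] = archives[a].parent;
      parents[n++] = (uint32_t) parent_idx;

      for (c = 0; c < archives[a].nchildren; c++)
        {
          inputs[n] = archives[a].children[c];
          parents[n++] = (uint32_t) parent_idx;
        }
    }
  return n;
}
```

Keeping the two arrays parallel means the deduplicator can find any input's parent with a single array read.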
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than rename the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
/* Do the per-CU part of a deduplicating link. */
static int
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_deduplicating_per_cu (ctf_dict_t *fp)
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
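The flattening step in the paragraph above can be pictured with a small model. This is a sketch under stated assumptions, not the libctf implementation: the demo_* types and demo_open_inputs are hypothetical, each archive is assumed to hold its parent as member 0, and recording a parent as its own parent is a convention chosen only for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a CTF dict and a CTF archive.  */
typedef struct demo_dict
{
  const char *name;
} demo_dict_t;

typedef struct demo_archive
{
  demo_dict_t *members;   /* members[0] is the parent; the rest are children.  */
  size_t nmembers;
} demo_archive_t;

/* Flatten all members of all archives into one array of dict pointers and
   build a parallel array giving each input's parent index.  A parent is
   recorded as its own parent (a convention chosen for this sketch).
   Returns the number of inputs, or -1 on allocation failure.  */
long
demo_open_inputs (demo_archive_t *arcs, size_t narcs,
                  demo_dict_t ***inputs_out, uint32_t **parents_out)
{
  size_t total = 0, n = 0, a, m;
  demo_dict_t **inputs;
  uint32_t *parents;

  for (a = 0; a < narcs; a++)
    total += arcs[a].nmembers;

  /* Allocate at least one element so that zero inputs is not an error.  */
  inputs = calloc (total ? total : 1, sizeof (*inputs));
  parents = calloc (total ? total : 1, sizeof (*parents));
  if (inputs == NULL || parents == NULL)
    {
      free (inputs);
      free (parents);
      return -1;
    }

  for (a = 0; a < narcs; a++)
    {
      size_t parent_i = n;    /* Index the archive's parent will occupy.  */

      for (m = 0; m < arcs[a].nmembers; m++, n++)
        {
          inputs[n] = &arcs[a].members[m];
          parents[n] = (uint32_t) parent_i;
        }
    }

  *inputs_out = inputs;
  *parents_out = parents;
  return (long) total;
}
```

A deduplicator handed both arrays can find the parent of input i as inputs[parents[i]] without knowing anything about the archives the inputs came from.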
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
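The second optimization amounts to a three-way decision per output CU. The sketch below only models that decision; demo_plan_cu_mapping and the enum are hypothetical names, and has_only_input stands for "the inputs are a single archive with unconflicting contents".

```c
#include <assert.h>

typedef enum
{
  DEMO_CU_SKIP,          /* No inputs map to this output CU.  */
  DEMO_CU_RENAME_ONLY,   /* One unconflicted input: just change its CU name.  */
  DEMO_CU_DEDUP          /* Several inputs: run the per-CU dedup pass.  */
} demo_cu_action_t;

/* Decide what the CU-mapped link phase has to do for one output CU.  */
demo_cu_action_t
demo_plan_cu_mapping (long ninputs, int has_only_input)
{
  if (ninputs == 0)
    return DEMO_CU_SKIP;
  if (has_only_input && ninputs == 1)
    return DEMO_CU_RENAME_ONLY;
  return DEMO_CU_DEDUP;
}
```

Only the last case pays for a deduplication pass, which is where the time and memory savings for large CU-mapped links come from.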
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.c (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
{
  ctf_next_t *i = NULL;
  int err;
  void *out_cu;
  void *in_cus;

  /* Links with a per-CU mapping in force get a first pass of deduplication,
     dedupping the inputs for a given CU mapping into the output for that
     mapping. The outputs from this process get fed back into the final pass
     that is carried out even for non-CU links. */

  while ((err = ctf_dynhash_next (fp->ctf_link_out_cu_mapping, &i, &out_cu,
                                  &in_cus)) == 0)
    {
      const char *out_name = (const char *) out_cu;
      ctf_dynhash_t *in = (ctf_dynhash_t *) in_cus;
      ctf_dict_t *out = NULL;
      ctf_dict_t **inputs;
      ctf_dict_t **outputs;
      ctf_archive_t *in_arc;
      ssize_t ninputs;
      ctf_link_input_t *only_input;
      uint32_t noutputs;
      uint32_t *parents;

      if ((ninputs = ctf_link_deduplicating_count_inputs (fp, in,
                                                          &only_input)) == -1)
        goto err_open_inputs;

      /* CU mapping with no inputs? Skip. */
      if (ninputs == 0)
        continue;

      if (labs ((long int) ninputs) > 0xfffffffe)
        {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
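The iteration rule described above can be modelled in a few lines. This is a simplified sketch, not the libctf API: all demo_* names and the DEMO_NEXT_END value are hypothetical, there is no iterator argument, and messages are caller-owned list nodes popped from the front. What it does preserve is the rule that a NULL dict selects the open-errors list, and that end-of-iteration is reported through errp when one is passed, leaving the dict's own error code untouched.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NEXT_END 1052      /* Hypothetical end-of-iteration code.  */

typedef struct demo_msg
{
  const char *text;
  int is_warning;
  struct demo_msg *next;
} demo_msg_t;

typedef struct demo_dict
{
  demo_msg_t *errs;             /* Errors and warnings for this dict.  */
  int err;                      /* Last error code set on this dict.  */
} demo_dict_t;

/* Errors raised when there is no dict to attach them to (open time).  */
static demo_msg_t *demo_open_errors;

/* Append MSG to FP's list, or to the open-errors list if FP is NULL.  */
void
demo_err_warn (demo_dict_t *fp, demo_msg_t *msg)
{
  demo_msg_t **tail = fp ? &fp->errs : &demo_open_errors;

  while (*tail != NULL)
    tail = &(*tail)->next;
  msg->next = NULL;
  *tail = msg;
}

/* Pop and return the next message, or NULL at the end.  The end is
   reported via ERRP if given; FP's error code is used only otherwise.  */
const char *
demo_errwarning_next (demo_dict_t *fp, int *is_warning, int *errp)
{
  demo_msg_t **list = fp ? &fp->errs : &demo_open_errors;
  demo_msg_t *msg = *list;

  if (msg == NULL)
    {
      if (errp != NULL)
        *errp = DEMO_NEXT_END;
      else if (fp != NULL)
        fp->err = DEMO_NEXT_END;
      return NULL;
    }

  *list = msg->next;
  if (is_warning != NULL)
    *is_warning = msg->is_warning;
  return msg->text;
}
```

A caller reporting a failed open would loop on demo_errwarning_next (NULL, &is_warning, &err) until it returns NULL, then check that err is the end-of-iteration code rather than a real failure.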
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
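The split between the two streams can be sketched as below. This is an illustration only: demo_streams_t and this demo_err_warn are hypothetical, the streams are modelled as fixed-size buffers, and strerror stands in for whatever maps an error number to its message. The point is that the error number decorates the debug text but not the text the user sees.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

typedef struct demo_streams
{
  char user[256];               /* What the error/warning stream would get.  */
  char debug[512];              /* What the debug stream would get.  */
} demo_streams_t;

/* Record an error message.  ERR, if nonzero, is appended to the debug
   text only: the caller is expected to report it once at failure time.  */
void
demo_err_warn (demo_streams_t *out, int err, const char *msg)
{
  snprintf (out->user, sizeof (out->user), "%s", msg);

  if (err != 0)
    snprintf (out->debug, sizeof (out->debug), "%s: %s", msg, strerror (err));
  else
    snprintf (out->debug, sizeof (out->debug), "%s", msg);
}
```

With this split, a chain of three errors caused by one underlying failure produces three distinct user-visible lines, and the underlying cause is printed once by the caller.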
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
          ctf_err_warn (fp, 0, EFBIG, _("too many inputs in deduplicating "
                                        "link: %li"), (long int) ninputs);
          ctf_set_errno (fp, EFBIG);
          goto err_open_inputs;
        }

      /* Short-circuit: a cu-mapped link with only one input archive with
         unconflicting contents is a do-nothing, and we can just leave the input
         in place: we do have to change the cuname, though, so we unwrap it,
         change the cuname, then stuff it back in the linker input again, via
         the clin_fp short-circuit member. ctf_link_deduplicating_open_inputs
         will spot this member and jam it straight into the next link phase,
         ignoring the corresponding archive. */
      if (only_input && ninputs == 1)
        {
          ctf_next_t *ai = NULL;
          int err;

          /* We can abuse an archive iterator to get the only member cheaply, no
             matter what its name. */
          only_input->clin_fp = ctf_archive_next (only_input->clin_arc,
                                                  &ai, NULL, 0, &err);
          if (!only_input->clin_fp)
            {
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
              ctf_err_warn (fp, 0, err, _("cannot open archive %s in "
                                          "CU-mapped CTF link"),
                            only_input->clin_filename);
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
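The flattening step described above can be sketched in isolation. This is a toy model only: struct toy_dict, struct toy_archive and toy_explode are hypothetical names, not libctf types or functions, and the convention that a dict without a parent records its own index is an assumption made for this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a CTF dict and a CTF archive.  */
struct toy_dict { const char *name; };

struct toy_archive
{
  struct toy_dict *parent;      /* Shared parent member, or NULL.  */
  struct toy_dict **children;   /* Child members, one per CU.  */
  size_t nchildren;
};

/* Flatten ARCS into INPUTS.  PARENTS[i] receives the index within INPUTS
   of the parent of INPUTS[i]; a dict with no parent records its own
   index (an assumption of this sketch).  Returns the number of entries
   written.  */
size_t
toy_explode (const struct toy_archive *arcs, size_t narcs,
             struct toy_dict **inputs, uint32_t *parents, size_t cap)
{
  size_t n = 0;

  for (size_t a = 0; a < narcs; a++)
    {
      size_t parent_idx = n;
      size_t needed = arcs[a].nchildren + (arcs[a].parent != NULL ? 1 : 0);

      assert (n + needed <= cap);

      if (arcs[a].parent != NULL)
        {
          inputs[n] = arcs[a].parent;
          parents[n] = (uint32_t) n;
          n++;
        }

      for (size_t c = 0; c < arcs[a].nchildren; c++)
        {
          inputs[n] = arcs[a].children[c];
          parents[n] = (arcs[a].parent != NULL
                        ? (uint32_t) parent_idx : (uint32_t) n);
          n++;
        }
    }
  return n;
}
```

A deduplicator written against this model would receive the two parallel arrays and could look up the parent of inputs[i] as inputs[parents[i]] without ever touching the archives again.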
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than rename the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
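The second optimization amounts to a small decision per output CU, which can be sketched as follows. The enum and the function toy_cu_mapped_action are hypothetical names invented for this illustration, and the has_children flag stands in for whatever check the real code performs; only the strcmp on the input and output names mirrors the code visible in this file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical outcomes for one CU-mapped output.  */
enum toy_cu_action
{
  TOY_CU_SKIP,    /* One input, already carrying the output name.  */
  TOY_CU_RENAME,  /* One input: only the CU name needs changing.  */
  TOY_CU_DEDUP    /* Real many-to-one mapping: run a dedup pass.  */
};

/* Decide what the CU-mapped phase must do for an output named OUT_NAME.
   IN_NAME is only consulted when there is exactly one input.  */
enum toy_cu_action
toy_cu_mapped_action (size_t ninputs, int has_children,
                      const char *in_name, const char *out_name)
{
  assert (out_name != NULL);

  if (ninputs == 1 && !has_children)
    {
      assert (in_name != NULL);
      return strcmp (in_name, out_name) != 0 ? TOY_CU_RENAME : TOY_CU_SKIP;
    }
  return TOY_CU_DEDUP;
}
```

Only the TOY_CU_DEDUP case pays for an intermediate output dict, which is where the memory saving described above would come from in this model.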
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
              ctf_set_errno (fp, err);
              goto err_open_inputs;
            }
          ctf_next_destroy (ai);

          if (strcmp (only_input->clin_filename, out_name) != 0)
            {
              /* Renaming. We need to add a new input, then null out the
                 clin_arc and clin_fp of the old one to stop it being
                 auto-closed on removal. The new input needs its cuname changed
                 to out_name, which is doable only because the cuname is a
                 dynamic property which can be changed even in readonly
                 dicts. */

              ctf_cuname_set (only_input->clin_fp, out_name);
              if (ctf_link_add_ctf_internal (fp, only_input->clin_arc,
                                             only_input->clin_fp,
                                             out_name) < 0)
                {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
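The end-of-iteration rule can be modelled in a few lines. This is a toy, not the libctf implementation: struct toy_errdict, toy_open_errors, toy_errwarning_next and TOY_NEXT_END are hypothetical stand-ins (the last for ECTF_NEXT_END), and the position-based iterator is a simplification assumed for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NEXT_END 1          /* Hypothetical stand-in for ECTF_NEXT_END.  */

/* Hypothetical dict holding its own messages and last error code.  */
struct toy_errdict
{
  int errnum;
  const char *msgs[4];
  size_t nmsgs;
};

/* Stand-in for the static list of open-time errors.  */
static const char *toy_open_errors[4];
static size_t toy_nopen_errors;

/* Return the next message for FP, or from the open-errors list if FP is
   NULL.  At the end of iteration, report TOY_NEXT_END through ERRP if it
   is given; FP's own error code is touched only when ERRP is NULL.  */
const char *
toy_errwarning_next (struct toy_errdict *fp, size_t *pos, int *errp)
{
  const char **msgs = fp != NULL ? fp->msgs : toy_open_errors;
  size_t n = fp != NULL ? fp->nmsgs : toy_nopen_errors;

  assert (pos != NULL);

  if (*pos < n)
    return msgs[(*pos)++];

  if (errp != NULL)
    *errp = TOY_NEXT_END;
  else if (fp != NULL)
    fp->errnum = TOY_NEXT_END;
  return NULL;
}
```

In this model a caller that has just seen a failure can drain the messages with a non-NULL errp and still read the original error code from the dict afterwards.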
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
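One way to picture the split between the two streams is the toy formatter below. It is a sketch under assumptions: toy_format is a hypothetical function, the two buffers stand in for the error/warning list and the debug stream, and the "message: errmsg" layout is chosen for illustration rather than taken from libctf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format one report.  ERRMSG is the text for the error number, or NULL
   if there is none.  The debug text always carries ERRMSG when present;
   the user-visible text carries it only for warnings, since errors are
   expected to be reported once by the caller at failure time.  */
void
toy_format (int is_warning, const char *msg, const char *errmsg,
            char *user_buf, size_t user_len,
            char *debug_buf, size_t debug_len)
{
  assert (msg != NULL);

  if (errmsg != NULL)
    snprintf (debug_buf, debug_len, "%s: %s", msg, errmsg);
  else
    snprintf (debug_buf, debug_len, "%s", msg);

  if (is_warning && errmsg != NULL)
    snprintf (user_buf, user_len, "%s: %s", msg, errmsg);
  else
    snprintf (user_buf, user_len, "%s", msg);
}
```

With this shape, a chain of three errors leading to one failure yields three short user-visible lines plus the caller's single report of the underlying cause, instead of the cause being repeated on every line.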
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
                  ctf_err_warn (fp, 0, 0, _("cannot add intermediate files "
                                            "to link"));
                  goto err_open_inputs;
                }
              only_input->clin_arc = NULL;
              only_input->clin_fp = NULL;
              ctf_dynhash_remove (fp->ctf_link_inputs,
                                  only_input->clin_filename);
            }
          continue;
        }

      /* This is a real CU many-to-one mapping: we must dedup the inputs into
         a new output to be used in the final link phase. */

      if ((inputs = ctf_link_deduplicating_open_inputs (fp, in, ninputs,
                                                        &parents)) == NULL)
        {
          ctf_next_destroy (i);
          goto err_inputs;
        }

      if ((out = ctf_create (&err)) == NULL)
        {
          ctf_err_warn (fp, 0, err, _("cannot create per-CU CTF archive "
                                      "for %s"),
                        out_name);
          ctf_set_errno (fp, err);
          goto err_inputs;
        }

      /* Share the atoms table to reduce memory usage. */
      out->ctf_dedup_atoms = fp->ctf_dedup_atoms_alloc;

      /* No ctf_imports at this stage: this per-CU dictionary has no parents.
         Parent/child deduplication happens in the link's final pass. However,
         the cuname *is* important, as it is propagated into the final
         dictionary. */
      ctf_cuname_set (out, out_name);

      if (ctf_dedup (out, inputs, ninputs, parents, 1) < 0)
        {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
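The iteration rule described above can be modelled in a few lines. This is a toy sketch only: `toy_dict_t`, `toy_errwarning_next` and `TOY_NEXT_END` are hypothetical stand-ins invented for illustration, not the libctf types or the real `ctf_errwarning_next` signature. It shows just the two behaviours the message describes: a NULL fp reads from a separate open-errors list, and the end-of-iteration code is written to the dict only when no errp pointer is supplied.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins: these are NOT the libctf types or constants.  */
enum { TOY_NEXT_END = 1 };

typedef struct toy_dict
{
  const char **msgs;    /* Queued error/warning text.  */
  size_t nmsgs;         /* Number of queued messages.  */
  int err;              /* Last error recorded on this dict.  */
} toy_dict_t;

/* Messages queued while no dict exists yet (the "open errors").  */
static const char *toy_open_msgs[] = { "toy open failure" };
static toy_dict_t toy_open_errors = { toy_open_msgs, 1, 0 };

/* Return the next queued message, or NULL at the end of iteration.
   A NULL fp selects the open-errors list.  The end-of-iteration code
   goes to *errp when errp is given; the dict's own error code is
   overwritten only when errp is NULL.  */
static const char *
toy_errwarning_next (toy_dict_t *fp, size_t *it, int *errp)
{
  toy_dict_t *src = fp != NULL ? fp : &toy_open_errors;

  assert (it != NULL);

  if (*it < src->nmsgs)
    return src->msgs[(*it)++];

  if (errp != NULL)
    *errp = TOY_NEXT_END;
  else if (fp != NULL)
    fp->err = TOY_NEXT_END;
  return NULL;
}
```

A caller that is reporting a failure would pass an errp pointer, so the error code that prompted the report survives the walk over the message list.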
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
          ctf_set_errno (fp, ctf_errno (out));
          ctf_err_warn (fp, 0, 0, _("CU-mapped deduplication failed for %s"),
                        out_name);
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
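The pairing of an inputs array with a parallel parents array can be sketched as follows. Everything here is an illustrative assumption: `toy_member_t` and `toy_build_parents` are hypothetical names, parents are matched by name, and a dict with no parent is recorded as its own parent purely as a convention of this sketch. The real code in ctf-link.c works on opened dicts rather than on names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened archive member: NOT a libctf type.  */
typedef struct toy_member
{
  const char *name;         /* Member name within the flattened inputs.  */
  const char *parent_name;  /* NULL for a dict that has no parent.  */
} toy_member_t;

/* Fill parents[i] with the index in INPUTS of input i's parent.  A
   dict with no parent is recorded as its own parent (a convention of
   this sketch only).  Returns 0 on success, -1 if a named parent is
   not among the inputs.  */
static int
toy_build_parents (const toy_member_t *inputs, uint32_t ninputs,
                   uint32_t *parents)
{
  assert (ninputs == 0 || (inputs != NULL && parents != NULL));

  for (uint32_t i = 0; i < ninputs; i++)
    {
      uint32_t j;

      if (inputs[i].parent_name == NULL)
        {
          parents[i] = i;
          continue;
        }

      for (j = 0; j < ninputs; j++)
        if (strcmp (inputs[j].name, inputs[i].parent_name) == 0)
          break;

      if (j == ninputs)
        return -1;              /* Child without an attached parent.  */
      parents[i] = j;
    }
  return 0;
}
```

Keeping the two arrays index-aligned means the consumer never has to search for a parent: it reads `parents[i]` and indexes straight back into the inputs.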
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) needs nothing more than
renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
          goto err_inputs;
        }

      if ((outputs = ctf_dedup_emit (out, inputs, ninputs, parents,
                                     &noutputs, 1)) == NULL)
        {
          ctf_set_errno (fp, ctf_errno (out));
          ctf_err_warn (fp, 0, 0, _("CU-mapped deduplicating link type emission "
                                    "failed for %s"), out_name);
          goto err_inputs;
        }
      if (!ctf_assert (fp, noutputs == 1))
libctf: minor error-handling fixes
A transient bug in the preceding change (fixed before commit) exposed a
new failure, of ld/testsuite/ld-ctf/diag-parname.d. This attempts to
ensure that if we link a dict with child type IDs but no attached
parent, we get a suitable ECTF_NOPARENT error. This was happening
before this commit, but only by chance, because ctf_variable_iter and
ctf_variable_next check to see if the dict they're passed is a child
dict without an associated parent. We forgot error-checking on the
ctf_variable_next call, and as a result this was concealed -- and
looking for the problem exposed a new bug.
If any of the lookups beneath ctf_dedup_hash_type fail, the CTF link
does *not* fail, but acts quite bizarrely, skipping the type but
emitting an error to the CTF error/warning log -- so the linker will
report an error, emit a partial CTF dict missing some types, and exit
with exitcode 0 as if nothing went wrong. Since ctf_dedup_hash_type is
never expected to fail in normal operation, this is surely wrong:
failures at emission time do not emit partial CTF dicts, so failures
at hashing time should not either.
So propagate the error back up.
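The difference between skipping a failed type and propagating the failure can be shown with a toy loop. This is a sketch under stated assumptions, not the code in ctf-dedup.c: `toy_hash_type` and `toy_hash_all` are hypothetical names, and "negative IDs fail" is an arbitrary rule chosen so the example has something to fail on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-type hashing step: returns 0 on success or a
   nonzero error code.  Negative IDs fail in this sketch.  */
static int
toy_hash_type (int type_id)
{
  return type_id < 0 ? 1 : 0;
}

/* Hash every type in TYPES, stopping at the first failure and
   returning its error code instead of skipping the type.  *nhashed
   counts the types hashed before any failure.  Returns 0 if all
   types were hashed.  */
static int
toy_hash_all (const int *types, size_t ntypes, size_t *nhashed)
{
  assert (nhashed != NULL);

  *nhashed = 0;
  for (size_t i = 0; i < ntypes; i++)
    {
      int err = toy_hash_type (types[i]);

      if (err != 0)
        return err;             /* Propagate: no partial output.  */
      (*nhashed)++;
    }
  return 0;
}
```

With the early return, the caller sees a nonzero result and can abandon the link, rather than receiving a success code alongside an incomplete set of types.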
Also fix a couple of smaller bugs where we fail to properly free things
and/or propagate error codes on various rare link-time errors and
out-of-memory conditions.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-dedup.c (ctf_dedup): Pass on errors from ctf_dedup_hash_type.
Call ctf_dedup_fini properly on other errors.
(ctf_dedup_emit_type): Set the errno on dynhash insertion failure.
* ctf-link.c (ctf_link_deduplicating_per_cu): Close outputs beyond
output 0 when asserting because >1 output is found.
(ctf_link_deduplicating): Likewise, when asserting because the
shared output is not the same as the passed-in fp.
2021-03-02 23:10:05 +08:00
        {
          size_t j;
          for (j = 1; j < noutputs; j++)
            ctf_dict_close (outputs[j]);
          goto err_inputs_outputs;
        }
      if (!(fp->ctf_link_flags & CTF_LINK_OMIT_VARIABLES_SECTION)
          && ctf_link_deduplicating_variables (out, inputs, ninputs, 1) < 0)
        {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
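The split between the two streams can be sketched as follows. This is a toy model under stated assumptions, not the ctf_err_warn implementation: toy_errmsg and toy_err_warn are hypothetical names, and the "text (errmsg)" layout of the debug line is assumed for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical error table standing in for the library's own.  */
const char *
toy_errmsg (int err)
{
  return err == 1 ? "The parent CTF dictionary is unavailable"
                  : "Unknown error";
}

/* Format one report twice.  USERBUF (the error/warning stream) gets only
   TEXT; DEBUGBUF (the debug stream) also gets the message for ERR when
   ERR is nonzero.  */
void
toy_err_warn (char *userbuf, size_t userlen, char *debugbuf,
              size_t debuglen, int err, const char *text)
{
  snprintf (userbuf, userlen, "%s", text);
  if (err != 0)
    snprintf (debugbuf, debuglen, "%s (%s)", text, toy_errmsg (err));
  else
    snprintf (debugbuf, debuglen, "%s", text);
}
```

With this shape, a chain of three failing calls produces three short lines in the error stream, and the caller appends the errno message once when it reports the failure.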
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
          ctf_set_errno (fp, ctf_errno (out));
          ctf_err_warn (fp, 0, 0, _("CU-mapped deduplicating link variable "
                                    "emission failed for %s"), out_name);
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, what this does is traverse each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
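The "explode archives into an inputs array plus a parents array" step can be sketched like this. It is a toy model, not the code in ctf-link.c: toy_archive_t and toy_explode are hypothetical, dicts are modelled as plain integer IDs, and the convention that a parent's entry in parents names itself is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one input archive.  */
typedef struct toy_archive
{
  int parent;           /* ID of the parent dict in this archive.  */
  const int *children;  /* IDs of its child dicts, if any.  */
  size_t nchildren;
} toy_archive_t;

/* Flatten NARCS archives into INPUTS, recording in PARENTS[i] the index
   in INPUTS of the parent of INPUTS[i].  The caller must size both
   arrays to hold every parent and child.  Returns the number of
   entries written.  */
size_t
toy_explode (const toy_archive_t *arcs, size_t narcs,
             int *inputs, uint32_t *parents)
{
  size_t n = 0;

  for (size_t i = 0; i < narcs; i++)
    {
      size_t parent_i = n;

      inputs[n] = arcs[i].parent;
      parents[n++] = (uint32_t) parent_i;  /* Assumed: parent names itself.  */

      for (size_t j = 0; j < arcs[i].nchildren; j++)
        {
          inputs[n] = arcs[i].children[j];
          parents[n++] = (uint32_t) parent_i;
        }
    }
  return n;
}
```

Two parallel arrays keep the deduplicator's view flat: it never needs to know which archive a dict came from, only which input is its parent.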
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
          goto err_inputs_outputs;
        }
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
  ctf_dedup_fini (out, outputs, noutputs);
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
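A sketch of how a reader might treat the new-format function info section, assuming the rules stated above: the section is an array of uint32_t type IDs, each of which must be of function kind, and the whole section is disregarded when the header flag is unset. The names and numeric values below (TOY_F_NEWFUNCINFO, TOY_K_FUNCTION, toy_func_type) are hypothetical stand-ins, and a plain kinds[] array stands in for the type section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; see ctf.h for the real ones.  */
#define TOY_F_NEWFUNCINFO 0x2
#define TOY_K_FUNCTION 5

/* Return the type ID of function symbol SYMIDX, or 0 if unavailable.
   FUNCINFO is the section viewed as an array of NFUNCS type IDs;
   KINDS[id] gives the kind of type ID, for IDs below NTYPES.  */
uint32_t
toy_func_type (unsigned flags, const uint32_t *funcinfo, size_t nfuncs,
               const int *kinds, size_t ntypes, size_t symidx)
{
  uint32_t id;

  if (!(flags & TOY_F_NEWFUNCINFO))
    return 0;           /* Old-format section: disregard it.  */
  if (symidx >= nfuncs)
    return 0;
  id = funcinfo[symidx];
  if (id >= ntypes || kinds[id] != TOY_K_FUNCTION)
    return 0;           /* Out of range, or not a function type.  */
  return id;
}
```

Since each entry is just a type ID, two symbols with the same prototype can share one type, which is what makes deduplication possible.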
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
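The indexed lookup described above can be sketched as a sort-on-demand followed by a binary search. This is a simplified model, not the code in ctf-lookup.c: the real index holds strtab offsets alongside a separate array of type IDs, whereas this sketch pairs each name with its type in one hypothetical toy_sym_t, and the sorted flag stands in for the header flag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: a symbol name and the ID of its type.  */
typedef struct toy_sym
{
  const char *name;
  uint32_t type;
} toy_sym_t;

static int
toy_sym_cmp (const void *a, const void *b)
{
  return strcmp (((const toy_sym_t *) a)->name,
                 ((const toy_sym_t *) b)->name);
}

/* Look NAME up in the index, sorting it first if *SORTED is zero (as
   for an index that arrived unsorted from the compiler).  Returns the
   type ID, or 0 if NAME is not in the index.  */
uint32_t
toy_lookup_indexed (toy_sym_t *syms, size_t nsyms, int *sorted,
                    const char *name)
{
  toy_sym_t key = { name, 0 };
  toy_sym_t *hit;

  if (!*sorted)
    {
      qsort (syms, nsyms, sizeof (toy_sym_t), toy_sym_cmp);
      *sorted = 1;
    }
  hit = bsearch (&key, syms, nsyms, sizeof (toy_sym_t), toy_sym_cmp);
  return hit != NULL ? hit->type : 0;
}
```

The sort cost is paid at most once per section; every later lookup is a plain bsearch.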
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
  /* For now, we omit symbol section linking for CU-mapped links, until it
     is clear how to unify the symbol table across such links. (Perhaps we
     should emit an unconditionally indexed symtab, like the compiler
     does.) */
      if (ctf_link_deduplicating_close_inputs (fp, in, inputs, ninputs) < 0)
        {
          free (inputs);
          free (parents);
          goto err_outputs;
        }
      free (inputs);
      free (parents);

      /* Splice any errors or warnings created during this link back into the
         dict that the caller knows about. */
      ctf_list_splice (&fp->ctf_errs_warnings, &outputs[0]->ctf_errs_warnings);

      /* This output now becomes an input to the next link phase, with a name
         equal to the CU name. We have to wrap it in an archive wrapper
         first. */

      if ((in_arc = ctf_new_archive_internal (0, 0, NULL, outputs[0], NULL,
                                              NULL, &err)) == NULL)
        {
          ctf_set_errno (fp, err);
          goto err_outputs;
        }

      if (ctf_link_add_ctf_internal (fp, in_arc, NULL,
                                     ctf_cuname (outputs[0])) < 0)
        {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
|
|
|
ctf_err_warn (fp, 0, 0, _("cannot add intermediate files to link"));
|
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, what this does is traverse each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
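The flattening step described above can be sketched as follows, under
stated assumptions: flat_dict_t, flat_archive_t and flat_explode are
hypothetical stand-ins rather than libctf types, member 0 of each archive
is assumed to be the parent of the remaining members, and a parent is
recorded as its own parent index. The real code may use different
conventions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical dict: only the parent link matters here.  */
typedef struct flat_dict
{
  const char *name;
  struct flat_dict *parent;     /* NULL for a parent dict.  */
} flat_dict_t;

/* Hypothetical archive.  By assumption in this sketch, member 0 is the
   parent of every other member.  */
typedef struct flat_archive
{
  flat_dict_t **members;
  size_t nmembers;
} flat_archive_t;

/* Explode ARCS into one newly allocated array of dicts.  *PARENTS_OUT
   receives a parallel array giving, for each input, the index of its
   parent in the returned array (a parent gets its own index).  Returns
   NULL on allocation failure.  The caller frees both arrays.  */
static flat_dict_t **
flat_explode (flat_archive_t *arcs, size_t narcs,
              uint32_t **parents_out, size_t *ninputs_out)
{
  size_t total = 0, n = 0;
  flat_dict_t **inputs;
  uint32_t *parents;

  for (size_t a = 0; a < narcs; a++)
    total += arcs[a].nmembers;

  inputs = calloc (total ? total : 1, sizeof (*inputs));
  parents = calloc (total ? total : 1, sizeof (*parents));
  if (inputs == NULL || parents == NULL)
    {
      free (inputs);
      free (parents);
      return NULL;
    }

  for (size_t a = 0; a < narcs; a++)
    {
      size_t parent_i = n;      /* Where this archive's parent lands.  */

      for (size_t m = 0; m < arcs[a].nmembers; m++)
        {
          flat_dict_t *d = arcs[a].members[m];

          /* Tie each child to its parent before handing it over.  */
          if (m > 0)
            d->parent = arcs[a].members[0];

          inputs[n] = d;
          parents[n] = (uint32_t) parent_i;
          n++;
        }
    }

  assert (n == total);
  *parents_out = parents;
  *ninputs_out = n;
  return inputs;
}
```

The two arrays are what a deduplicator of this shape would consume: the
inputs to hash and merge, and enough parent information to resolve type
references that cross from a child into its parent.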
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
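The second optimization amounts to a cheap test per CU-mapped output. A
minimal sketch, assuming a hypothetical cu_output_t record that already
knows how many inputs map to it and whether any of them has child dicts
(neither the type nor the function names come from libctf):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of one CU-mapped output.  */
typedef struct cu_output
{
  const char *name;             /* Output CU name.  */
  size_t ninputs;               /* Input CUs mapped onto this output.  */
  int has_children;             /* Nonzero if any input has child dicts.  */
} cu_output_t;

/* Return nonzero if this output needs its own deduplicating link.  A
   lone input with no child dicts only needs renaming, so the CU-mapped
   link phase can be skipped for it.  */
static int
cu_output_needs_link (const cu_output_t *out)
{
  assert (out != NULL);
  return out->ninputs > 1 || out->has_children;
}

/* Count how many per-CU link phases would actually run.  */
static size_t
cu_outputs_count_links (const cu_output_t *outs, size_t nouts)
{
  size_t n = 0;

  for (size_t i = 0; i < nouts; i++)
    if (cu_output_needs_link (&outs[i]))
      n++;
  return n;
}
```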
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
|
|
|
goto err_outputs;
|
|
|
|
}
|
|
|
|
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
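The "rename, but keep a compatibility function" pattern can be sketched
like this. All names here (compat_dict_t, compat_file_t and the functions)
are hypothetical, and the old type name is kept as a plain typedef alias
for brevity, which is an assumption of the sketch and may not match how
ctf-api.h itself retains ctf_file_t.

```c
#include <assert.h>
#include <stdlib.h>

/* New name for the reference-counted object.  */
typedef struct compat_dict
{
  int refcnt;
} compat_dict_t;

/* Old name, retained so that existing callers keep compiling.  */
typedef struct compat_dict compat_file_t;

/* Allocate a dict holding one reference, or return NULL on failure.  */
static compat_dict_t *
compat_dict_new (void)
{
  compat_dict_t *fp = calloc (1, sizeof (*fp));

  if (fp != NULL)
    fp->refcnt = 1;
  return fp;
}

/* New entry point: drop one reference, freeing on the last one.  */
static void
compat_dict_close (compat_dict_t *fp)
{
  if (fp == NULL)
    return;
  assert (fp->refcnt > 0);
  if (--fp->refcnt == 0)
    free (fp);
}

/* Old entry point: a trivial wrapper, so the old symbol stays
   available and behaves exactly like the new one.  */
static void
compat_file_close (compat_file_t *fp)
{
  compat_dict_close (fp);
}
```

Keeping the old function as a real symbol rather than a macro is what
preserves the ABI: binaries linked against the old name still resolve it.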
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
|
|
|
ctf_dict_close (out);
|
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
|
|
|
free (outputs);
|
|
|
|
continue;
|
|
|
|
|
|
|
|
err_inputs_outputs:
|
|
|
|
ctf_list_splice (&fp->ctf_errs_warnings, &outputs[0]->ctf_errs_warnings);
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
2020-11-20 21:34:04 +08:00
|
|
|
ctf_dict_close (outputs[0]);
|
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
|
|
|
free (outputs);
|
|
|
|
err_inputs:
|
|
|
|
ctf_link_deduplicating_close_inputs (fp, in, inputs, ninputs);
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
2020-11-20 21:34:04 +08:00
|
|
|
ctf_dict_close (out);
|
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
      free (inputs);
      free (parents);
    err_open_inputs:
      ctf_next_destroy (i);
      return -1;

    err_outputs:
      ctf_list_splice (&fp->ctf_errs_warnings, &outputs[0]->ctf_errs_warnings);
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
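The compatibility pattern described above can be sketched as follows. This is
a toy model, not the real libctf declarations: every toy_ name and the
refcount field are assumptions made for illustration, and the plain typedef
alias used here is a simplification that may not match how the real header
keeps ctf_file_t (the message above says real users see pointer-compatibility
warnings, which this toy does not reproduce).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the renamed type; not the real libctf struct.  */
struct toy_dict { int refcnt; };
typedef struct toy_dict toy_dict_t;
/* Compatibility alias: old code naming toy_file_t still compiles.  */
typedef struct toy_dict toy_file_t;

/* Create a toy dict holding one reference, or NULL on allocation failure.  */
toy_dict_t *
toy_dict_new (void)
{
  toy_dict_t *fp = calloc (1, sizeof (*fp));
  if (fp != NULL)
    fp->refcnt = 1;
  return fp;
}

/* New name: drop one reference, freeing on the last.  Returns the
   number of references left so callers can observe the effect.  */
int
toy_dict_close (toy_dict_t *fp)
{
  int left;

  if (fp == NULL)
    return 0;
  assert (fp->refcnt > 0);
  left = --fp->refcnt;
  if (left == 0)
    free (fp);
  return left;
}

/* Old name, kept as a trivial wrapper for compatibility.  */
int
toy_file_close (toy_file_t *fp)
{
  return toy_dict_close (fp);
}
```

Because the old entry point only forwards to the new one, both names keep
working against the same object, which is the property the rename relies on.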
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
      ctf_dict_close (outputs[0]);
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, what this does is traverse each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
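The flattening step described above (explode every archive into one inputs
array, with a parallel parents array) can be sketched with a toy model. None
of these names exist in libctf: the toy_ structs are hypothetical, and the
convention that a parentless input lists itself as its own parent is an
assumption of this sketch, not something the message states.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of one archive member: its name, plus the index of its
   parent within the same archive, or -1 if it has no parent.  */
struct toy_member { const char *name; int parent_in_archive; };
struct toy_archive { const struct toy_member *members; size_t nmembers; };

/* Flatten all archives into INPUTS, recording in PARENTS[i] the index
   (within INPUTS) of input i's parent, or i itself if it has none.
   Returns the number of inputs written, or (size_t) -1 if CAP entries
   are not enough.  */
size_t
toy_explode_archives (const struct toy_archive *arcs, size_t narcs,
                      const struct toy_member **inputs, size_t *parents,
                      size_t cap)
{
  size_t n = 0;

  for (size_t a = 0; a < narcs; a++)
    {
      size_t base = n;		/* Where this archive's members start.  */

      if (arcs[a].nmembers > cap - n)
        return (size_t) -1;

      for (size_t m = 0; m < arcs[a].nmembers; m++, n++)
        {
          int p = arcs[a].members[m].parent_in_archive;

          assert (p < (int) arcs[a].nmembers);
          inputs[n] = &arcs[a].members[m];
          parents[n] = (p < 0) ? n : base + (size_t) p;
        }
    }
  return n;
}
```

Keeping parents as indices into the same flat array means a deduplicator in
this style needs no knowledge of archives at all, only of the two arrays.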
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
      free (outputs);
      ctf_next_destroy (i);
      return -1; /* Errno is set for us. */
    }
  if (err != ECTF_NEXT_END)
    {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
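The iteration rule described above (report end-of-iteration through the err
pointer when one is given, fall back to the dict's own error code only when it
is not, and read from a separate open-errors list when there is no dict) can
be modelled in a few lines. This is a toy, not the libctf implementation: the
toy_ names, the error constants and the array-based message lists are all
assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_OK = 0, TOY_NEXT_END = 1 };	/* Hypothetical error codes.  */

/* Toy dict: an error slot plus the messages recorded against it.  */
struct toy_dict { int err; const char **msgs; size_t nmsgs; };

/* Messages recorded with no dict at all (e.g. at open time).  */
static const char *toy_open_errors[] = { "open failed: bad magic" };

/* Return the next message, advancing *POS.  With FP == NULL, iterate
   the open-error list instead.  At the end, return NULL and report
   TOY_NEXT_END via ERRP if given, else via FP->err: FP->err is left
   untouched whenever ERRP is non-NULL.  */
const char *
toy_errwarning_next (struct toy_dict *fp, size_t *pos, int *errp)
{
  const char **list = fp ? fp->msgs : toy_open_errors;
  size_t n = fp ? fp->nmsgs
                : sizeof (toy_open_errors) / sizeof (toy_open_errors[0]);

  assert (pos != NULL);
  if (*pos < n)
    return list[(*pos)++];

  if (errp != NULL)
    *errp = TOY_NEXT_END;
  else if (fp != NULL)
    fp->err = TOY_NEXT_END;
  return NULL;
}
```

A caller that is already handling a failure passes an err pointer, so the
error code stored in the dict survives the walk over its messages.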
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
      ctf_err_warn (fp, 0, err, _("iteration error in CU-mapped deduplicating "
                                  "link"));
2020-06-06 05:57:06 +08:00
      return ctf_set_errno (fp, err);
    }

  return 0;
}
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
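The output-side naming rule described above (on a clash, append an
incrementing integer starting at 0) can be sketched like this. It is a
self-contained illustration, not the body of ctf_new_per_cu_name: the toy_
function names, the linear search over a plain array of taken names, and the
'#' separator are all assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if NAME is already among the N names in TAKEN.  */
static int
toy_name_taken (const char *const *taken, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (taken[i], name) == 0)
      return 1;
  return 0;
}

/* Write into BUF a name for CUNAME that does not clash with TAKEN:
   CUNAME itself if it is free, else CUNAME plus '#' and the first free
   integer counting up from 0.  The '#' separator is an assumption.
   Returns 0 on success, -1 if BUF is too small.  */
int
toy_new_per_cu_name (const char *const *taken, size_t n,
                     const char *cuname, char *buf, size_t buflen)
{
  int len;

  assert (cuname != NULL && buf != NULL);
  len = snprintf (buf, buflen, "%s", cuname);
  for (unsigned long suffix = 0; ; suffix++)
    {
      if (len < 0 || (size_t) len >= buflen)
        return -1;		/* Candidate name was truncated.  */
      if (!toy_name_taken (taken, n, buf))
        return 0;
      len = snprintf (buf, buflen, "%s#%lu", cuname, suffix);
    }
}
```

With "a.c" and "a.c#0" already taken, asking for "a.c" yields "a.c#1": the
names stay distinct, but they carry no information about where each input came
from.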
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
/* Empty all the ctf_link_outputs. */
static int
ctf_link_empty_outputs (ctf_dict_t *fp)
{
  ctf_next_t *i = NULL;
  void *v;
  int err;

  ctf_dynhash_empty (fp->ctf_link_outputs);

  while ((err = ctf_dynhash_next (fp->ctf_link_inputs, &i, NULL, &v)) == 0)
    {
      ctf_dict_t *in = (ctf_dict_t *) v;
      in->ctf_link_in_out = NULL;
    }
  if (err != ECTF_NEXT_END)
    {
      fp->ctf_flags &= ~LCTF_LINKING;
      ctf_err_warn (fp, 1, err, _("iteration error removing old outputs"));
2023-09-13 17:02:36 +08:00
      return ctf_set_errno (fp, err);
2022-06-11 00:05:50 +08:00
    }
  return 0;
}
2020-06-06 05:57:06 +08:00
/* Do a deduplicating link using the ctf-dedup machinery. */
static void
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
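The compatibility-wrapper pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not libctf's actual declarations: every `demo_` name is hypothetical, the type is assumed to be reference-counted, and the old name is a plain typedef alias, so this simplified version does not reproduce the pointer-compatibility warnings the commit mentions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the renamed dictionary type.  */
typedef struct demo_dict
{
  int refcnt;
} demo_dict_t;

/* Old name kept as an alias so existing callers still compile.  */
typedef demo_dict_t demo_file_t;

/* Counts how many dictionaries have actually been freed.  */
static int demo_frees;

demo_dict_t *
demo_dict_new (void)
{
  demo_dict_t *fp = calloc (1, sizeof (*fp));

  if (fp != NULL)
    fp->refcnt = 1;
  return fp;
}

/* New name: drop one reference, freeing on the last one.  */
void
demo_dict_close (demo_dict_t *fp)
{
  if (fp == NULL)
    return;
  if (--fp->refcnt == 0)
    {
      free (fp);
      demo_frees++;
    }
}

/* Old name: a trivial wrapper around the new function.  */
void
demo_file_close (demo_file_t *fp)
{
  demo_dict_close (fp);
}
```

Because the wrapper only forwards, callers of the old and new names share one implementation and the exported symbol set only grows.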
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_deduplicating (ctf_dict_t *fp)
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
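The flattening step described above can be sketched as follows. This is a toy model, not the code in ctf-link.c: all `demo_` names are hypothetical, member 0 of each archive is assumed to be the parent of the remaining members, and `DEMO_NO_PARENT` is an assumed sentinel for "this input has no parent".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed sentinel marking an input that has no parent.  */
#define DEMO_NO_PARENT UINT32_MAX

typedef struct demo_dict
{
  const char *name;
} demo_dict_t;

/* Assumption: members[0] is the parent, the rest are its children.  */
typedef struct demo_archive
{
  demo_dict_t *members;
  size_t nmembers;
} demo_archive_t;

/* Explode every archive into one flat inputs array, and build a
   matching parents array: parents[i] is the index in inputs of the
   parent of inputs[i].  Returns 0 on success, -1 on allocation
   failure.  The caller frees both arrays.  */
int
demo_flatten (demo_archive_t *arcs, size_t narcs,
              demo_dict_t ***inputs_out, uint32_t **parents_out,
              size_t *ninputs_out)
{
  size_t total = 0, n = 0, i, j;

  for (i = 0; i < narcs; i++)
    total += arcs[i].nmembers;

  demo_dict_t **inputs = calloc (total ? total : 1, sizeof (*inputs));
  uint32_t *parents = calloc (total ? total : 1, sizeof (*parents));

  if (inputs == NULL || parents == NULL)
    {
      free (inputs);
      free (parents);
      return -1;
    }

  for (i = 0; i < narcs; i++)
    {
      /* Index the parent of this archive will occupy.  */
      uint32_t parent_idx = (uint32_t) n;

      for (j = 0; j < arcs[i].nmembers; j++)
        {
          inputs[n] = &arcs[i].members[j];
          parents[n] = (j == 0) ? DEMO_NO_PARENT : parent_idx;
          n++;
        }
    }

  *inputs_out = inputs;
  *parents_out = parents;
  *ninputs_out = n;
  return 0;
}
```

Keeping the parent relation in a parallel index array means the consumer never has to look inside an archive again: the two arrays together carry everything it needs.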
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
{
  size_t i;
  ctf_dict_t **inputs, **outputs = NULL;
  ssize_t ninputs;
  uint32_t noutputs;
  uint32_t *parents;

  if (ctf_dedup_atoms_init (fp) < 0)
    {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
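The iteration rule described above can be sketched as follows. This is a toy model under stated assumptions, not libctf's implementation or signatures: all `demo_` names and the `DEMO_NEXT_END` value are hypothetical, messages are caller-owned rather than allocated, and the iterator simply pops from the front of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed end-of-iteration code for this sketch.  */
#define DEMO_NEXT_END 1000

typedef struct demo_msg
{
  const char *text;
  int is_warning;
  struct demo_msg *next;
} demo_msg_t;

typedef struct demo_dict
{
  demo_msg_t *errs;		/* Per-dict error/warning list.  */
  int err;			/* Last error code on this dict.  */
} demo_dict_t;

/* Errors raised while no dict exists yet (open-time errors).  */
static demo_msg_t *demo_open_errors;

/* Append MSG to FP's list, or to the open-errors list if FP is NULL.  */
void
demo_err_warn (demo_dict_t *fp, demo_msg_t *msg)
{
  demo_msg_t **tail = fp ? &fp->errs : &demo_open_errors;

  while (*tail != NULL)
    tail = &(*tail)->next;
  msg->next = NULL;
  *tail = msg;
}

/* Pop the next message from FP's list (or the open-errors list if FP
   is NULL).  At the end, report DEMO_NEXT_END via ERRP if given;
   only when ERRP is NULL is FP's own error code overwritten.  */
const char *
demo_errwarning_next (demo_dict_t *fp, int *is_warning, int *errp)
{
  demo_msg_t **head = fp ? &fp->errs : &demo_open_errors;
  demo_msg_t *msg = *head;

  if (msg == NULL)
    {
      if (errp != NULL)
        *errp = DEMO_NEXT_END;
      else if (fp != NULL)
        fp->err = DEMO_NEXT_END;
      return NULL;
    }

  *head = msg->next;
  if (is_warning != NULL)
    *is_warning = msg->is_warning;
  return msg->text;
}
```

Passing a separate error pointer keeps the dict's original failure code intact while the caller drains its messages, which is the usability point the commit message makes.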
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
      ctf_err_warn (fp, 0, 0, _("allocating CTF dedup atoms table"));
      return; /* Errno is set for us. */
    }

  if (fp->ctf_link_out_cu_mapping
      && (ctf_link_deduplicating_per_cu (fp) < 0))
    return; /* Errno is set for us. */

  if ((ninputs = ctf_link_deduplicating_count_inputs (fp, NULL, NULL)) < 0)
    return; /* Errno is set for us. */

  if ((inputs = ctf_link_deduplicating_open_inputs (fp, NULL, ninputs,
                                                    &parents)) == NULL)
    return; /* Errno is set for us. */

  if (ninputs == 1 && ctf_cuname (inputs[0]) != NULL)
    ctf_cuname_set (fp, ctf_cuname (inputs[0]));

  if (ctf_dedup (fp, inputs, ninputs, parents, 0) < 0)
    {
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
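The rule for when the error number's message is appended can be sketched as follows. This is a minimal illustration under stated assumptions, not the ctf_err_warn implementation: toy_format and toy_stream_t are hypothetical names, and strerror stands in for whatever libctf uses to turn an error number into text.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stream selector; not a libctf type.  */
typedef enum { TOY_STREAM_ERRWARN, TOY_STREAM_DEBUG } toy_stream_t;

/* Format MSG into BUF for STREAM.  The text for ERR is always appended
   on the debug stream, and on the error/warning stream only for
   warnings: errors leave it to the caller, which reports it once at
   failure time.  Returns what snprintf returns.  */
static int
toy_format (char *buf, size_t len, toy_stream_t stream, int is_warning,
            int err, const char *msg)
{
  assert (buf != NULL && msg != NULL);

  if (err != 0 && (stream == TOY_STREAM_DEBUG || is_warning))
    return snprintf (buf, len, "%s: %s", msg, strerror (err));
  return snprintf (buf, len, "%s", msg);
}
```

With this split, a chain of three errors leading up to one failure yields three short messages plus a single trailing explanation from the caller, rather than four copies of the same explanation.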
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
ctf_err_warn (fp, 0, 0, _("deduplication failed for %s"),
              ctf_link_input_name (fp));
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
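The "explode the archives into an inputs array plus a parents array" step can be sketched like this. It is a toy model under stated assumptions: the toy_* types are hypothetical stand-ins for archives and dicts, each toy archive holds exactly one parent and some children, and the TOY_NO_PARENT sentinel is this sketch's choice for marking parents and may not match what the deduplicator expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins: an "archive" is one parent dict plus some children.  */
typedef struct toy_dict { const char *name; } toy_dict_t;

typedef struct toy_archive
{
  toy_dict_t *parent;
  toy_dict_t **children;
  size_t nchildren;
} toy_archive_t;

/* Marks an input that has no parent (an assumption of this sketch).  */
#define TOY_NO_PARENT UINT32_MAX

/* Count the dicts in all archives, so the caller can size the arrays.  */
static size_t
toy_count_inputs (const toy_archive_t *arcs, size_t narcs)
{
  size_t i, n = 0;

  for (i = 0; i < narcs; i++)
    n += 1 + arcs[i].nchildren;
  return n;
}

/* Explode ARCS into INPUTS.  PARENTS[i] is the index in INPUTS of the
   parent of INPUTS[i], or TOY_NO_PARENT.  Returns the number of
   entries written.  */
static size_t
toy_flatten (const toy_archive_t *arcs, size_t narcs,
             toy_dict_t **inputs, uint32_t *parents)
{
  size_t i, j, n = 0;

  assert (inputs != NULL && parents != NULL);

  for (i = 0; i < narcs; i++)
    {
      size_t parent_idx = n;

      inputs[n] = arcs[i].parent;
      parents[n++] = TOY_NO_PARENT;

      for (j = 0; j < arcs[i].nchildren; j++)
        {
          inputs[n] = arcs[i].children[j];
          parents[n++] = (uint32_t) parent_idx;
        }
    }
  return n;
}
```

Keeping the parent relationship as an index into the same array means the deduplicator never has to look inside an archive: it sees only a flat list of dicts and a parallel list saying which entry each one depends on.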
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than rename the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
      goto err;
    }

  if ((outputs = ctf_dedup_emit (fp, inputs, ninputs, parents, &noutputs,
                                 0)) == NULL)
    {
ctf_err_warn (fp, 0, 0, _("deduplicating link type emission failed "
                          "for %s"), ctf_link_input_name (fp));
      goto err;
    }

  if (!ctf_assert (fp, outputs[0] == fp))
libctf: minor error-handling fixes
A transient bug in the preceding change (fixed before commit) exposed a
new failure, of ld/testsuite/ld-ctf/diag-parname.d. This attempts to
ensure that if we link a dict with child type IDs but no attached
parent, we get a suitable ECTF_NOPARENT error. This was happening
before this commit, but only by chance, because ctf_variable_iter and
ctf_variable_next check to see if the dict they're passed is a child
dict without an associated parent. We forgot error-checking on the
ctf_variable_next call, and as a result this was concealed -- and
looking for the problem exposed a new bug.
If any of the lookups beneath ctf_dedup_hash_type fail, the CTF link
does *not* fail, but acts quite bizarrely, skipping the type but
emitting an error to the CTF error/warning log -- so the linker will
report an error, emit a partial CTF dict missing some types, and exit
with exitcode 0 as if nothing went wrong. Since ctf_dedup_hash_type is
never expected to fail in normal operation, this is surely wrong:
failures at emission time do not emit partial CTF dicts, so failures
at hashing time should not either.
So propagate the error back up.
Also fix a couple of smaller bugs where we fail to properly free things
and/or propagate error codes on various rare link-time errors and
out-of-memory conditions.
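The difference between skipping a failed type and propagating the failure can be sketched as below. This is only an illustration: toy_hash_all, toy_hash_fn and TOY_ENOPARENT are hypothetical names, and the loop is far simpler than anything in ctf_dedup. The point is just that the first hashing failure stops the walk and its error code reaches the caller.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_ENOPARENT = 1 };   /* Hypothetical error code.  */

/* Hypothetical per-type hasher: returns 0 on success, else an error.  */
typedef int (*toy_hash_fn) (int type);

/* Hash every type in TYPES, stopping at the first failure and handing
   its code back through *ERRP rather than skipping the type.  Returns
   0 on success, -1 on failure.  *NHASHED counts the types hashed.  */
static int
toy_hash_all (const int *types, size_t ntypes, toy_hash_fn hash,
              size_t *nhashed, int *errp)
{
  size_t i;

  assert (hash != NULL && nhashed != NULL && errp != NULL);

  *nhashed = 0;
  for (i = 0; i < ntypes; i++)
    {
      int err = hash (types[i]);

      if (err != 0)
        {
          *errp = err;
          return -1;
        }
      (*nhashed)++;
    }
  return 0;
}

/* Example hasher: IDs in a made-up "child" range fail, as if no parent
   dict were attached.  */
static int
toy_hash_needs_parent (int type)
{
  return type >= 0x8000 ? TOY_ENOPARENT : 0;
}
```

A caller that gets -1 back would abandon the link and emit nothing, which matches what already happens for failures at emission time.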
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-dedup.c (ctf_dedup): Pass on errors from ctf_dedup_hash_type.
Call ctf_dedup_fini properly on other errors.
(ctf_dedup_emit_type): Set the errno on dynhash insertion failure.
* ctf-link.c (ctf_link_deduplicating_per_cu): Close outputs beyond
output 0 when asserting because >1 output is found.
(ctf_link_deduplicating): Likewise, when asserting because the
shared output is not the same as the passed-in fp.
2021-03-02 23:10:05 +08:00
    {
      for (i = 1; i < noutputs; i++)
        ctf_dict_close (outputs[i]);
      goto err;
    }
  for (i = 0; i < noutputs; i++)
    {
      char *dynname;

      /* We already have access to this one. Close the duplicate. */
      if (i == 0)
        {
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
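The rename-with-compatibility pattern can be sketched as follows. All toy_* names are hypothetical, and the typedef alias used for the old type name is an assumption of this sketch rather than a statement about how ctf-api.h keeps ctf_file_t around. The part taken from the description above is the shape of the functions: the new name does the work and the old name is a trivial wrapper, so the old symbol still exists for already-built callers.

```c
#include <assert.h>
#include <stdlib.h>

/* The opaque type under its new name.  */
typedef struct toy_dict toy_dict_t;
/* The old name, kept as an alias so that existing source compiles.  */
typedef struct toy_dict toy_file_t;

struct toy_dict
{
  int refcnt;
};

/* Create a dict holding one reference, or return NULL on failure.  */
toy_dict_t *
toy_dict_open (void)
{
  toy_dict_t *fp = calloc (1, sizeof (*fp));

  if (fp != NULL)
    fp->refcnt = 1;
  return fp;
}

/* New name: drop one reference and free the dict at zero.  */
void
toy_dict_close (toy_dict_t *fp)
{
  if (fp == NULL)
    return;
  assert (fp->refcnt > 0);
  if (--fp->refcnt == 0)
    free (fp);
}

/* Old name: a trivial wrapper, so the old symbol stays available.  */
void
toy_file_close (toy_file_t *fp)
{
  toy_dict_close (fp);
}
```

Because the wrapper is a real function and not a macro, objects compiled against the old name keep linking without being rebuilt.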
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_dict_close (outputs[0]);
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
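As a rough illustration of the inputs/parents arrangement described above, here is a minimal sketch. It is not the libctf code: struct toy_dict and build_parents are hypothetical names invented for the example, and recording a parentless dict as its own parent is an assumption of the sketch rather than something the commit states.

```c
#include <stddef.h>

/* Toy stand-in for a CTF dict: only the parent link matters here.  */
struct toy_dict
{
  const char *name;
  const struct toy_dict *parent;	/* NULL for a parent dict.  */
};

/* Fill PARENTS so that PARENTS[i] is the index in INPUTS of the parent of
   INPUTS[i].  A dict with no parent is recorded as its own parent (a
   convention chosen for this sketch).  Returns 0 on success, or -1 if
   some child's parent is not present in INPUTS.  */
static int
build_parents (const struct toy_dict *const *inputs, size_t ninputs,
	       size_t *parents)
{
  for (size_t i = 0; i < ninputs; i++)
    {
      size_t j;

      if (inputs[i]->parent == NULL)
	{
	  parents[i] = i;
	  continue;
	}

      /* Linear search for the parent among the flattened inputs.  */
      for (j = 0; j < ninputs; j++)
	if (inputs[j] == inputs[i]->parent)
	  break;

      if (j == ninputs)
	return -1;
      parents[i] = j;
    }
  return 0;
}
```

The point of the second array is that the deduplicator can find a child's parent by index without needing to know anything about the archives the dicts came from.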
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
|
|
|
continue;
|
|
|
|
}
|
|
|
|
|
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
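The output-time rule (append an incrementing integer, starting at 0, until the name no longer clashes) can be sketched as follows. This is an illustration under stated assumptions, not the body of ctf_new_per_cu_name: name_taken and unique_cu_name are hypothetical helpers, a plain string array stands in for the ctf_link_outputs hash, and the '#' separator is assumed for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a lookup in the ctf_link_outputs hash.  */
static int
name_taken (const char *const *names, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (names[i], name) == 0)
      return 1;
  return 0;
}

/* Return a malloc'd name that does not clash with any of NAMES: CUNAME
   itself if it is free, otherwise CUNAME with "#0", "#1", ... appended.
   The '#' separator is an assumption made for this sketch.  Returns NULL
   on allocation failure; the caller frees the result.  */
static char *
unique_cu_name (const char *const *names, size_t n, const char *cuname)
{
  size_t len = strlen (cuname) + 32;	/* Room for '#' and the digits.  */
  char *out = malloc (len);

  if (out == NULL)
    return NULL;

  snprintf (out, len, "%s", cuname);
  for (unsigned long suffix = 0; name_taken (names, n, out); suffix++)
    snprintf (out, len, "%s#%lu", cuname, suffix);
  return out;
}
```

With existing outputs "a.c" and "a.c#0", asking for "a.c" again yields "a.c#1", while an unused name such as "b.c" is returned unchanged.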
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
|
|
|
if ((dynname = ctf_new_per_cu_name (fp, ctf_cuname (outputs[i]))) == NULL)
|
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
|
|
|
goto oom_one_output;
|
|
|
|
|
|
|
|
if (ctf_dynhash_insert (fp->ctf_link_outputs, dynname, outputs[i]) < 0)
|
|
|
|
goto oom_one_output;
|
|
|
|
|
|
|
|
continue;
|
|
|
|
|
|
|
|
oom_one_output:
|
|
|
|
ctf_set_errno (fp, ENOMEM);
|
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
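The iterator rule above can be modelled in a few lines. This is a toy model only: struct toy_fp, toy_errwarning_next, TOY_NEXT_END and the open_msgs contents are all invented for the sketch and are not the libctf definitions. It shows just where the end-of-iteration code ends up in each case.

```c
#include <stddef.h>

#define TOY_NEXT_END 1		/* Stand-in for ECTF_NEXT_END.  */

/* Toy stand-in for a dict carrying an error code and pending messages.  */
struct toy_fp
{
  int err;			/* Last error recorded on this dict.  */
  const char *const *msgs;	/* Pending error/warning messages.  */
  size_t nmsgs;
};

/* Messages emitted with no dict to attach them to ("open errors").  */
static const char *const open_msgs[] = { "cannot open archive", "bad magic" };
static const size_t n_open_msgs = sizeof open_msgs / sizeof open_msgs[0];

/* Return the next message, or NULL at the end of iteration.  POS is the
   caller's cursor.  End-of-iteration is reported through ERRP if given;
   FP->err is overwritten only when ERRP is NULL and FP is not.  */
static const char *
toy_errwarning_next (struct toy_fp *fp, size_t *pos, int *errp)
{
  const char *const *msgs = fp != NULL ? fp->msgs : open_msgs;
  size_t n = fp != NULL ? fp->nmsgs : n_open_msgs;

  if (*pos < n)
    return msgs[(*pos)++];

  if (errp != NULL)
    *errp = TOY_NEXT_END;
  else if (fp != NULL)
    fp->err = TOY_NEXT_END;
  return NULL;
}
```

Passing an errp while iterating over a dict's messages leaves the dict's own error code intact, which is what a caller reporting a failure usually wants.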
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
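One way to picture the split between the two streams is the sketch below. It is an interpretation of the description above, not the ctf_err_warn implementation: toy_format and its parameters are hypothetical, and ERRTEXT stands in for whatever text the error-number lookup would produce.

```c
#include <stdio.h>

/* Format MSG into BUF for one of the two streams.  The text for the error
   number is appended for the debug stream, and for warnings, but not for
   errors in the error/warning stream, since the caller is expected to
   report that itself when the failure propagates up.  ERRTEXT may be NULL
   if there is no associated error number.  */
static void
toy_format (char *buf, size_t len, int for_debug, int is_warning,
	    const char *msg, const char *errtext)
{
  if (errtext != NULL && (for_debug || is_warning))
    snprintf (buf, len, "%s: %s", msg, errtext);
  else
    snprintf (buf, len, "%s", msg);
}
```

Under this scheme a chain of three errors followed by the caller's own report mentions the underlying cause once rather than four times.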
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
|
|
|
ctf_err_warn (fp, 0, 0, _("out of memory allocating link outputs"));
|
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
|
|
|
free (dynname);
|
|
|
|
|
|
|
|
for (; i < noutputs; i++)
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
2020-11-20 21:34:04 +08:00
|
|
|
ctf_dict_close (outputs[i]);
|
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
|
|
|
goto err;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!(fp->ctf_link_flags & CTF_LINK_OMIT_VARIABLES_SECTION)
|
|
|
|
&& ctf_link_deduplicating_variables (fp, inputs, ninputs, 0) < 0)
|
|
|
|
{
|
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it produces ridiculous chatter on
failure, which is likely to end up on stderr (trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
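A rough sketch of that split between the two streams, under stated assumptions: toy_err_warn is a hypothetical helper that formats into caller-supplied buffers, and the errmsg string is passed in directly rather than derived from an error number as the real function does.

```c
#include <assert.h>
#include <stdio.h>

/* Format one message twice: once for the user-visible error/warning
   stream, once for the debug stream.  The debug copy always carries the
   errmsg; the user-visible copy carries it only for warnings, because
   errors are expected to be reported again by the caller on failure.  */
void toy_err_warn(int is_warning, const char *msg, const char *errmsg,
                  char *user, size_t user_len,
                  char *debug, size_t debug_len)
{
    assert(msg != NULL && user != NULL && debug != NULL);

    if (errmsg != NULL)
        snprintf(debug, debug_len, "%s: %s", msg, errmsg);
    else
        snprintf(debug, debug_len, "%s", msg);

    if (is_warning && errmsg != NULL)
        snprintf(user, user_len, "%s: %s", msg, errmsg);
    else
        snprintf(user, user_len, "%s", msg);
}
```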
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
ctf_err_warn (fp, 0, 0, _("deduplicating link variable emission failed for "
"%s"), ctf_link_input_name (fp));
2020-11-20 21:34:04 +08:00
goto err_clean_outputs;
libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
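The flattening step can be sketched as follows. This is a toy model, not the code in ctf-link.c: the toy_ types are hypothetical, each archive is assumed to hold exactly one parent member, and a parent is assumed to be recorded as its own parent in the parents array.

```c
#include <assert.h>
#include <stddef.h>

struct toy_member { const char *name; int is_child; };
struct toy_archive { const struct toy_member *members; size_t nmembers; };

/* Explode every archive into one flat inputs array, and build a
   parallel parents array in which parents[i] is the index in inputs of
   the parent of inputs[i].  Returns the number of inputs written.  */
size_t toy_flatten(const struct toy_archive *arcs, size_t narcs,
                   const struct toy_member **inputs, size_t *parents,
                   size_t capacity)
{
    size_t n = 0;

    for (size_t a = 0; a < narcs; a++) {
        size_t parent_slot = n;   /* where this archive's parent lands */
        int found = 0;

        /* Emit the parent first, so that children can refer to it.  */
        for (size_t m = 0; m < arcs[a].nmembers && !found; m++) {
            if (!arcs[a].members[m].is_child) {
                assert(n < capacity);
                inputs[n] = &arcs[a].members[m];
                parents[n] = n;
                n++;
                found = 1;
            }
        }
        assert(found);

        /* Then the children, each tied back to that parent.  */
        for (size_t m = 0; m < arcs[a].nmembers; m++) {
            if (arcs[a].members[m].is_child) {
                assert(n < capacity);
                inputs[n] = &arcs[a].members[m];
                parents[n] = parent_slot;
                n++;
            }
        }
    }
    return n;
}
```

The two arrays are what a deduplicator working on flat arrays of dicts needs: it never has to know which archive a dict came from, only which entry is its parent.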
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) needs to do nothing
other than rename the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
}
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
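As a minimal sketch of reading such a section: the function below treats the function info section as a plain array of type IDs, one per function symbol. The toy_ names are hypothetical, type IDs are assumed to be 1-based with 0 meaning "no type recorded", and the kind code is an illustrative constant rather than something taken from ctf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_K_FUNCTION 5   /* hypothetical kind code for function types */

struct toy_type { int kind; };

/* Return the type ID recorded for the nth function symbol, or 0 if
   there is none or the entry does not name a function type.  */
uint32_t toy_func_type(const uint32_t *funcinfo, size_t nfuncs,
                       const struct toy_type *types, size_t ntypes,
                       size_t n)
{
    assert(funcinfo != NULL || nfuncs == 0);

    if (n >= nfuncs)
        return 0;

    uint32_t id = funcinfo[n];
    if (id == 0 || id > ntypes)
        return 0;                       /* pad, or out of range */
    if (types[id - 1].kind != TOY_K_FUNCTION)
        return 0;                       /* must be a function type */
    return id;
}
```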
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
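The compatibility behaviour can be modelled in a few lines. The bit values and toy_ names here are assumptions for illustration only, and the old reader is assumed to reject any header flag bit it does not know about.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_F_COMPRESS    0x1u  /* a flag old readers already understand */
#define TOY_F_NEWFUNCINFO 0x2u  /* hypothetical value for the new flag */

static_assert((TOY_F_COMPRESS & TOY_F_NEWFUNCINFO) == 0,
              "flag bits must not overlap");

/* Old reader: refuses to open anything carrying an unknown flag bit.  */
int toy_old_reader_accepts(uint32_t flags)
{
    return (flags & ~TOY_F_COMPRESS) == 0;
}

/* New reader: opens either way, but consults the function info section
   only when the flag says it is in the new format.  */
int toy_new_reader_uses_funcinfo(uint32_t flags)
{
    return (flags & TOY_F_NEWFUNCINFO) != 0;
}
```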
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in an archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
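A sketch of an indexed lookup with sort-on-demand, under simplifying assumptions: the real format keeps the index and the type entries as separate parallel arrays, whereas this toy model pairs each name with its type in one struct, and the "sorted" flag is a plain int rather than a header bit. All toy_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct toy_idx_entry { const char *name; uint32_t type; };

static int toy_cmp_entry(const void *a, const void *b)
{
    const struct toy_idx_entry *ea = a;
    const struct toy_idx_entry *eb = b;
    return strcmp(ea->name, eb->name);
}

static int toy_cmp_key(const void *key, const void *elem)
{
    const struct toy_idx_entry *e = elem;
    return strcmp((const char *) key, e->name);
}

/* Look a symbol name up in an index.  If the producer did not mark the
   index as sorted, sort it once, then binary-search it.  Returns the
   type ID, or 0 if the name is not present.  */
uint32_t toy_lookup_indexed(struct toy_idx_entry *idx, size_t n,
                            int *sorted, const char *name)
{
    assert(sorted != NULL && name != NULL);

    if (!*sorted) {
        qsort(idx, n, sizeof *idx, toy_cmp_entry);
        *sorted = 1;
    }

    const struct toy_idx_entry *hit =
        bsearch(name, idx, n, sizeof *idx, toy_cmp_key);
    return hit != NULL ? hit->type : 0;
}
```

Sorting lazily on the reader side is what lets a producer emit entries in whatever order is convenient.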
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
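One way to picture this is a string lookup that records external strings as soon as they are added, so that they resolve immediately. The sketch below is a toy model: the toy_ names, the fixed-size table and the use of the top bit of a reference to select the external table are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_EXTERNAL_BIT 0x80000000u  /* assumed "external strtab" marker */

struct toy_ext_str { uint32_t offset; const char *str; };

struct toy_strtabs {
    const char *internal;        /* internal strtab contents */
    size_t internal_len;
    struct toy_ext_str ext[8];   /* offset -> string, filled at add time */
    size_t next;
};

/* Record an external string and its offset as soon as it is added.  */
int toy_str_add_external(struct toy_strtabs *s, const char *str,
                         uint32_t offset)
{
    assert(s != NULL && str != NULL);
    if (s->next == sizeof s->ext / sizeof s->ext[0])
        return -1;
    s->ext[s->next].offset = offset;
    s->ext[s->next].str = str;
    s->next++;
    return 0;
}

/* Resolve a string reference against either table.  */
const char *toy_strptr(const struct toy_strtabs *s, uint32_t ref)
{
    uint32_t off = ref & ~TOY_EXTERNAL_BIT;

    assert(s != NULL);
    if (ref & TOY_EXTERNAL_BIT) {
        for (size_t i = 0; i < s->next; i++)
            if (s->ext[i].offset == off)
                return s->ext[i].str;
        return NULL;
    }
    return off < s->internal_len ? s->internal + off : NULL;
}
```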
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
if (ctf_link_deduplicating_syms (fp, inputs, ninputs, 0) < 0)
{
ctf_err_warn (fp, 0, 0, _("deduplicating link symbol emission failed for "
"%s"), ctf_link_input_name (fp));
2020-11-20 21:34:04 +08:00
goto err_clean_outputs;
2020-11-20 21:34:04 +08:00
}
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales with the
number of input types, and unlike the hashing machinery (the only other
thing which scales that way), the type mapping machinery has not been much
optimized.
Now that the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
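
As a rough illustration of the idea only (this is not libctf's code): if the
deduplicator already holds a table keyed by type hash that yields the output
type ID, a mapping query can go from input type to hash to output ID directly,
with no second table to populate first. Every name below (toy_dedup_t,
toy_type_mapping and so on) is hypothetical, and the linear search stands in
for what would be a hash lookup in practice.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are libctf types.  */
typedef struct toy_emission
{
  const char *hash;		/* Hash of a type, valid in inputs and outputs.  */
  unsigned long out_id;		/* ID of that type in the output dict.  */
} toy_emission_t;

typedef struct toy_dedup
{
  const char *const *input_hashes; /* Type hashes, indexed by input type ID.  */
  size_t ninput_types;
  const toy_emission_t *emitted;   /* What the deduplicator already knows.  */
  size_t nemitted;
} toy_dedup_t;

/* Map an input type ID to an output type ID using only the tables the
   deduplicator already has.  Returns 0 if the type was never emitted.  */
static unsigned long
toy_type_mapping (const toy_dedup_t *d, size_t in_id)
{
  size_t i;

  assert (d != NULL);
  if (in_id >= d->ninput_types)
    return 0;

  for (i = 0; i < d->nemitted; i++)
    if (strcmp (d->emitted[i].hash, d->input_hashes[in_id]) == 0)
      return d->emitted[i].out_id;
  return 0;
}
```

The point of the sketch is that nothing is copied: the cost of a query is one
lookup in a structure that exists anyway, rather than a repopulation pass over
every input type.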
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
  ctf_dedup_fini (fp, outputs, noutputs);

libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).
The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.
In brief, this traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child. We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.
It also implements much of the CU-mapping side of things. The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them. This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.
The deduplicator machinery already committed does the best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it. This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link. Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
link with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it. Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).
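
The flattening step described above can be sketched as follows. This is a toy
model, not the real ctf_link_deduplicating_open_inputs: the toy_archive_t type
and both functions are hypothetical, dicts are represented by their names, and
recording a parentless dict as its own parent is an assumption made for the
sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one input archive: a parent dict plus children.  */
typedef struct toy_archive
{
  const char *parent;		/* Name of the parent dict.  */
  const char *const *children;	/* Names of child dicts, possibly none.  */
  size_t nchildren;
} toy_archive_t;

/* Count the dicts in all archives, so the caller can size the arrays.  */
static size_t
toy_count_inputs (const toy_archive_t *archives, size_t narchives)
{
  size_t i, n = 0;

  for (i = 0; i < narchives; i++)
    n += 1 + archives[i].nchildren;
  return n;
}

/* Explode the archives into INPUTS, and record in PARENTS[i] the index in
   INPUTS of the parent of INPUTS[i].  Both arrays must have room for
   toy_count_inputs entries.  Returns the number of entries written.  */
static size_t
toy_flatten_inputs (const toy_archive_t *archives, size_t narchives,
		    const char **inputs, size_t *parents)
{
  size_t i, j, n = 0;

  assert (archives != NULL && inputs != NULL && parents != NULL);
  for (i = 0; i < narchives; i++)
    {
      size_t parent_i = n;

      inputs[n] = archives[i].parent;
      parents[n++] = parent_i;	/* Assumption: a parent is its own parent.  */

      for (j = 0; j < archives[i].nchildren; j++)
	{
	  inputs[n] = archives[i].children[j];
	  parents[n++] = parent_i;
	}
    }
  return n;
}
```

With two archives, one holding a parent and two children and one holding a
parent alone, this yields four inputs with parents { 0, 0, 0, 3 }.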
include/
* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
deduplicator.
libctf/
* ctf-impl.h (ctf_list_splice): New.
* ctf-util.h (ctf_list_splice): Likewise.
* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
(ctf_link_sort_inputs): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating_variables): Likewise.
(ctf_link_deduplicating_per_cu): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Call it.
2020-06-06 05:57:06 +08:00
  /* Now close all the inputs, including per-CU intermediates. */

  if (ctf_link_deduplicating_close_inputs (fp, NULL, inputs, ninputs) < 0)
    return; /* errno is set for us. */

  ninputs = 0; /* Prevent double-close. */
  ctf_set_errno (fp, 0);

  /* Fall through. */

 err:
  for (i = 0; i < (size_t) ninputs; i++)
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
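
A compatibility alias of the kind mentioned above can be as small as a
forwarding function. The sketch below uses hypothetical toy_ names and a
made-up reference-counted struct; it only shows the shape of "old name calls
new name" and says nothing about how the real ctf_dict_close works.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an opaque dictionary type.  */
typedef struct toy_dict
{
  int refcnt;
} toy_dict_t;

/* The new, preferred name.  Closing a null dict is a no-op here.  */
static void
toy_dict_close (toy_dict_t *d)
{
  if (d == NULL)
    return;
  assert (d->refcnt > 0);
  d->refcnt--;
}

/* The old name, kept as a trivial wrapper so existing callers still link.  */
static void
toy_file_close (toy_dict_t *d)
{
  toy_dict_close (d);
}
```

Because the wrapper adds no behaviour of its own, callers using either name
observe exactly the same effect.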
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
    ctf_dict_close (inputs[i]);
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
  free (inputs);
  free (parents);
  free (outputs);
  return;
2020-11-20 21:34:04 +08:00

 err_clean_outputs:
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
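
The output-side naming rule (append an incrementing integer, starting at 0,
when a name clashes) might look like the sketch below. It is not the real
ctf_new_per_cu_name: the function names are hypothetical, the set of names
already in use is modelled as a plain array, and the '#' separator between the
name and the integer is an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return nonzero if NAME is already among the N names in USED.  */
static int
toy_name_in_use (const char *const *used, size_t n, const char *name)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (strcmp (used[i], name) == 0)
      return 1;
  return 0;
}

/* Write a name not present in USED into BUF: NAME itself if it is free,
   otherwise NAME followed by an incrementing integer starting at 0.  The
   '#' separator is an assumption.  Returns 0 on success, or -1 if BUF is
   too small to hold the result.  */
static int
toy_new_per_cu_name (const char *const *used, size_t n, const char *name,
		     char *buf, size_t bufsiz)
{
  unsigned long suffix = 0;
  int len;

  assert (name != NULL && buf != NULL);
  len = snprintf (buf, bufsiz, "%s", name);
  while (len >= 0 && (size_t) len < bufsiz && toy_name_in_use (used, n, buf))
    len = snprintf (buf, bufsiz, "%s#%lu", name, suffix++);

  return (len >= 0 && (size_t) len < bufsiz) ? 0 : -1;
}
```

The resulting names carry no meaning beyond being distinct, which matches the
commit's own assessment that the scheme is "not very helpful" but keeps every
dict represented.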
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to close off a couple of sources of problems: forbid
changing the active CU mappings once a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and make
multiple consecutive ctf_link's have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
  ctf_link_empty_outputs (fp);
2020-11-20 21:34:04 +08:00
  goto err;
libctf, link: tie in the deduplicating linker
2020-06-06 05:57:06 +08:00
}

libctf, include: remove the nondeduplicating CTF linker
The nondeduplicating CTF linker was kept around when the deduplicating
one was added so that people had something to fall back to in case the
deduplicating linker turned out to be buggy. It's now much more stable
than the nondeduplicating linker, as well as being much faster, using much
less memory and producing much better output. In addition, while
libctf has a linker flag to invoke the nondeduplicating linker, ld does
not expose it: the only way to turn it on within ld is an intentionally-
undocumented environment variable. So we can remove it without any ABI
or user-visibility concerns (the only thing we leave around is the
CTF_LINK_NONDEDUP flag, which can easily be interpreted as "deduplicate
less", though right now it does nothing).
This lets us remove a lot of complexity associated with tracking
filenames and CU names separately (something the deduplicating linker
never bothered with, since the cunames are always reliable and ld never
hands us useful filenames anyway).
The biggest lacuna left behind is the ctf_type_mapping machinery, which
slows down deduplicating links quite a lot. We can't just ditch it
because ctf_add_type uses it: removing the slowdown from the
deduplicating linker is a job for another commit.
include/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_SHARE_DUPLICATED): Note that this might
merely change how much deduplication is done.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_create_per_cu): Drop FILENAME now that it is
always identical to CUNAME.
(ctf_link_deduplicating_one_symtypetab): Adjust.
(ctf_link_one_type): Remove.
(ctf_link_one_input_archive_member): Likewise.
(ctf_link_close_one_input_archive): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link): No longer call it. Drop CTF_LINK_NONDEDUP path.
Improve header comment a bit (dicts, not files). Adjust
ctf_create_per_cu call.
(ctf_link_deduplicating_variables): Simplify.
(ctf_link_in_member_cb_arg_t) <cu_name>: Remove.
<in_input_cu_file>: Likewise.
<in_fp_parent>: Likewise.
<done_parent>: Likewise.
(ctf_link_one_variable): Turn uses of in_file_name to in_cuname.
2021-03-02 23:10:05 +08:00
/* Merge types and variable sections in all dicts added to the link together.
libctf: fix linking together multiple objects derived from the same source
2022-06-11 00:05:50 +08:00
   The result of any previous link is discarded. */
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
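
The calling sequence above can be pictured as a small state machine. The
sketch below is a toy model with hypothetical toy_ names and error codes, not
the libctf API: it only tracks which phase a link is in. Rejecting additions
once a link has happened is loosely modelled on the ECTF_LINKADDEDLATE error
named in the ChangeLog, and the exact ordering constraints are assumptions;
the real conditions may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Phases of a (toy) link, in the order the linker is expected to go.  */
typedef enum
{
  TOY_ADDING,			/* Inputs are still being added.  */
  TOY_LINKED,			/* Types have been merged.  */
  TOY_STRTAB_DONE,		/* ELF strtab strings have been seen.  */
  TOY_SYMS_DONE			/* Symbols have been shuffled.  */
} toy_phase_t;

typedef struct toy_link
{
  toy_phase_t phase;
  int ninputs;
} toy_link_t;

enum { TOY_OK = 0, TOY_ERR_ADDED_LATE = -1, TOY_ERR_ORDER = -2 };

/* Add one input.  Assumption: not allowed once a link has been done.  */
static int
toy_link_add (toy_link_t *l)
{
  assert (l != NULL);
  if (l->phase != TOY_ADDING)
    return TOY_ERR_ADDED_LATE;
  l->ninputs++;
  return TOY_OK;
}

/* Merge the inputs.  May be repeated until the strtab has been added.  */
static int
toy_link (toy_link_t *l)
{
  assert (l != NULL);
  if (l->phase != TOY_ADDING && l->phase != TOY_LINKED)
    return TOY_ERR_ORDER;
  l->phase = TOY_LINKED;
  return TOY_OK;
}

static int
toy_add_strtab (toy_link_t *l)
{
  assert (l != NULL);
  if (l->phase != TOY_LINKED)
    return TOY_ERR_ORDER;
  l->phase = TOY_STRTAB_DONE;
  return TOY_OK;
}

static int
toy_shuffle_syms (toy_link_t *l)
{
  assert (l != NULL);
  if (l->phase != TOY_STRTAB_DONE)
    return TOY_ERR_ORDER;
  l->phase = TOY_SYMS_DONE;
  return TOY_OK;
}

/* Writing out needs a completed link, whatever came after it.  */
static int
toy_write (const toy_link_t *l)
{
  assert (l != NULL);
  return l->phase == TOY_ADDING ? TOY_ERR_ORDER : TOY_OK;
}
```

In this model the inputs must stay open up to the last toy_link call, which
mirrors the lifetime rule stated above.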
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
|
|
|
int
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
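The "keeping compatibility aliases" pattern can be illustrated with a toy type. This is a sketch under stated assumptions: the demo_* names are hypothetical stand-ins, not the declarations in ctf-api.h, and it does not try to reproduce the pointer-compatibility warnings mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins, all hypothetical: not the real libctf declarations.  */
typedef struct demo_dict
{
  int refcnt;
} demo_dict_t;

/* The old type name, kept as an alias for the same struct.  */
typedef struct demo_dict demo_file_t;

static demo_dict_t *
demo_dict_open (void)
{
  demo_dict_t *fp = calloc (1, sizeof (*fp));

  if (fp != NULL)
    fp->refcnt = 1;
  return fp;
}

/* New name: drop one reference, freeing the dict on the last one.
   Returns the number of references remaining.  */
static int
demo_dict_close (demo_dict_t *fp)
{
  if (fp == NULL)
    return 0;

  assert (fp->refcnt > 0);
  if (--fp->refcnt > 0)
    return fp->refcnt;

  free (fp);
  return 0;
}

/* Old name: a trivial wrapper around the new one, for compatibility.  */
static int
demo_file_close (demo_file_t *fp)
{
  return demo_dict_close (fp);
}
```

Because the old entry point only forwards to the new one, callers of either name share one implementation and one symbol-level behaviour.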
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
|
|
|
ctf_link (ctf_dict_t *fp, int flags)
|
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
|
|
|
{
|
libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.
The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).
The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.
So rejig the ctf-link machinery to do that. Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link. This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it. (Nobody else should use it.)
Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.
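The split into a forward and an inverse mapping can be sketched with a flat table. All names here are hypothetical, and a linear array stands in for the hash tables libctf itself uses; the point is only that one set of (FROM, TO) pairs has to answer lookups in both directions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_MAPPINGS 16

/* Hypothetical toy table: one (FROM, TO) pair per slot.  */
struct demo_cu_mapping
{
  const char *from[DEMO_MAX_MAPPINGS];  /* Incoming CU names.  */
  const char *to[DEMO_MAX_MAPPINGS];    /* Outgoing dict names.  */
  size_t n;
};

/* Record that CU FROM should land in output dict TO.  Returns 0 on
   success, -1 if the table is full.  */
static int
demo_add_cu_mapping (struct demo_cu_mapping *m, const char *from,
                     const char *to)
{
  assert (m != NULL && from != NULL && to != NULL);

  if (m->n >= DEMO_MAX_MAPPINGS)
    return -1;
  m->from[m->n] = from;
  m->to[m->n] = to;
  m->n++;
  return 0;
}

/* Forward lookup: incoming CU name -> outgoing dict name, or NULL.  */
static const char *
demo_in_lookup (const struct demo_cu_mapping *m, const char *from)
{
  for (size_t i = 0; i < m->n; i++)
    if (strcmp (m->from[i], from) == 0)
      return m->to[i];
  return NULL;
}

/* Inverse lookup: store in OUT (capacity OUT_LEN) the incoming CU names
   that map to TO.  Returns the total number of such CUs.  */
static size_t
demo_out_lookup (const struct demo_cu_mapping *m, const char *to,
                 const char **out, size_t out_len)
{
  size_t found = 0;

  for (size_t i = 0; i < m->n; i++)
    if (strcmp (m->to[i], to) == 0)
      {
        if (found < out_len)
          out[found] = m->from[i];
        found++;
      }
  return found;
}
```

The inverse lookup is what lets a deduplicator start from an output dict name and enumerate every input CU that feeds it.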
include/
* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.
libctf/
* ctf-impl.h (ctf_file_t): Improve comments.
<ctf_link_cu_mapping>: Split into...
<ctf_link_in_cu_mapping>: ... this...
<ctf_link_out_cu_mapping>: ... and this.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Look things up in the
in_cu_mapping instead of the cu_mapping.
(ctf_link_add_cu_mapping): The deduplicating link will define
what happens if many FROMs share a TO.
(ctf_link_add_cu_mapping): Create in_cu_mapping and
out_cu_mapping. Do not create ctf_link_outputs here any more, or
create per-CU dicts here: they are already created when needed.
(ctf_link_one_variable): Log a debug message if we skip a
variable due to its type being concealed in a CU-mapped link.
(This is probably too common a case to make into a warning.)
(ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
|
|
|
int err;
|
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
|
|
|
|
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
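The lazy-opening behaviour described above (record only a NAME, open on demand, close as soon as possible) can be sketched like this. It is a toy model under stated assumptions: the demo_* types and the stub opener/closer are hypothetical, and a null opener stands in for a build that cannot open files itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an opened CTF archive.  */
struct demo_archive { const char *name; };

typedef struct demo_archive *demo_open_f (const char *name);
typedef void demo_close_f (struct demo_archive *arc);

struct demo_link_input
{
  const char *name;          /* Always set.  */
  struct demo_archive *arc;  /* NULL until opened, if lazy.  */
  int lazy;                  /* Nonzero if we open it, so we close it.  */
};

/* Add an input.  With ARC == NULL only the NAME is recorded and the
   open is deferred.  */
static void
demo_input_init (struct demo_link_input *in, const char *name,
                 struct demo_archive *arc)
{
  assert (in != NULL && name != NULL);
  in->name = name;
  in->arc = arc;
  in->lazy = (arc == NULL);
}

/* Open the input if needed.  Returns NULL if it cannot be opened, for
   instance when no OPENER is available.  */
static struct demo_archive *
demo_input_get (struct demo_link_input *in, demo_open_f *opener)
{
  if (in->arc == NULL && in->lazy && opener != NULL)
    in->arc = opener (in->name);
  return in->arc;
}

/* Release a lazily-opened input once it has been processed.  Archives
   passed in by the caller are left alone: the caller owns them.  */
static void
demo_input_release (struct demo_link_input *in, demo_close_f *closer)
{
  if (in->lazy && in->arc != NULL && closer != NULL)
    {
      closer (in->arc);
      in->arc = NULL;
    }
}

/* Stub opener and closer for illustration: they only count calls.  */
static int demo_opens, demo_closes;
static struct demo_archive demo_the_archive;

static struct demo_archive *
demo_stub_open (const char *name)
{
  demo_the_archive.name = name;
  demo_opens++;
  return &demo_the_archive;
}

static void
demo_stub_close (struct demo_archive *arc)
{
  (void) arc;
  demo_closes++;
}
```

Repeated gets open the file only once, and releasing it straight after use keeps at most one lazily-opened input resident at a time.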
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name>: Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
|
|
|
fp->ctf_link_flags = flags;
|
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
|
|
|
|
|
|
|
if (fp->ctf_link_inputs == NULL)
|
|
|
|
return 0; /* Nothing to do. */
|
|
|
|
|
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc. referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
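The output-side naming rule (append an incrementing integer, starting at 0, when names clash) can be sketched as below. This is not the ctf_new_per_cu_name implementation: the demo_* names are hypothetical, the '#' separator is an assumption made for the sketch, and a plain array stands in for the table of existing outputs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: nonzero if NAME is among the N names in use.  */
static int
demo_name_taken (const char *const *used, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (used[i], name) == 0)
      return 1;
  return 0;
}

/* Write into BUF a name for a new per-CU output based on CUNAME.  If
   CUNAME is free it is used unchanged; otherwise an incrementing
   integer, starting at 0, is appended until the name is unique.  The
   '#' separator is an assumption.  Returns 0 on success, -1 if BUF is
   too small.  */
static int
demo_new_per_cu_name (const char *const *used, size_t n,
                      const char *cuname, char *buf, size_t buflen)
{
  assert (cuname != NULL && buf != NULL);

  int len = snprintf (buf, buflen, "%s", cuname);
  if (len < 0 || (size_t) len >= buflen)
    return -1;

  /* USED is finite, so some suffix is always eventually free.  */
  for (unsigned long suffix = 0; demo_name_taken (used, n, buf); suffix++)
    {
      len = snprintf (buf, buflen, "%s#%lu", cuname, suffix);
      if (len < 0 || (size_t) len >= buflen)
        return -1;
    }
  return 0;
}
```

The resulting names carry no meaning beyond being distinct, which matches the point made above: all that is really known is that the dictionaries differ.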
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
|
|
|
if (fp->ctf_link_outputs != NULL)
|
|
|
|
ctf_link_empty_outputs (fp);
|
|
|
|
else
|
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
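The ctf_link_add_strtab step above can be modelled in a few lines. This is
only a sketch of the idea, not the libctf implementation: the struct and
function names (elf_str, str_ref, resolve_string, count_internal) are
hypothetical, and the ELF strtab is assumed to arrive as a flat array
rather than through the linker's callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one string reported by the linker: its content
   and its offset in the finalized ELF strtab.  */
struct elf_str { const char *str; size_t offset; };

/* Hypothetical result for one CTF string: either it was found in the ELF
   strtab (external, with the ELF offset) or it stays internal.  */
struct str_ref { int external; size_t offset; };

/* Look NAME up among the N strings the linker reported.  */
static struct str_ref
resolve_string (const char *name, const struct elf_str *elf, size_t n)
{
  struct str_ref ref = { 0, 0 };

  assert (name != NULL);
  for (size_t i = 0; i < n; i++)
    if (strcmp (elf[i].str, name) == 0)
      {
        ref.external = 1;
        ref.offset = elf[i].offset;
        break;
      }
  return ref;
}

/* Count how many of the NCTF strings must remain in the internal CTF
   strtab because the ELF strtab does not contain them.  */
static size_t
count_internal (const char *const *ctf, size_t nctf,
                const struct elf_str *elf, size_t nelf)
{
  size_t kept = 0;

  for (size_t i = 0; i < nctf; i++)
    if (!resolve_string (ctf[i], elf, nelf).external)
      kept++;
  return kept;
}
```

A linear scan keeps the sketch short; a real implementation would want a
hash lookup, since both tables can be large.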
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
|
|
|
fp->ctf_link_outputs = ctf_dynhash_create (ctf_hash_string,
|
|
|
|
ctf_hash_eq_string, free,
|
libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:
First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them.
Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory. This is not an API change because a null
CTF argument was prohibited before now.
Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd. The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.
This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.
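The lazy-open behaviour described above (open the named input only when it
is needed, and close only what was opened on the caller's behalf) might be
sketched as follows. Everything here is hypothetical: link_input,
input_acquire, input_release and the fake_open/fake_close stand-ins are
illustration only and do not correspond to libctf functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link input: a name plus an optional already-open handle.  */
struct link_input
{
  const char *name;
  void *handle;          /* NULL until opened.  */
  int lazily_opened;     /* Nonzero if we opened it, so we must close it.  */
};

typedef void *(*open_fn) (const char *name);
typedef void (*close_fn) (void *handle);

/* Stand-in opener and closer for demonstration: they only count calls.  */
static int open_calls, close_calls;

static void *
fake_open (const char *name)
{
  (void) name;
  open_calls++;
  return &open_calls;    /* Any non-null pointer serves as a handle.  */
}

static void
fake_close (void *handle)
{
  (void) handle;
  close_calls++;
}

/* Open the input on first use.  Returns the handle, or NULL on failure.  */
static void *
input_acquire (struct link_input *in, open_fn open_cb)
{
  assert (in != NULL && open_cb != NULL);
  if (in->handle == NULL)
    {
      in->handle = open_cb (in->name);
      in->lazily_opened = (in->handle != NULL);
    }
  return in->handle;
}

/* Close only handles we opened ourselves; caller-owned ones are left alone.  */
static void
input_release (struct link_input *in, close_fn close_cb)
{
  if (in->lazily_opened && in->handle != NULL)
    {
      close_cb (in->handle);
      in->handle = NULL;
      in->lazily_opened = 0;
    }
}
```

Tracking ownership in the input itself is what lets a caller that passes
an already-open archive keep control of its lifetime.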
Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.
We drop two members:
- arcname, which is difficult to construct and then only used in error
messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
of it
We rename others whose existing names were fairly dreadful:
- done_main_member -> done_parent, using consistent terminology for .ctf
as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise
We add one new member, cu_mapped.
Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.
include/
* ctf-api.h (ECTF_NEEDSBFD): New.
(ECTF_NERR): Adjust.
(ctf_link): Rename share_mode arg to flags.
libctf/
* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
* Makefile.in: Regenerated.
* ctf-impl.h (ctf_link_input_name): New.
(ctf_file_t) <ctf_link_flags>: New.
* ctf-create.c (ctf_serialize): Adjust accordingly.
* ctf-link.c: Define ctf_open as weak when PIC.
(ctf_arc_close_thunk): Remove unnecessary thunk.
(ctf_file_close_thunk): Likewise.
(ctf_link_input_name): New.
(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
(ctf_link_input_close): Adjust accordingly.
(ctf_link_add_ctf_internal): New, split from...
(ctf_link_add_ctf): ... here. Return error if lazy loading of
CTF is not possible. Change to just call...
(ctf_link_add): ... this new function.
(ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the
ctf_file_close_thunk.
(ctf_link_in_member_cb_arg_t) <file_name> Rename to...
<in_file_name>: ... this.
<arcname>: Drop.
<share_mode>: Likewise (migrated to ctf_link_flags).
<done_main_member>: Rename to...
<done_parent>: ... this.
<main_input_fp>: Rename to...
<in_fp_parent>: ... this.
<cu_mapped>: New.
(ctf_link_one_type): Adjust accordingly. Transition to
ctf_err_warn, removing a TODO.
(ctf_link_one_variable): Note a case too common to warn about.
Report in the debug stream if a cu-mapped link prevents addition
of a conflicting variable.
(ctf_link_one_input_archive_member): Adjust.
(ctf_link_lazy_open): New, open a CTF archive for linking when
needed.
(ctf_link_close_one_input_archive): New, close it again.
(ctf_link_one_input_archive): Adjust for lazy opening, member
renames, and ctf_err_warn transition. Move the
empty_link_type_mapping call to...
(ctf_link): ... here. Adjust for renamings and thunk removal.
Don't spuriously fail if some input contains no CTF data.
(ctf_link_write): ctf_err_warn transition.
* libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
|
|
|
(ctf_hash_free_fun)
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
|
|
|
ctf_dict_close);
|
|
|
|
|
|
|
|
if (fp->ctf_link_outputs == NULL)
|
|
|
|
return ctf_set_errno (fp, ENOMEM);
|
|
|
|
|
libctf, link: fix CU-mapped links with CTF_LINK_EMPTY_CU_MAPPINGS
This is a bug in the interaction between two obscure options, which cannot
even be invoked from ld, and a feature added to stop linking the same
input file repeatedly from crashing the linker.
The latter fix involved tracking input files (internally to libctf) not
just with their input CU name but with a version of their input CU name
that was augmented with a numeric suffix if their linker input file name
was changed, to prevent distinct CTF dicts with the same cuname from
overwriting each other. (We can't use just the linker input file name
because one linker input can contain many CU dicts, particularly under
ld -r.) If these inputs then produced conflicting types, those types
were emitted into similarly-named output dicts, so we needed similar
machinery to detect clashing output dicts and add a numeric suffix to
them as well.
This works fine, except that if you used the cu-mapping feature to force
double-linking of CTF (so that your CTF can be grouped into output dicts
larger than a single translation unit) and then also used
CTF_LINK_EMPTY_CU_MAPPINGS to force every possible output dict in the
mapping to be created (even if empty), we did the creation of empty dicts
first, and then all the actual content got considered to be a clash. So
you ended up with a pile of useless empty dicts and then all the content
was in full dicts with the same names suffixed with a #0. This seems
likely to confuse consumers that use this facility.
Fixed by generating all the EMPTY_CU_MAPPINGS empty dicts after linking
is complete, not before it runs.
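The ordering fix can be illustrated with a toy model. This sketch only
shows "create empties last, and only where linking produced nothing"; it
does not model the name-clash machinery, and out_dict, out_set, emit_type
and create_empty_outputs are hypothetical names, not libctf's.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DICTS 16

/* Hypothetical output dict: a name plus a count of types emitted into it.  */
struct out_dict { const char *name; size_t ntypes; };
struct out_set { struct out_dict dicts[MAX_DICTS]; size_t count; };

/* Return the dict called NAME, or NULL if there is none yet.  */
static struct out_dict *
find_dict (struct out_set *set, const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < set->count; i++)
    if (strcmp (set->dicts[i].name, name) == 0)
      return &set->dicts[i];
  return NULL;
}

/* Linking: emit one type into NAME's dict, creating the dict on first use.
   Returns 0 on success, -1 if the set is full.  */
static int
emit_type (struct out_set *set, const char *name)
{
  struct out_dict *d = find_dict (set, name);

  if (d == NULL)
    {
      if (set->count == MAX_DICTS)
        return -1;
      d = &set->dicts[set->count++];
      d->name = name;
      d->ntypes = 0;
    }
  d->ntypes++;
  return 0;
}

/* After linking: give every name in MAPPED a dict, creating an empty one
   only where linking produced none.  Returns the number of empty dicts
   created, or -1 if the set is full.  */
static int
create_empty_outputs (struct out_set *set, const char *const *mapped, size_t n)
{
  int created = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (find_dict (set, mapped[i]) != NULL)
        continue;
      if (set->count == MAX_DICTS)
        return -1;
      set->dicts[set->count].name = mapped[i];
      set->dicts[set->count].ntypes = 0;
      set->count++;
      created++;
    }
  return created;
}
```

Because the empties are added last, a dict that already holds content is
never shadowed by an empty one of the same name.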
No impact on ld, which does not do cu-mapped links or pass
CTF_LINK_EMPTY_CU_MAPPINGS to ctf_link().
libctf/
* ctf-link.c (ctf_create_per_cu): Don't create new dicts iff one
already exists and we are making one for no input in particular.
(ctf_link): Emit empty CTF dicts corresponding to no input in
particular only after linking is complete.
2023-04-08 03:09:24 +08:00
|
|
|
fp->ctf_flags |= LCTF_LINKING;
|
|
|
|
ctf_link_deduplicating (fp);
|
|
|
|
fp->ctf_flags &= ~LCTF_LINKING;
|
|
|
|
|
|
|
|
if ((ctf_errno (fp) != 0) && (ctf_errno (fp) != ECTF_NOCTFDATA))
|
|
|
|
return -1;
|
|
|
|
|
libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.
The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).
The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.
So rejig the ctf-link machinery to do that. Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link. This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it. (Nobody else should use it.)
Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.
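The two lookup directions the deduplicator needs can be sketched like
this. It is an illustration under stated assumptions, not the libctf data
structure: a single small table scanned in both directions stands in for
the two mappings, and cu_map, cu_map_add, cu_map_out_for and
cu_map_inputs_for are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MAPPINGS 32

/* One hypothetical FROM -> TO entry: input CU name to output dict name.  */
struct cu_mapping { const char *from; const char *to; };
struct cu_map { struct cu_mapping entries[MAX_MAPPINGS]; size_t count; };

/* Record that input CU FROM should land in output dict TO.  A repeated
   FROM replaces its earlier mapping.  Returns 0, or -1 if the table is
   full.  */
static int
cu_map_add (struct cu_map *map, const char *from, const char *to)
{
  assert (from != NULL && to != NULL);
  for (size_t i = 0; i < map->count; i++)
    if (strcmp (map->entries[i].from, from) == 0)
      {
        map->entries[i].to = to;
        return 0;
      }
  if (map->count == MAX_MAPPINGS)
    return -1;
  map->entries[map->count].from = from;
  map->entries[map->count].to = to;
  map->count++;
  return 0;
}

/* Forward lookup: which output dict does FROM map to?  NULL if unmapped.  */
static const char *
cu_map_out_for (const struct cu_map *map, const char *from)
{
  for (size_t i = 0; i < map->count; i++)
    if (strcmp (map->entries[i].from, from) == 0)
      return map->entries[i].to;
  return NULL;
}

/* Inverse lookup: store up to MAX input CU names that map to TO in INPUTS
   and return how many were stored.  */
static size_t
cu_map_inputs_for (const struct cu_map *map, const char *to,
                   const char **inputs, size_t max)
{
  size_t found = 0;

  for (size_t i = 0; i < map->count && found < max; i++)
    if (strcmp (map->entries[i].to, to) == 0)
      inputs[found++] = map->entries[i].from;
  return found;
}
```

The inverse lookup is what allows a link to walk each output dict and find
every input CU that should be merged into it.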
include/
* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.
libctf/
* ctf-impl.h (ctf_file_t): Improve comments.
<ctf_link_cu_mapping>: Split into...
<ctf_link_in_cu_mapping>: ... this...
<ctf_link_out_cu_mapping>: ... and this.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Look things up in the
in_cu_mapping instead of the cu_mapping.
(ctf_link_add_cu_mapping): The deduplicating link will define
what happens if many FROMs share a TO.
(ctf_link_add_cu_mapping): Create in_cu_mapping and
out_cu_mapping. Do not create ctf_link_outputs here any more, or
create per-CU dicts here: they are already created when needed.
(ctf_link_one_variable): Log a debug message if we skip a
variable due to its type being concealed in a CU-mapped link.
(This is probably too common a case to make into a warning.)
(ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
|
|
|
/* Create empty CUs if requested. We do not currently claim that multiple
|
|
|
|
links in succession with CTF_LINK_EMPTY_CU_MAPPINGS set in some calls and
|
|
|
|
not set in others will do anything especially sensible. */
|
|
|
|
|
|
|
|
if (fp->ctf_link_out_cu_mapping && (flags & CTF_LINK_EMPTY_CU_MAPPINGS))
|
|
|
|
{
|
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
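The output-side naming rule (append an incrementing integer, starting at
0, when a per-CU name clashes) could look roughly like this. It is a
sketch only: the "#" separator is an assumption, the probing loop is one
possible way to pick the next free integer, and the outputs table and
add_output function are hypothetical rather than libctf's own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_OUTPUTS 16
#define MAX_NAME 64

/* Hypothetical table of per-CU output names already in use.  */
struct outputs { char names[MAX_OUTPUTS][MAX_NAME]; size_t count; };

static int
name_in_use (const struct outputs *out, const char *name)
{
  for (size_t i = 0; i < out->count; i++)
    if (strcmp (out->names[i], name) == 0)
      return 1;
  return 0;
}

/* Register CUNAME.  On a clash, try CUNAME#0, CUNAME#1, ... until a free
   name is found.  Returns the name chosen, or NULL if the table is full
   or the name does not fit.  */
static const char *
add_output (struct outputs *out, const char *cuname)
{
  char candidate[MAX_NAME];
  int n;

  assert (cuname != NULL);
  if (out->count == MAX_OUTPUTS)
    return NULL;

  n = snprintf (candidate, sizeof (candidate), "%s", cuname);
  if (n < 0 || (size_t) n >= sizeof (candidate))
    return NULL;

  for (unsigned int suffix = 0; name_in_use (out, candidate); suffix++)
    {
      n = snprintf (candidate, sizeof (candidate), "%s#%u", cuname, suffix);
      if (n < 0 || (size_t) n >= sizeof (candidate))
        return NULL;
    }

  strcpy (out->names[out->count], candidate);
  return out->names[out->count++];
}
```

Adding "a.c" three times in this model yields "a.c", "a.c#0" and "a.c#1",
so every distinct dict keeps a distinct name.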
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
|
|
|
ctf_next_t *i = NULL;
|
|
|
|
void *k;
|
|
|
|
|
|
|
|
while ((err = ctf_dynhash_next (fp->ctf_link_out_cu_mapping, &i, &k,
|
libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.
The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).
The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.
So rejig the ctf-link machinery to do that. Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link. This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it. (Nobody else should use it.)
Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.
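As a rough illustration of why both directions of the mapping are useful, here is a minimal sketch. It assumes a single small table scanned both ways, whereas the commit keeps two separate hashtables (ctf_link_in_cu_mapping and ctf_link_out_cu_mapping); the struct and function names below are hypothetical and are not libctf API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MAPPINGS 16		/* Arbitrary limit for this sketch.  */

/* Hypothetical stand-in for the two hashtables: not libctf's layout.  */
struct cu_mappings
{
  const char *from[MAX_MAPPINGS];	/* Incoming CU names.  */
  const char *to[MAX_MAPPINGS];		/* Outgoing dict name for each.  */
  size_t n;
};

/* Record FROM -> TO.  A repeated FROM replaces its old mapping.
   Returns 0 on success, -1 if the table is full.  */
static int
add_cu_mapping (struct cu_mappings *m, const char *from, const char *to)
{
  size_t i;

  assert (m != NULL && from != NULL && to != NULL);
  for (i = 0; i < m->n; i++)
    if (strcmp (m->from[i], from) == 0)
      {
	m->to[i] = to;
	return 0;
      }
  if (m->n == MAX_MAPPINGS)
    return -1;
  m->from[m->n] = from;
  m->to[m->n] = to;
  m->n++;
  return 0;
}

/* Forward lookup: which output dict does the CU named FROM go to?  */
static const char *
lookup_out (const struct cu_mappings *m, const char *from)
{
  size_t i;

  for (i = 0; i < m->n; i++)
    if (strcmp (m->from[i], from) == 0)
      return m->to[i];
  return NULL;
}

/* Inverse lookup: how many input CUs feed the output dict named TO?  */
static size_t
count_inputs (const struct cu_mappings *m, const char *to)
{
  size_t i, count = 0;

  for (i = 0; i < m->n; i++)
    if (strcmp (m->to[i], to) == 0)
      count++;
  return count;
}
```

Mapping both a.c and b.c to one module name and then calling count_inputs on that name yields 2, which is the kind of question the inverse mapping lets the deduplicator answer without walking every input.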
include/
* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.
libctf/
* ctf-impl.h (ctf_file_t): Improve comments.
<ctf_link_cu_mapping>: Split into...
<ctf_link_in_cu_mapping>: ... this...
<ctf_link_out_cu_mapping>: ... and this.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Look things up in the
in_cu_mapping instead of the cu_mapping.
(ctf_link_add_cu_mapping): The deduplicating link will define
what happens if many FROMs share a TO.
(ctf_link_add_cu_mapping): Create in_cu_mapping and
out_cu_mapping. Do not create ctf_link_outputs here any more, or
create per-CU dicts here: they are already created when needed.
(ctf_link_one_variable): Log a debug message if we skip a
variable due to its type being concealed in a CU-mapped link.
(This is probably too common a case to make into a warning.)
(ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
NULL)) == 0)
{
libctf: fix linking together multiple objects derived from the same source
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables etc referring to
those types are accurately stored.
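The output-side renaming described above (append an incrementing integer, starting at 0, when a name clashes) can be sketched as follows. This is not ctf_new_per_cu_name: the helper name, the array-of-taken-names interface, and the "#N" suffix format are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write into BUF a name based on CU_NAME that does
   not appear in TAKEN[0..NTAKEN).  The "#N" suffix format is an
   assumption of this sketch.  Returns 0 on success, -1 if BUF is too
   small.  */
static int
unique_cu_name (char *buf, size_t buflen, const char *cu_name,
		const char *const *taken, size_t ntaken)
{
  unsigned int suffix = 0;
  int len;

  assert (buf != NULL && cu_name != NULL);
  len = snprintf (buf, buflen, "%s", cu_name);

  for (;;)
    {
      size_t i;
      int clash = 0;

      if (len < 0 || (size_t) len >= buflen)
	return -1;
      for (i = 0; i < ntaken; i++)
	if (strcmp (taken[i], buf) == 0)
	  clash = 1;
      if (!clash)
	return 0;
      /* Clash: append an incrementing integer, starting at 0.  */
      len = snprintf (buf, buflen, "%s#%u", cu_name, suffix++);
    }
}
```

With "foo.c" and "foo.c#0" already taken, a request for "foo.c" produces "foo.c#1"; a name with no clash is returned unchanged. The loop terminates because the set of taken names is finite.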
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to prevent a couple of sources of problems, to wit
changing the active CU mappings when a link has already been done
(no effect on ld, which doesn't use CU mappings at all), and causing
multiple consecutive ctf_link's to have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
2022-06-11 00:05:50 +08:00
const char *to = (const char *) k;
if (ctf_create_per_cu (fp, NULL, to) == NULL)
libctf, link: redo cu-mapping handling
2020-06-06 00:36:16 +08:00
{
libctf, ld: fix symtypetab and var section population under ld -r
The variable section in a CTF dict is meant to contain the types of
variables that do not appear in the symbol table (mostly file-scope
static declarations). We implement this by having the compiler emit
all potential data symbols into both sections, then delete those
symbols from the variable section that correspond to data symbols the
linker has reported.
Unfortunately, the check for this in ctf_serialize is wrong: rather than
checking the set of linker-reported symbols, we check the set of names
in the data object symtypetab section: if the linker has reported no
symbols at all (usually if ld -r has been run, or if a non-linker
program that does not use symbol tables is calling ctf_link) this will
include every single symbol, emptying the variable section completely.
Worse, when ld -r is in use, we want to force writeout of every
symtypetab entry on the inputs, in an indexed section, whether or not
the linker has reported them, since this isn't a final link yet and the
symbol table is not finalized (and may grow more symbols than the linker
has yet reported). But the check for this is flawed too: we were
relying on ctf_link_shuffle_syms not having been called if no symbols
exist, but that function is *always* called by ld even when ld -r is in
use: ctf_link_add_linker_symbol is the one that's not called when there
are no symbols.
We clearly need to rethink this. Using the emptiness of the set of
reported symbols as a test for ld -r is just ugly: the linker already
knows if ld -r is underway and can just tell us. So add a new ctf_link
flag, CTF_LINK_NO_FILTER_REPORTED_SYMS, that stops libctf filtering the
symbols in the symtypetab sections using the set that the linker has
reported. Use the presence or absence of this flag to determine whether
to emit unindexed symtypetabs: we only remove entries from
the variable section when filtering symbols, and we only remove them if
they are in the reported symbol set, fixing the case where no symbols
are reported by the linker at all.
(The negative sense of the new CTF_LINK flag is intentional: the common
case, both for ld and for simple tools that want to do a ctf_link with
no ELF symbol table in sight, is probably to filter out symbols that no
linker has reported: i.e., for the simple tools, all of them.)
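The corrected rule for the variable section can be sketched as a small predicate. This is only an illustration under stated assumptions: the function name, the array-of-names representation of the reported symbol set, and the flag value are hypothetical (the real flag is defined in ctf-api.h and the real check lives in ctf-create.c).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Placeholder value for this sketch: the real flag lives in ctf-api.h.  */
#define SKETCH_NO_FILTER_REPORTED_SYMS 0x1

/* Should the variable NAME be dropped from the variable section?
   REPORTED holds the names the linker has reported as symbols.  */
static int
var_is_deletable (int link_flags, const char *name,
		  const char *const *reported, size_t nreported)
{
  size_t i;

  assert (name != NULL);

  /* Not filtering (e.g. ld -r): keep every variable.  */
  if (link_flags & SKETCH_NO_FILTER_REPORTED_SYMS)
    return 0;

  /* Filtering: drop only variables the linker actually reported.  With
     no reported symbols this loop never runs, so nothing is dropped.  */
  for (i = 0; i < nreported; i++)
    if (strcmp (reported[i], name) == 0)
      return 1;
  return 0;
}
```

The last case is the one the commit fixes: when the reported set is empty, no variable is deletable, so the variable section is no longer emptied.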
There's another wrinkle, though. It is quite possible for a non-linker
to add symbols to a dict via ctf_add_*_sym and then write it out via the
ctf_write APIs: perhaps it's preparing a dict for a later linker
invocation. Right now this would not lead to anything terribly
meaningful happening: ctf_serialize just assumes it was called via
ctf_link if symbols are present. So add an (internal-to-libctf) flag
that indicates that a writeout is happening via ctf_link_write, and set
it there (propagating it to child dicts as needed). ctf_serialize can
then spot when it is not being called by a linker, and arrange to always
write out an indexed, sorted symtypetab for fastest possible future
symbol lookup by name in that case. (The writeouts done by ld -r are
unsorted, because the only thing likely to use those symtabs is the
linker, which doesn't benefit from symtypetab sorting.)
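Taken together, the three writeout cases in this message can be summarized in a small decision function. This is a simplified sketch, not the logic of ctf_serialize, which presumably weighs other factors as well; the struct, the function name, and the two boolean parameters (standing in for LCTF_LINKING and CTF_LINK_NO_FILTER_REPORTED_SYMS) are hypothetical.

```c
#include <assert.h>

/* Hypothetical summary of how a symtypetab gets written out.  */
struct symtypetab_mode
{
  int indexed;	/* Emit an index section mapping entries to names.  */
  int sorted;	/* Sort entries by name for fast lookup.  */
};

/* VIA_CTF_LINK stands in for LCTF_LINKING being set;
   NO_FILTER_REPORTED_SYMS stands in for the new CTF_LINK flag.  */
static struct symtypetab_mode
choose_symtypetab_mode (int via_ctf_link, int no_filter_reported_syms)
{
  struct symtypetab_mode mode = { 0, 0 };

  assert (via_ctf_link == 0 || via_ctf_link == 1);

  if (!via_ctf_link)
    {
      /* Non-linker writeout: indexed and sorted for fast lookup.  */
      mode.indexed = 1;
      mode.sorted = 1;
    }
  else if (no_filter_reported_syms)
    {
      /* Non-final link (ld -r): indexed, but not worth sorting.  */
      mode.indexed = 1;
    }
  /* Otherwise a final link: unindexed, in symbol table order.  */
  return mode;
}
```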
Tests added for all three linking cases (ld -r, ld -shared, ld), with a
bit of testsuite framework enhancement to stop it unconditionally
linking the CTF to be checked by the lookup program with -shared, so
tests can now examine CTF linked with -r or indeed with no flags at all,
though the output filename is still foo.so even in this case.
Another test added for the non-linker case that endeavours to determine
whether the symtypetab is sorted by examining the order of entries
returned from ctf_symbol_next: nobody outside libctf should rely on
this ordering, but this test is not outside libctf :)
include/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_NO_FILTER_REPORTED_SYMS): New.
ld/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_merge_ctf): Set CTF_LINK_NO_FILTER_REPORTED_SYMS
when appropriate.
libctf/ChangeLog
2021-01-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (_libctf_nonnull_): Add parameters.
(LCTF_LINKING): New flag.
(ctf_dict_t) <ctf_link_flags>: Mention it.
* ctf-link.c (ctf_link): Keep LCTF_LINKING set across call.
(ctf_link_write): Likewise, including in child dictionaries.
(ctf_link_shuffle_syms): Make sure ctf_dynsyms is NULL if there
are no reported symbols.
* ctf-create.c (symtypetab_delete_nonstatic_vars): Make sure
the variable has been reported as a symbol by the linker.
(symtypetab_skippable): Mention relationship between SYMFP and the
flags.
(symtypetab_density): Adjust nonnullity. Exit early if no symbols
were reported and force-indexing is off (i.e., we are doing a
final link).
(ctf_serialize): Handle the !LCTF_LINKING case by writing out an
indexed, sorted symtypetab (and allow SYMFP to be NULL in this
case). Turn sorting off if this is a non-final link. Only delete
nonstatic vars if we are filtering symbols and the linker has
reported some.
* testsuite/libctf-regression/nonstatic-var-section-ld-r*:
New test of variable and symtypetab section population when
ld -r is used.
* testsuite/libctf-regression/nonstatic-var-section-ld-executable.lk:
Likewise, when ld of an executable is used.
* testsuite/libctf-regression/nonstatic-var-section-ld.lk:
Likewise, when ld -shared alone is used.
* testsuite/libctf-regression/nonstatic-var-section-ld*.c:
Lookup programs for the above.
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.*: New
test, testing survival of symbols across ctf_write paths.
* testsuite/lib/ctf-lib.exp (run_lookup_test): New option,
nonshared, suppressing linking of the SOURCE with -shared.
2021-01-17 00:49:29 +08:00
fp->ctf_flags &= ~LCTF_LINKING;
libctf, link: redo cu-mapping handling
2020-06-06 00:36:16 +08:00
ctf_next_destroy (i);
return -1; /* Errno is set for us. */
}
}
if (err != ECTF_NEXT_END)
{
libctf, ld: fix symtypetab and var section population under ld -r
2021-01-17 00:49:29 +08:00
fp->ctf_flags &= ~LCTF_LINKING;
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
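The routing rule for iteration errors can be sketched in isolation. This is a mock, not the body of ctf_errwarning_next: the struct, the function name, and the numeric value used for the end-of-iteration code are placeholders chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for ECTF_NEXT_END: the real value comes from ctf-api.h.  */
#define SKETCH_NEXT_END 1000

/* Hypothetical stand-in for a dict that carries an error code.  */
struct sketch_dict
{
  int err;
};

/* Report the iteration error ERR: into *ERRP if the caller supplied
   one, leaving FP's error code untouched; otherwise into FP.  With
   neither, there is nowhere to put it.  */
static void
report_iter_err (struct sketch_dict *fp, int *errp, int err)
{
  assert (err != 0);

  if (errp != NULL)
    *errp = err;
  else if (fp != NULL)
    fp->err = err;
}
```

A caller that is in the middle of handling a real error passes an errp, so the end-of-iteration code lands there and the dict's original error code survives. A caller reporting open errors has no dict at all and therefore needs errp.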
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
ctf_err_warn (fp, 1, err, _("iteration error creating empty CUs"));
2023-09-13 17:02:36 +08:00
return ctf_set_errno (fp, err);
libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.
The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).
The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.
So rejig the ctf-link machinery to do that. Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link. This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it. (Nobody else should use it.)
Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.
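
As a rough illustration of why both directions of the mapping are useful, here is a minimal self-contained sketch. Every name in it (`demo_cu_map`, `demo_add_cu_mapping`, `demo_in_lookup`, `demo_out_lookup`) is hypothetical and none of it is libctf code; it uses a small fixed-size table where the real implementation would presumably use hashes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_MAPPINGS 16

/* Hypothetical stand-in for the two mappings: each entry records one
   incoming CU name (FROM) and the outgoing dict name (TO) it lands in.  */
struct demo_cu_map
{
  const char *from[DEMO_MAX_MAPPINGS];
  const char *to[DEMO_MAX_MAPPINGS];
  size_t n;
};

/* Record that CU FROM should end up in dict TO.  Returns 0 on success,
   -1 if the table is full.  */
int
demo_add_cu_mapping (struct demo_cu_map *map, const char *from,
                     const char *to)
{
  assert (map != NULL && from != NULL && to != NULL);
  if (map->n == DEMO_MAX_MAPPINGS)
    return -1;
  map->from[map->n] = from;
  map->to[map->n] = to;
  map->n++;
  return 0;
}

/* Forward ("in") lookup: which output dict does CU FROM belong to?
   Returns NULL if FROM has no mapping.  */
const char *
demo_in_lookup (const struct demo_cu_map *map, const char *from)
{
  for (size_t i = 0; i < map->n; i++)
    if (strcmp (map->from[i], from) == 0)
      return map->to[i];
  return NULL;
}

/* Inverse ("out") lookup: count the CUs that feed output dict TO, storing
   up to MAX of their names in OUT.  */
size_t
demo_out_lookup (const struct demo_cu_map *map, const char *to,
                 const char **out, size_t max)
{
  size_t found = 0;

  for (size_t i = 0; i < map->n; i++)
    if (strcmp (map->to[i], to) == 0)
      {
        if (found < max)
          out[found] = map->from[i];
        found++;
      }
  return found;
}
```

The inverse lookup is what lets a caller start from an output dict name and find every input that belongs in it, including the case where it finds none at all.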
include/
* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.
libctf/
* ctf-impl.h (ctf_file_t): Improve comments.
<ctf_link_cu_mapping>: Split into...
<ctf_link_in_cu_mapping>: ... this...
<ctf_link_out_cu_mapping>: ... and this.
* ctf-create.c (ctf_serialize): Adjust.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Look things up in the
in_cu_mapping instead of the cu_mapping.
(ctf_link_add_cu_mapping): The deduplicating link will define
what happens if many FROMs share a TO.
(ctf_link_add_cu_mapping): Create in_cu_mapping and
out_cu_mapping. Do not create ctf_link_outputs here any more, or
create per-CU dicts here: they are already created when needed.
(ctf_link_one_variable): Log a debug message if we skip a
variable due to its type being concealed in a CU-mapped link.
(This is probably too common a case to make into a warning.)
(ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
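
The callback protocol described for ctf_link_add_strtab can be sketched in isolation as follows. This is a self-contained model under stated assumptions, not the libctf implementation: the type `demo_strtab_string_f` and the functions `demo_next_string` and `demo_find_offset` are hypothetical names, and the convention that the callback returns NULL once the table is exhausted is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback type: return the next string and store its offset
   in *OFFSET, or return NULL once the table is exhausted.  */
typedef const char *demo_strtab_string_f (uint32_t *offset, void *arg);

/* Iteration state owned by the caller (the "linker" side).  */
struct demo_strtab_iter
{
  const char *buf;   /* NUL-separated strings, as in an ELF strtab.  */
  size_t size;       /* Total size of BUF in bytes.  */
  size_t pos;        /* Offset of the next string to hand out.  */
};

/* Linker-side callback: walk the table one string at a time.  Assumes the
   last byte of BUF is a NUL terminator.  */
const char *
demo_next_string (uint32_t *offset, void *arg)
{
  struct demo_strtab_iter *it = arg;
  const char *str;

  assert (it != NULL && offset != NULL);
  if (it->pos >= it->size)
    return NULL;

  str = it->buf + it->pos;
  *offset = (uint32_t) it->pos;
  it->pos += strlen (str) + 1;   /* Skip the string and its terminator.  */
  return str;
}

/* Consumer side: drain the callback, returning the offset of WANTED, or
   UINT32_MAX if it never appears.  The strings are only inspected during
   this call, matching the lifetime rule above.  */
uint32_t
demo_find_offset (demo_strtab_string_f *next, void *arg, const char *wanted)
{
  uint32_t offset, found = UINT32_MAX;
  const char *str;

  while ((str = next (&offset, arg)) != NULL)
    if (found == UINT32_MAX && strcmp (str, wanted) == 0)
      found = offset;
  return found;
}
```

Because the consumer never keeps the returned pointers, the linker side is free to reuse or release its buffers as soon as the call returns.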
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
typedef struct ctf_link_out_string_cb_arg
{
  const char *str;
  uint32_t offset;
  int err;
} ctf_link_out_string_cb_arg_t;
|
|
|
|
|
|
|
|
/* Intern a string in the string table of an output per-CU CTF file. */
static void
ctf_link_intern_extern_string (void *key _libctf_unused_, void *value,
                               void *arg_)
{
|
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
|
|
|
ctf_dict_t *fp = (ctf_dict_t *) value;
|
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
|
|
|
ctf_link_out_string_cb_arg_t *arg = (ctf_link_out_string_cb_arg_t *) arg_;
|
|
|
|
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently, a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
(say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtbyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
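
The idea of a name table with a read-only half and a writable half, with one function choosing between them, can be sketched roughly as below. This is only an illustration under assumptions: `demo_names`, `demo_scan`, `demo_names_lookup` and `demo_add_name` are hypothetical names, and a linear scan over small arrays stands in for the real hash tables.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_DYNAMIC 8

struct demo_name_entry
{
  const char *name;
  unsigned long type;   /* Type ID; 0 means "no such type".  */
};

/* Hypothetical analogue of the pair described above: a fixed table filled
   in at open time, plus a growable table for writable dictionaries.  */
struct demo_names
{
  const struct demo_name_entry *fixed;
  size_t nfixed;
  struct demo_name_entry dynamic[DEMO_MAX_DYNAMIC];
  size_t ndynamic;
};

/* Linear scan standing in for a real hash lookup.  */
unsigned long
demo_scan (const struct demo_name_entry *tab, size_t n, const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (tab[i].name, name) == 0)
      return tab[i].type;
  return 0;
}

/* The one place that decides which half to consult.  */
unsigned long
demo_names_lookup (int writable, const struct demo_names *np,
                   const char *name)
{
  assert (np != NULL && name != NULL);
  if (writable)
    return demo_scan (np->dynamic, np->ndynamic, name);
  return demo_scan (np->fixed, np->nfixed, name);
}

/* Add a name to the writable half.  Returns 0 on success, -1 if full.  */
int
demo_add_name (struct demo_names *np, const char *name, unsigned long type)
{
  if (np->ndynamic == DEMO_MAX_DYNAMIC)
    return -1;
  np->dynamic[np->ndynamic].name = name;
  np->dynamic[np->ndynamic].type = type;
  np->ndynamic++;
  return 0;
}
```

Keeping the writable/read-only decision in a single function means callers never need to know which representation backs a given dictionary.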
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_ctl_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
if (!ctf_str_add_external (fp, arg->str, arg->offset))
|
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
|
|
|
arg->err = ENOMEM;
|
|
|
|
}
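
The function above has no return value, so it reports failure by writing into its argument structure. A minimal self-contained sketch of that pattern follows; `demo_cb_arg`, `demo_intern`, `demo_intern_cb` and `demo_for_each` are hypothetical names, and `demo_intern` is only a stand-in for an operation that can fail.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-iteration argument: the input plus a slot for the
   error code recorded by the callback.  */
struct demo_cb_arg
{
  const char *str;
  int err;
};

typedef void demo_iter_f (void *value, void *arg);

/* Hypothetical stand-in for an operation that can fail: it refuses NULL
   inputs.  Returns nonzero on success.  */
int
demo_intern (void *value, const char *str)
{
  return value != NULL && str != NULL;
}

/* Callback with no return value: report failure through ARG.  */
void
demo_intern_cb (void *value, void *arg_)
{
  struct demo_cb_arg *arg = arg_;

  if (!demo_intern (value, arg->str))
    arg->err = ENOMEM;
}

/* Visit every value, then surface whatever error the callbacks left.  */
int
demo_for_each (void **values, size_t n, demo_iter_f *fn,
               struct demo_cb_arg *arg)
{
  assert (fn != NULL && arg != NULL);
  arg->err = 0;
  for (size_t i = 0; i < n; i++)
    fn (values[i], arg);
  return arg->err;
}
```

The caller checks the error slot once after the whole iteration instead of after every element.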
|
|
|
|
|
|
|
|
/* Repeatedly call ADD_STRING to acquire strings from the external string table,
|
|
|
|
adding them to the atoms table for this CU and all subsidiary CUs.
|
|
|
|
|
libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
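As a rough illustration of the "read-only range of types" rule described above, here is a minimal self-contained sketch. It is not libctf code: the toy_* names and error codes are hypothetical, and it assumes (as ctf_stypes suggests) that the types read in at open time occupy the lowest type IDs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes; the real library reports ECTF_RDONLY.  */
enum toy_err { TOY_OK = 0, TOY_ERR_RDONLY = 1, TOY_ERR_BADID = 2 };

/* Toy dictionary.  Assumption: type IDs 1..stypes were present when the
   dict was opened, and IDs stypes+1..ntypes were added afterwards.  */
struct toy_dict
{
  size_t stypes;   /* Number of static (read-in) types.  */
  size_t ntypes;   /* Total number of types, static plus dynamic.  */
};

/* Add a new type: always permitted.  Returns the new type's ID.  */
size_t
toy_add_type (struct toy_dict *d)
{
  assert (d->ntypes >= d->stypes);
  return ++d->ntypes;
}

/* Modify an existing type (say, add a member to a struct): permitted
   only for types outside the static range.  */
enum toy_err
toy_modify_type (const struct toy_dict *d, size_t id)
{
  if (id == 0 || id > d->ntypes)
    return TOY_ERR_BADID;
  if (id <= d->stypes)
    return TOY_ERR_RDONLY;
  return TOY_OK;
}
```

In this model a dict opened with three types rejects modification of type 2, but a type added afterwards gets ID 4 and can be modified freely.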
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
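The name-collision rule from the second point above can be sketched roughly as follows. This is a toy model with hypothetical names (toy_names, toy_add_named_type), not the ctf_add_generic implementation: it rejects a new name only when it collides with a name from the opened portion, and deliberately still allows repeated additions of a newly-added name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_NAMES 16

/* Toy name table: names[0..nstatic) came from the opened dict, and
   names[nstatic..ntotal) were added afterwards.  */
struct toy_names
{
  const char *names[TOY_MAX_NAMES];
  size_t nstatic;
  size_t ntotal;
};

/* Add a named type.  Returns 0 on success, -1 if NAME collides with a
   name in the static portion or the table is full.  Collisions with
   other newly-added names are tolerated, so existing callers that add
   the same name twice see no new errors.  */
int
toy_add_named_type (struct toy_names *t, const char *name)
{
  size_t i;

  assert (t->nstatic <= t->ntotal && t->ntotal <= TOY_MAX_NAMES);

  for (i = 0; i < t->nstatic; i++)
    if (strcmp (t->names[i], name) == 0)
      return -1;

  if (t->ntotal == TOY_MAX_NAMES)
    return -1;

  t->names[t->ntotal++] = name;
  return 0;
}
```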
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of LCTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
2023-12-20 00:58:19 +08:00
   Must be called on a dict that has not yet been serialized.
libctf: add a deduplicator-specific type mapping table
When CTF linking is done, the linker has to track the association
between types in the inputs and types in the outputs. The deduplicator
does this via the cd_output_emission_hashes, which maps from hashes of
types (valid in both the input and output) to the IDs of types in the
specific dict in which the cd_output_emission_hashes is held. However, the
nondeduplicating linker and ctf_add_type used a different mechanism, a
dedicated hashtab stored in the ctf_link_type_mapping, populated via
ctf_add_type_mapping and queried via the ctf_type_mapping function. To
allow the same functions to be used for variable and symbol population
in both the deduplicating and nondeduplicating linker, the deduplicator
carefully transferred all its input->output mappings into this hashtab
before returning.
This is *expensive*. The number of entries in this hashtab scales as the
number of input types, and unlike the hashing machinery the type mapping
machinery (the only other thing which scales that way) has not been much
optimized.
Now the nondeduplicating linker is gone, we can throw this out, move
the existing type mapping machinery to ctf-create.c and dedicate it to
ctf_add_type alone, and add a new function ctf_dedup_type_mapping which
uses the deduplicator's built-in knowledge of type mappings directly,
without requiring an expensive repopulation phase.
This speeds up a test link of nouveau.ko (a good worst-case candidate
with a lot of types in each of a lot of input files) from 9.11s to 7.15s
in my testing, a speedup of over 20%.
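The idea of answering input->output queries straight from the deduplicator's own records, with no second table to populate, might look something like the sketch below. It is only a toy model: the structures and toy_dedup_type_mapping are hypothetical stand-ins, hashes are plain strings, and linear searches stand in for the hash tables the real code would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What the hashing phase knows about one input type (toy model).  */
struct toy_input_type
{
  int input_num;          /* Which input dict the type came from.  */
  unsigned long in_id;    /* Type ID within that input.  */
  const char *hash;       /* Hash of the type, valid across all dicts.  */
};

/* What the emission phase recorded for one output type (toy model).  */
struct toy_emission
{
  const char *hash;       /* Hash of the emitted type.  */
  unsigned long out_id;   /* Type ID within the output dict.  */
};

/* Map (INPUT_NUM, IN_ID) to an output type ID by going through the type
   hash.  Returns 0 if the input type is unknown or was never emitted.  */
unsigned long
toy_dedup_type_mapping (const struct toy_input_type *inputs, size_t ninputs,
                        const struct toy_emission *emitted, size_t nemitted,
                        int input_num, unsigned long in_id)
{
  size_t i, j;

  assert (inputs != NULL && emitted != NULL);

  for (i = 0; i < ninputs; i++)
    {
      if (inputs[i].input_num != input_num || inputs[i].in_id != in_id)
        continue;

      for (j = 0; j < nemitted; j++)
        if (strcmp (emitted[j].hash, inputs[i].hash) == 0)
          return emitted[j].out_id;

      return 0;
    }
  return 0;
}
```

Two input types with the same hash (for instance `int` in two different CUs) resolve to the same output ID, which is the property the variable and symbol population code relies on.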
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dict_t) <ctf_link_type_mapping>: No longer used
by the nondeduplicating linker.
(ctf_add_type_mapping): Removed, now static.
(ctf_type_mapping): Likewise.
(ctf_dedup_type_mapping): New.
(ctf_dedup_t) <cd_input_nums>: New.
* ctf-dedup.c (ctf_dedup_init): Populate it.
(ctf_dedup_fini): Free it again. Emphasise that this has to be
the last thing called.
(ctf_dedup): Populate it.
(ctf_dedup_populate_type_mapping): Removed.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): No longer call it. No longer call
ctf_dedup_fini either.
(ctf_dedup_type_mapping): New.
* ctf-link.c (ctf_unnamed_cuname): New.
(ctf_create_per_cu): Arguments must be non-null now.
(ctf_in_member_cb_arg): Removed.
(ctf_link): No longer populate it. No longer discard the
mapping table.
(ctf_link_deduplicating_one_symtypetab): Use
ctf_dedup_type_mapping, not ctf_type_mapping. Use
ctf_unnamed_cuname.
(ctf_link_one_variable): Likewise. Pass in args individually: no
longer a ctf_variable_iter callback.
(empty_link_type_mapping): Removed.
(ctf_link_deduplicating_variables): Use ctf_variable_next, not
ctf_variable_iter. No longer pack arguments to
ctf_link_one_variable into a struct.
(ctf_link_deduplicating_per_cu): Call ctf_dedup_fini once
all link phases are done.
(ctf_link_deduplicating): Likewise.
(ctf_link_intern_extern_string): Improve comment.
(ctf_add_type_mapping): Migrate...
(ctf_type_mapping): ... these functions...
* ctf-create.c (ctf_add_type_mapping): ... here...
(ctf_type_mapping): ... and make static, for the sole use of
ctf_add_type.
2021-03-02 23:10:05 +08:00
   If ctf_link is also called, it must be called first if you want the new CTF
   files ctf_link can create to get their strings dedupped against the ELF
   strtab properly.  */
int
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
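The "rename, but keep a compatibility function" pattern described above can be sketched as follows. This is an illustrative toy, not the libctf source: all toy_* names are hypothetical, and the close functions return the remaining reference count purely so the behaviour is observable (the real close functions need not return anything).

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an opaque, reference-counted dictionary.  */
struct toy_dict { int refcnt; };
typedef struct toy_dict toy_dict_t;

/* Create a dict holding one reference, or return NULL on failure.  */
toy_dict_t *
toy_dict_open (void)
{
  toy_dict_t *d = calloc (1, sizeof (*d));

  if (d != NULL)
    d->refcnt = 1;
  return d;
}

/* New name: drop one reference, freeing the dict when none remain.
   Returns the number of references still outstanding.  */
int
toy_dict_close (toy_dict_t *d)
{
  if (d == NULL)
    return 0;

  assert (d->refcnt > 0);
  if (--d->refcnt > 0)
    return d->refcnt;

  free (d);
  return 0;
}

/* Old name: a trivial wrapper, so that existing callers keep linking
   against the same symbol and the ABI stays unchanged.  */
int
toy_file_close (toy_dict_t *d)
{
  return toy_dict_close (d);
}
```

Because the wrapper simply forwards to the new function, old and new callers can be mixed freely on the same object.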
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_add_strtab (ctf_dict_t *fp, ctf_link_strtab_string_f *add_string,
                     void *arg)
{
  const char *str;
  uint32_t offset;
  int err = 0;
  if (fp->ctf_stypes > 0)
    return ctf_set_errno (fp, ECTF_RDONLY);
|
|
|
|
|
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
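The ctf_link_add_strtab step described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not libctf's implementation: struct toy_atom, toy_add_external and toy_internal_strtab_len are hypothetical names invented for illustration. It shows the idea that a string the ELF strtab already holds is redirected to an external offset and so no longer takes up room in the internal CTF strtab.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy atom: one string that the CTF output refers to.  */
struct toy_atom
{
  const char *str;
  int external;      /* Nonzero once the ELF strtab is known to hold it.  */
  uint32_t offset;   /* Offset in the ELF strtab, if external.  */
};

/* Record that STR lives in the ELF strtab at OFFSET.  Returns the number
   of atoms redirected (zero if the CTF output never uses STR).  */
size_t
toy_add_external (struct toy_atom *atoms, size_t n, const char *str,
                  uint32_t offset)
{
  size_t i, hits = 0;

  for (i = 0; i < n; i++)
    if (strcmp (atoms[i].str, str) == 0)
      {
        atoms[i].external = 1;
        atoms[i].offset = offset;
        hits++;
      }
  return hits;
}

/* Bytes the internal strtab still needs: one leading null string, plus
   every atom that was not redirected to the ELF strtab.  */
size_t
toy_internal_strtab_len (const struct toy_atom *atoms, size_t n)
{
  size_t i, len = 1;

  for (i = 0; i < n; i++)
    if (!atoms[i].external)
      len += strlen (atoms[i].str) + 1;
  return len;
}
```

In this model, the linker's callback would call toy_add_external once per ELF strtab string; strings the callback never mentions stay internal and are written out as before.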
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
  while ((str = add_string (&offset, arg)) != NULL)
    {
      ctf_link_out_string_cb_arg_t iter_arg = { str, offset, 0 };
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently, a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
(say, if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
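The forward-promotion problem above hinges on one property: the promoted type keeps its ID. A minimal sketch of that property follows, using a hypothetical toy type table (struct toy_types, toy_add_forward and toy_promote_to_struct are invented names, not libctf API). Anything that sized the entry while it was still a forward is now out of date, which is the confusion the old ctf_update ran into.

```c
#include <stddef.h>

#define TOY_MAX_TYPES 32

enum toy_kind { TOY_K_UNKNOWN, TOY_K_FORWARD, TOY_K_STRUCT };

struct toy_type
{
  enum toy_kind kind;
  size_t nmembers;
};

/* Hypothetical type table.  Type IDs start at 1; slot 0 is unused.  */
struct toy_types
{
  struct toy_type types[TOY_MAX_TYPES];
  size_t typemax;
};

/* Add a forward.  Returns its new type ID, or 0 if the table is full.  */
size_t
toy_add_forward (struct toy_types *t)
{
  if (t->typemax + 1 >= TOY_MAX_TYPES)
    return 0;
  t->typemax++;
  t->types[t->typemax].kind = TOY_K_FORWARD;
  t->types[t->typemax].nmembers = 0;
  return t->typemax;
}

/* Promote forward ID to a struct with NMEMBERS members, in place.
   Returns the same ID on success, or 0 if ID is not a forward.  */
size_t
toy_promote_to_struct (struct toy_types *t, size_t id, size_t nmembers)
{
  if (id == 0 || id > t->typemax || t->types[id].kind != TOY_K_FORWARD)
    return 0;
  t->types[id].kind = TOY_K_STRUCT;
  t->types[id].nmembers = nmembers;
  return id;
}
```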
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in read-only dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtbyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
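The paired name table described above can be sketched as follows. This is a toy model under stated assumptions: struct toy_names stands in for ctf_names_t, plain arrays stand in for ctf_hash_t and ctf_dynhash_t, and the function names and signatures are hypothetical. The point it illustrates is that the choice between the static and dynamic half is made in exactly one place.

```c
#include <stddef.h>
#include <string.h>

#define TOY_DYN_MAX 16

struct toy_entry
{
  const char *name;
  unsigned long type;
};

/* Hypothetical stand-in for ctf_names_t: a fixed table populated at open
   time, plus a growable table used only by writable dictionaries.  */
struct toy_names
{
  const struct toy_entry *fixed;
  size_t nfixed;
  struct toy_entry dynamic[TOY_DYN_MAX];
  size_t ndynamic;
};

struct toy_dict
{
  int writable;
  struct toy_names structs;
};

/* Add NAME -> TYPE.  Only writable dictionaries accept additions.
   Returns 0 on success, -1 on failure.  */
int
toy_names_add (struct toy_dict *d, struct toy_names *np, const char *name,
               unsigned long type)
{
  if (!d->writable || np->ndynamic == TOY_DYN_MAX)
    return -1;
  np->dynamic[np->ndynamic].name = name;
  np->dynamic[np->ndynamic].type = type;
  np->ndynamic++;
  return 0;
}

/* The one place that decides which half of the pair to consult.
   Returns the type, or 0 if NAME is not present.  */
unsigned long
toy_lookup_by_rawhash (const struct toy_dict *d, const struct toy_names *np,
                       const char *name)
{
  const struct toy_entry *tab = d->writable ? np->dynamic : np->fixed;
  size_t n = d->writable ? np->ndynamic : np->nfixed;
  size_t i;

  for (i = 0; i < n; i++)
    if (strcmp (tab[i].name, name) == 0)
      return tab[i].type;
  return 0;
}
```

Because callers never pick a half themselves, a dictionary in this model can be read-only or writable without any lookup site changing.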
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_ctl_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
      if (!ctf_str_add_external (fp, str, offset))
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
        err = ENOMEM;

      ctf_dynhash_iter (fp->ctf_link_outputs, ctf_link_intern_extern_string,
                        &iter_arg);
      if (iter_arg.err)
        err = iter_arg.err;
    }
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
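The new function info layout and its header flag can be sketched like this. It is a minimal illustration, not libctf code: toy_func_type and TOY_F_NEWFUNCINFO are hypothetical, the flag value is illustrative only, and treating type ID 0 as "no type recorded" is an assumption of this sketch. It models an unindexed section, where the symbol index is used directly as the array index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CTF_F_NEWFUNCINFO header flag.  The value
   is illustrative, not taken from ctf.h.  */
#define TOY_F_NEWFUNCINFO 0x2u

/* Return the type ID recorded for function symbol SYMIDX, or 0 if there
   is none.  FUNCS models an unindexed function info section: one uint32_t
   type ID per symbol, in symbol table order.  */
uint32_t
toy_func_type (uint32_t flags, const uint32_t *funcs, size_t nfuncs,
               size_t symidx)
{
  /* Without the flag, the section is in the never-emitted old format,
     so it is disregarded entirely.  */
  if (!(flags & TOY_F_NEWFUNCINFO))
    return 0;
  if (symidx >= nfuncs)
    return 0;
  return funcs[symidx];
}
```

Since each entry is just a type ID, two functions with the same prototype can share a single CTF_K_FUNCTION type in the type section.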
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in an archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
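An indexed lookup of the kind described above might look roughly like this. The sketch makes several assumptions: struct toy_idx_entry pairs each name with its type for brevity (the real format keeps an index section alongside the symtypetab rather than a combined array), the sorted flag stands in for CTF_F_IDXSORTED, and toy_lookup_indexed is a hypothetical name. It sorts an unsorted index once, then binary-searches it by symbol name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: a symbol name and the type ID recorded for
   it.  Paired here for brevity.  */
struct toy_idx_entry
{
  const char *name;
  uint32_t type;
};

static int
toy_idx_cmp (const void *a, const void *b)
{
  const struct toy_idx_entry *ea = a;
  const struct toy_idx_entry *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Look SYMNAME up in IDX.  If *SORTED is zero (as for a compiler-emitted
   index lacking the sorted flag), sort the index first and remember that
   this has been done.  Returns the type ID, or 0 if not found.  */
uint32_t
toy_lookup_indexed (struct toy_idx_entry *idx, size_t n, int *sorted,
                    const char *symname)
{
  struct toy_idx_entry key;
  const struct toy_idx_entry *hit;

  if (!*sorted)
    {
      qsort (idx, n, sizeof (*idx), toy_idx_cmp);
      *sorted = 1;
    }

  key.name = symname;
  key.type = 0;
  hit = bsearch (&key, idx, n, sizeof (*idx), toy_idx_cmp);
  return hit != NULL ? hit->type : 0;
}
```

A caller holding only a symbol index would first map it to a name via the symbol table, then pass that name here, which matches the two-step lookup the text attributes to ctf_lookup_by_symbol.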
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
  if (err)
    ctf_set_errno (fp, err);
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
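The ctf_link_add_strtab step in the list above can be sketched in isolation. This is a toy model under stated assumptions, not libctf's implementation: sketch_str_t, sketch_add_external and sketch_internal_strtab_size are hypothetical names, and the "external" field stands in for however libctf records that a string is satisfied by the ELF strtab.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one string known to the CTF output.  */
typedef struct sketch_str
{
  const char *name;     /* String content.  */
  int external;         /* Nonzero if satisfied by the ELF strtab.  */
  size_t ext_offset;    /* Offset in the ELF strtab, if external.  */
} sketch_str_t;

/* Called once per ELF strtab entry, in the way the linker's callback
   would feed entries in.  Any CTF string with the same content is marked
   external, so it need not be written to the internal CTF strtab.
   Returns the number of strings newly marked.  */
size_t
sketch_add_external (sketch_str_t *strs, size_t nstrs,
                     const char *elf_str, size_t elf_offset)
{
  size_t i, marked = 0;

  assert (strs != NULL || nstrs == 0);
  for (i = 0; i < nstrs; i++)
    if (!strs[i].external && strcmp (strs[i].name, elf_str) == 0)
      {
        strs[i].external = 1;
        strs[i].ext_offset = elf_offset;
        marked++;
      }
  return marked;
}

/* Size of the internal strtab in this model: only strings that stayed
   internal are counted, each with its terminating NUL.  */
size_t
sketch_internal_strtab_size (const sketch_str_t *strs, size_t nstrs)
{
  size_t i, size = 0;

  for (i = 0; i < nstrs; i++)
    if (!strs[i].external)
      size += strlen (strs[i].name) + 1;
  return size;
}
```

Feeding in an ELF string "main" at some offset marks the matching CTF string external; a string that appears only in the CTF input is left alone and still counts towards the internal strtab size.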
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
return -err;
}
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
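A minimal sketch of the layout just described, assuming a toy type table: the function info section is modelled as a plain array of uint32_t type IDs, and a lookup checks that the ID names a function type. All names here (sketch_dict_t, sketch_func_type, SKETCH_K_*) are hypothetical, and treating type ID 0 as "no type" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kind tags; the real CTF_K_* values live in ctf.h.  */
enum sketch_kind { SKETCH_K_INTEGER, SKETCH_K_FUNCTION };

/* Hypothetical dict: a kind per type ID, plus a function info section
   holding one type ID per function symbol.  */
typedef struct sketch_dict
{
  const enum sketch_kind *kinds;   /* Indexed by type ID.  */
  size_t ntypes;
  const uint32_t *funcinfo;        /* One type ID per function symbol.  */
  size_t nfuncs;
} sketch_dict_t;

/* Return the prototype type ID for the Nth function entry, or 0 if the
   entry is out of range, holds no type, or does not name a type of
   function kind.  */
uint32_t
sketch_func_type (const sketch_dict_t *d, size_t n)
{
  uint32_t id;

  assert (d != NULL);
  if (n >= d->nfuncs)
    return 0;
  id = d->funcinfo[n];
  if (id == 0 || id >= d->ntypes)
    return 0;
  return d->kinds[id] == SKETCH_K_FUNCTION ? id : 0;
}
```

Because each entry is only a type ID, two function symbols with the same prototype can share one entry in the type section, which is the deduplication the new format allows.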
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
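The indexed lookup described above can be sketched as follows. This is an illustration only: sketch_idx_ent_t and sketch_lookup_indexed are hypothetical names, the index and the symtypetab are folded into one array of name/type pairs for brevity, and the "sorted" flag stands in for the CTF_F_IDXSORTED header flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical pairing of an index entry (a symbol name) with the
   symtypetab entry (a type ID) at the same position.  */
typedef struct sketch_idx_ent
{
  const char *name;
  uint32_t type;
} sketch_idx_ent_t;

/* Order entries by symbol name, for both qsort and bsearch.  */
static int
sketch_idx_cmp (const void *a, const void *b)
{
  const sketch_idx_ent_t *ea = a;
  const sketch_idx_ent_t *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Look up NAME.  If the section is not flagged as sorted, sort it by
   name once and set the flag, then binary-search.  Returns the type ID,
   or 0 if the name is absent.  */
uint32_t
sketch_lookup_indexed (sketch_idx_ent_t *ents, size_t n, int *sorted,
                       const char *name)
{
  sketch_idx_ent_t key;
  sketch_idx_ent_t *hit;

  assert (sorted != NULL);
  if (n == 0)
    return 0;
  if (!*sorted)
    {
      qsort (ents, n, sizeof (*ents), sketch_idx_cmp);
      *sorted = 1;
    }
  key.name = name;
  key.type = 0;
  hit = bsearch (&key, ents, n, sizeof (*ents), sketch_idx_cmp);
  return hit ? hit->type : 0;
}
```

The sort happens at most once per section, so a compiler that emits an unsorted index costs the reader one sort and nothing more.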
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
/* Inform the ctf-link machinery of a new symbol in the target symbol table
(which must be some symtab that is not usually stripped, and which
is in agreement with ctf_bfdopen_ctfsect). May be called either before or
libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
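The read-only range can be modelled with a single counter. The sketch below is not libctf code: sketch_rw_dict_t, sketch_add_type, sketch_add_member and SKETCH_E_RDONLY are hypothetical stand-ins, with the stypes field playing the role the commit gives to ctf_stypes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_E_RDONLY 1    /* Stand-in for ECTF_RDONLY.  */
#define SKETCH_MAX_TYPES 16  /* Arbitrary capacity for the sketch.  */

typedef struct sketch_rw_dict
{
  size_t stypes;   /* Types already present when the dict was opened.  */
  size_t ntypes;   /* Total types; IDs run from 1 to ntypes.  */
  int nmembers[SKETCH_MAX_TYPES + 1];
} sketch_rw_dict_t;

/* Adding a new type is allowed whether or not the dict was read in.
   Returns the new type ID, or 0 if the toy table is full.  */
size_t
sketch_add_type (sketch_rw_dict_t *d)
{
  assert (d != NULL);
  if (d->ntypes >= SKETCH_MAX_TYPES)
    return 0;
  d->nmembers[++d->ntypes] = 0;
  return d->ntypes;
}

/* Modifying a type is refused if its ID lies in the range that was
   present at open time.  Returns 0 on success, SKETCH_E_RDONLY for a
   static type, -1 for an ID that does not exist.  */
int
sketch_add_member (sketch_rw_dict_t *d, size_t id)
{
  assert (d != NULL);
  if (id == 0 || id > d->ntypes)
    return -1;
  if (id <= d->stypes)
    return SKETCH_E_RDONLY;
  d->nmembers[id]++;
  return 0;
}
```

A dict opened with two types refuses sketch_add_member on IDs 1 and 2 but accepts it on any ID returned by a later sketch_add_type.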
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of CTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
2023-12-20 00:58:19 +08:00
after ctf_link_add_strtab. As with that function, must be called on a dict which
has not yet been serialized. */
int
bfd, include, ld, binutils, libctf: CTF should use the dynstr/sym
This is embarrassing.
The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names. So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized. The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.
Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy. The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.
Some bits might be contentious. The ctf_new_dynstr callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen. This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym. We rename the
'apply_strsym' hooks to 'acquire_strings' instead, because they no longer
have anything to do with symbols.
There are some API changes to pieces of API which are technically public
but actually totally unused by anything and/or unused by anything but ld
so they can change freely: the ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in. ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).
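The division of labour described above (symbols arrive one at a time carrying only a name offset, the strtab arrives separately, and the shuffle step joins the two) can be sketched like this. Everything here is a hypothetical model: sketch_sym_t and sketch_shuffle are not libctf names, and the real ctf_link_sym_t carries more fields than the two shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-flight symbol: a symbol index plus the offset of its
   name in the final string table.  */
typedef struct sketch_sym
{
  size_t symidx;
  size_t nameidx;
} sketch_sym_t;

/* Resolve each in-flight symbol's name offset against STRTAB (a blob of
   NUL-terminated strings, STRTAB_LEN bytes long) and store a pointer to
   the name in NAMES, indexed by symbol index.  Returns 0 on success, or
   -1 if any symbol index or name offset is out of range.  */
int
sketch_shuffle (const sketch_sym_t *syms, size_t nsyms,
                const char *strtab, size_t strtab_len,
                const char **names, size_t nnames)
{
  size_t i;

  assert (strtab != NULL && names != NULL);
  for (i = 0; i < nsyms; i++)
    {
      if (syms[i].symidx >= nnames || syms[i].nameidx >= strtab_len)
        return -1;
      names[syms[i].symidx] = strtab + syms[i].nameidx;
    }
  return 0;
}
```

The symbols may be added in any order relative to the strtab, since nothing is resolved until the shuffle step runs with both in hand.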
Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)
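The flag check amounts to a two-way choice of section names. A minimal sketch, assuming an illustrative bit value: SKETCH_F_DYNSTR and sketch_pick_sections are hypothetical, and the real CTF_F_DYNSTR definition should be taken from ctf.h rather than from here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bit only; see ctf.h for the real CTF_F_DYNSTR.  */
#define SKETCH_F_DYNSTR 0x8

typedef struct sketch_sects
{
  const char *symtab;   /* Section to read symbols from.  */
  const char *strtab;   /* Section to read external strings from.  */
} sketch_sects_t;

/* Pick the ELF sections a dict's external references point into: the
   dynamic ones if the flag is set, the strippable ones otherwise (so
   that dicts written before the flag existed keep working).  */
sketch_sects_t
sketch_pick_sections (uint32_t flags)
{
  sketch_sects_t s;

  if (flags & SKETCH_F_DYNSTR)
    {
      s.symtab = ".dynsym";
      s.strtab = ".dynstr";
    }
  else
    {
      s.symtab = ".symtab";
      s.strtab = ".strtab";
    }
  assert (s.symtab != NULL && s.strtab != NULL);
  return s;
}
```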
Tests forthcoming in a later commit in this series.
bfd/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* elflink.c (elf_finalize_dynstr): Call examine_strtab after
dynstr finalization.
(elf_link_swap_symbols_out): Don't call it here. Call
ctf_new_symbol before swap_symbol_out.
(elf_link_output_extsym): Call ctf_new_dynsym before
swap_symbol_out.
(bfd_elf_final_link): Likewise.
* elf.c (swap_out_syms): Pass in bfd_link_info. Call
ctf_new_symbol before swap_symbol_out.
(_bfd_elf_compute_section_file_positions): Adjust.
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
.symtab and .strtab.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* bfdlink.h (struct elf_sym_strtab): Replace with...
(struct elf_internal_sym): ... this.
(struct bfd_link_callbacks) <examine_strtab>: Take only a
symstrtab argument.
<ctf_new_symbol>: New.
<ctf_new_dynsym>: Likewise.
* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
<st_nameidx>: Likewise.
<st_nameidx_set>: Likewise.
(ctf_link_iter_symbol_f): Removed.
(ctf_link_shuffle_syms): Remove most parameters, just takes a
ctf_dict_t now.
(ctf_link_add_linker_symbol): New, split from
ctf_link_shuffle_syms.
* ctf.h (CTF_F_DYNSTR): New.
(CTF_F_MAX): Adjust.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
<syms>: Remove.
<symcount>: Remove.
<symstrtab>: Rename to...
<strtab>: ... this.
(ldelf_ctf_strtab_iter_cb): Adjust.
(ldelf_ctf_symbols_iter_cb): Remove.
(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
symbol.
(ldelf_examine_strtab_for_ctf): Rename to...
(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
portion and not symbols.
* ldelfgen.h: Adjust declarations accordingly.
* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
(ldemul_acquire_strings_for_ctf): ... this.
(ldemul_new_dynsym_for_ctf): New.
* ldemul.h: Adjust declarations accordingly.
* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
(ldlang_ctf_acquire_strings): ... this.
(ldlang_ctf_new_dynsym): New.
(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
the actual symbol shuffle.
* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_shuffle_syms): Adjust.
(ctf_link_add_linker_symbol): New, unimplemented stub.
* libctf.ver: Add it.
* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
dicts.
* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
symtab/strtab if not present, dynsym/dynstr otherwise.
* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
some arbitrary member of a CTF archive.
* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
2020-11-20 21:34:04 +08:00
ctf_link_add_linker_symbol (ctf_dict_t *fp, ctf_link_sym_t *sym)
{
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
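As a rough illustration of the layout just described, the sketch below treats the function info section as a flat array of uint32_t type IDs, one per function symbol, and checks that each ID names a function type. All names here (sketch_kind, sketch_type, sketch_func_type) are hypothetical stand-ins rather than libctf definitions, and treating ID 0 as a pad meaning "no type known" is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for this sketch; not libctf's own definitions.  */
enum sketch_kind { SKETCH_K_INTEGER, SKETCH_K_FUNCTION };

struct sketch_type
{
  enum sketch_kind kind;
};

/* Return the type ID recorded for the N-th function symbol.  The section
   is a flat array of type IDs.  ID 0 is assumed to be a pad.  Returns 0 if
   there is no entry, the entry is a pad, or the type is not a function.  */
static uint32_t
sketch_func_type (const uint32_t *funcinfo, size_t nfuncs,
                  const struct sketch_type *types, size_t ntypes, size_t n)
{
  uint32_t id;

  assert (funcinfo != NULL && types != NULL);

  if (n >= nfuncs)
    return 0;

  id = funcinfo[n];
  if (id == 0 || id >= ntypes)
    return 0;

  return types[id].kind == SKETCH_K_FUNCTION ? id : 0;
}
```

Because each entry is only a type ID, two functions with the same prototype can share one entry in the type section, which is what makes deduplication possible.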
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
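A minimal sketch of an indexed lookup, assuming the index has already been sorted by name: names[i] labels types[i], and a binary search over the names finds the type. The real index stores strtab offsets rather than string pointers; plain pointers are used here only to keep the example short. The function names are hypothetical, and returning 0 for "not found" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Compare a key name against one entry of the sorted name index.  */
static int
sketch_name_cmp (const void *key, const void *elem)
{
  return strcmp ((const char *) key, *(const char *const *) elem);
}

/* NAMES is sorted by strcmp order and NAMES[i] labels TYPES[i].  Return
   the type ID recorded for NAME, or 0 if NAME is not in this index.  */
static uint32_t
sketch_lookup_indexed (const char *const *names, const uint32_t *types,
                       size_t n, const char *name)
{
  const char *const *hit;

  assert (name != NULL);

  if (n == 0)
    return 0;

  hit = bsearch (name, names, n, sizeof (names[0]), sketch_name_cmp);
  return hit != NULL ? types[hit - names] : 0;
}
```

An index that arrives unsorted (for example, straight from the compiler) would need one sort before the first lookup; after that every lookup is logarithmic in the number of entries.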
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
  ctf_in_flight_dynsym_t *cid;

  /* Cheat a little: if there is already an ENOMEM error code recorded against
     this dict, we shouldn't even try to add symbols because there will be no
     memory to do so: probably we failed to add some previous symbol.  This
     makes out-of-memory exits 'sticky' across calls to this function, so the
     caller doesn't need to worry about error conditions.  */

  if (ctf_errno (fp) == ENOMEM)
    return -ENOMEM;				/* errno is set for us.  */
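The 'sticky' out-of-memory behaviour described in the comment above can be sketched in isolation as follows. This is not the libctf code: sketch_dict, sketch_add and the replaceable sketch_alloc hook are hypothetical names, and the hook exists only so that the failure path can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct sketch_node
{
  int value;
  struct sketch_node *next;
};

struct sketch_dict
{
  int err;			/* Last error recorded against the dict.  */
  struct sketch_node *head;
  size_t count;
};

/* Allocator hook (hypothetical), so a failure can be simulated.  */
static void *(*sketch_alloc) (size_t) = malloc;

static void *
sketch_fail_alloc (size_t size)
{
  (void) size;
  return NULL;
}

/* Add VALUE to the dict.  Once ENOMEM has been recorded, every later call
   fails immediately without trying to allocate again.  */
static int
sketch_add (struct sketch_dict *d, int value)
{
  struct sketch_node *n;

  assert (d != NULL);

  if (d->err == ENOMEM)
    return -ENOMEM;

  if ((n = sketch_alloc (sizeof (*n))) == NULL)
    {
      d->err = ENOMEM;
      return -ENOMEM;
    }

  n->value = value;
  n->next = d->head;
  d->head = n;
  d->count++;
  return 0;
}
```

A caller feeding in thousands of symbols can then ignore the return value of each call and check the recorded error once at the end.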
libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
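The "read-only range" idea can be sketched as a single counter: every type index at or below the number of types present at open time is static, and everything above it is freely modifiable. The names below are hypothetical, SKETCH_E_RDONLY merely stands in for ECTF_RDONLY, and the sketch assumes type indexes start at 1.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_E_RDONLY 1	/* Stand-in for ECTF_RDONLY.  */

struct sketch_dict
{
  uint32_t stypes;		/* Types present when the dict was opened.  */
  uint32_t typemax;		/* Highest type index currently in use.  */
};

/* Types in the range 1..stypes came from the opened dict.  */
static int
sketch_type_is_static (const struct sketch_dict *d, uint32_t idx)
{
  return idx <= d->stypes;
}

/* Adding a type is always allowed: it lands above the static range.  */
static uint32_t
sketch_add_type (struct sketch_dict *d)
{
  return ++d->typemax;
}

/* Modifying a type is allowed only outside the static range.  */
static int
sketch_modify_type (const struct sketch_dict *d, uint32_t idx)
{
  assert (idx >= 1 && idx <= d->typemax);
  return sketch_type_is_static (d, idx) ? SKETCH_E_RDONLY : 0;
}
```

A freshly created dict has a static count of zero, so nothing in it is read-only and the same check covers both cases.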
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
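The naming rule in the second item above might be sketched like this: a new type name is rejected only if it collides with a name from the opened (static) portion, while repeats among newly added names are left alone. The function name and the error value are hypothetical, and a linear scan replaces whatever hashing the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_E_DUPLICATE 1	/* Stand-in error code for this sketch.  */

/* Return SKETCH_E_DUPLICATE if NAME is already used by a type in the
   static portion of the dict, otherwise 0.  Names of types added after
   the dict was opened are deliberately not consulted.  */
static int
sketch_check_new_name (const char *const *static_names, size_t nstatic,
                       const char *name)
{
  size_t i;

  assert (name != NULL);

  for (i = 0; i < nstatic; i++)
    if (strcmp (static_names[i], name) == 0)
      return SKETCH_E_DUPLICATE;

  return 0;
}
```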
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of LCTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
2023-12-20 00:58:19 +08:00
  if (fp->ctf_stypes > 0)
    return ctf_set_errno (fp, ECTF_RDONLY);
libctf: symbol type linking support
2020-11-20 21:34:04 +08:00
  if (ctf_symtab_skippable (sym))
    return 0;

  if (sym->st_type != STT_OBJECT && sym->st_type != STT_FUNC)
    return 0;

  /* Add the symbol to the in-flight list.  */

  if ((cid = malloc (sizeof (ctf_in_flight_dynsym_t))) == NULL)
    goto oom;

  cid->cid_sym = *sym;
  ctf_list_append (&fp->ctf_in_flight_dynsyms, cid);
bfd, include, ld, binutils, libctf: CTF should use the dynstr/sym
This is embarrassing.
The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names. So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized. The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.
Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy. The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.
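As a minimal sketch of what resolving such an offset involves, the following assumes a symbol that carries only a name offset and a symbol index, plus a string table held as a flat byte buffer. The struct demo_sym and the function demo_sym_name are hypothetical names for illustration and are not part of libctf or bfd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a symbol handed over one at a time: only the
   fields this sketch needs.  */
struct demo_sym
{
  uint32_t name_off;	/* Offset of the name in the string table.  */
  uint32_t symidx;	/* Index of the symbol in the symbol table.  */
};

/* Resolve SYM's name against a string table of STRTAB_LEN bytes.
   Returns NULL if the offset is out of range or the string is not
   NUL-terminated within the table.  */
static const char *
demo_sym_name (const char *strtab, size_t strtab_len,
	       const struct demo_sym *sym)
{
  assert (strtab != NULL && sym != NULL);

  if (sym->name_off >= strtab_len)
    return NULL;
  if (memchr (strtab + sym->name_off, '\0',
	      strtab_len - sym->name_off) == NULL)
    return NULL;
  return strtab + sym->name_off;
}
```

The bounds checks matter because the offsets arrive before the consumer has any other way to validate them.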
Some bits might be contentious. The ctf_new_dynsym callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen. This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym. We rename the
'apply_strsym' hooks to 'acquire_strings' instead, because they no longer
have anything to do with symbols.
There are some changes to pieces of API which are technically public but
unused by anything other than ld, so they can change freely: the
ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in. ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).
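The ordering described here (symbols first, strings next, then one pass that applies both) can be modelled in a few lines. This is only a sketch of the deferred-resolution idea: struct demo_linker_state, demo_add_symbol and demo_shuffle are hypothetical and do not reproduce the libctf functions or their signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_SYMS 16

/* Hypothetical state: symbols wait "in flight" until names can be
   resolved.  */
struct demo_linker_state
{
  uint32_t pending_off[DEMO_MAX_SYMS];	/* Name offsets, in arrival order.  */
  uint32_t pending_idx[DEMO_MAX_SYMS];	/* Matching symbol indexes.  */
  size_t npending;
  const char *names[DEMO_MAX_SYMS];	/* symidx -> name, set by the shuffle.  */
};

/* Step 1: remember a symbol.  Its name cannot be resolved yet.  */
static int
demo_add_symbol (struct demo_linker_state *st, uint32_t symidx,
		 uint32_t name_off)
{
  assert (st != NULL);
  if (st->npending >= DEMO_MAX_SYMS || symidx >= DEMO_MAX_SYMS)
    return -1;
  st->pending_off[st->npending] = name_off;
  st->pending_idx[st->npending] = symidx;
  st->npending++;
  return 0;
}

/* Step 2: once the string table is known, turn the in-flight list into
   a symidx -> name mapping.  Assumes STRTAB ends with a NUL byte.  */
static int
demo_shuffle (struct demo_linker_state *st, const char *strtab, size_t len)
{
  size_t i;

  assert (st != NULL && strtab != NULL);
  if (len == 0 || strtab[len - 1] != '\0')
    return -1;
  for (i = 0; i < st->npending; i++)
    {
      if (st->pending_off[i] >= len)
	return -1;
      st->names[st->pending_idx[i]] = strtab + st->pending_off[i];
    }
  st->npending = 0;
  return 0;
}
```

Symbols may arrive in any order, since the mapping is keyed by symbol index rather than by arrival position.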
Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)
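The compatibility rule above amounts to a single branch on the preamble flags. A minimal sketch, assuming a flag bit named DEMO_F_DYNSTR whose value is chosen for illustration only (the real CTF_F_DYNSTR is defined in ctf.h) and a hypothetical helper demo_pick_sections:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bit for this sketch only.  */
#define DEMO_F_DYNSTR 0x8u

/* The pair of ELF sections a dict's external references point into.  */
struct demo_sections
{
  const char *symtab;
  const char *strtab;
};

/* Old dicts (flag clear) keep using .symtab/.strtab so their external
   strings stay valid; new dicts (flag set) use .dynsym/.dynstr.  */
static struct demo_sections
demo_pick_sections (uint32_t flags)
{
  struct demo_sections s;

  if (flags & DEMO_F_DYNSTR)
    {
      s.symtab = ".dynsym";
      s.strtab = ".dynstr";
    }
  else
    {
      s.symtab = ".symtab";
      s.strtab = ".strtab";
    }
  assert (s.symtab != NULL && s.strtab != NULL);
  return s;
}
```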
Tests forthcoming in a later commit in this series.
bfd/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* elflink.c (elf_finalize_dynstr): Call examine_strtab after
dynstr finalization.
(elf_link_swap_symbols_out): Don't call it here. Call
ctf_new_symbol before swap_symbol_out.
(elf_link_output_extsym): Call ctf_new_dynsym before
swap_symbol_out.
(bfd_elf_final_link): Likewise.
* elf.c (swap_out_syms): Pass in bfd_link_info. Call
ctf_new_symbol before swap_symbol_out.
(_bfd_elf_compute_section_file_positions): Adjust.
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
.symtab and .strtab.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* bfdlink.h (struct elf_sym_strtab): Replace with...
(struct elf_internal_sym): ... this.
(struct bfd_link_callbacks) <examine_strtab>: Take only a
symstrtab argument.
<ctf_new_symbol>: New.
<ctf_new_dynsym>: Likewise.
* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
<st_nameidx>: Likewise.
<st_nameidx_set>: Likewise.
(ctf_link_iter_symbol_f): Removed.
(ctf_link_shuffle_syms): Remove most parameters, just takes a
ctf_dict_t now.
(ctf_link_add_linker_symbol): New, split from
ctf_link_shuffle_syms.
* ctf.h (CTF_F_DYNSTR): New.
(CTF_F_MAX): Adjust.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
<syms>: Remove.
<symcount>: Remove.
<symstrtab>: Rename to...
<strtab>: ... this.
(ldelf_ctf_strtab_iter_cb): Adjust.
(ldelf_ctf_symbols_iter_cb): Remove.
(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
symbol.
(ldelf_examine_strtab_for_ctf): Rename to...
(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
portion and not symbols.
* ldelfgen.h: Adjust declarations accordingly.
* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
(ldemul_acquire_strings_for_ctf): ... this.
(ldemul_new_dynsym_for_ctf): New.
* ldemul.h: Adjust declarations accordingly.
* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
(ldlang_ctf_acquire_strings): ... this.
(ldlang_ctf_new_dynsym): New.
(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
the actual symbol shuffle.
* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_shuffle_syms): Adjust.
(ctf_link_add_linker_symbol): New, unimplemented stub.
* libctf.ver: Add it.
* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
dicts.
* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
symtab/strtab if not present, dynsym/dynstr otherwise.
* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
some arbitrary member of a CTF archive.
* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
2020-11-20 21:34:04 +08:00

  return 0;

libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the data object section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
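To illustrate the new layout, here is a minimal sketch of a lookup in a section that is just an array of uint32_t type IDs. It assumes one slot per function symbol and that the value 0 marks a slot with no type; demo_func_type and DEMO_NO_TYPE are hypothetical names, not libctf API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed marker for "no type recorded in this slot".  */
#define DEMO_NO_TYPE 0u

/* Return the type ID stored in SLOT of a function info section holding
   NSLOTS entries, or DEMO_NO_TYPE if SLOT is out of range.  The ID
   names a function type held in the type section, so two functions
   with the same prototype can share one ID.  */
static uint32_t
demo_func_type (const uint32_t *funcinfo, size_t nslots, size_t slot)
{
  assert (funcinfo != NULL);
  if (slot >= nslots)
    return DEMO_NO_TYPE;
  return funcinfo[slot];
}
```

Because every entry has a fixed width, no per-entry parsing of an info word or argument list is needed to find a slot.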
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
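The indexed lookup and the sort-on-demand behaviour can be sketched as follows. This simplification pairs each name with its type in one struct, whereas the text describes a separate index section alongside the type entries; struct demo_idx_entry, demo_sort_index and demo_lookup are hypothetical names, and the "sorted" integer stands in for the header flag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One indexed entry: a symbol name and the type ID recorded for it.  */
struct demo_idx_entry
{
  const char *name;
  uint32_t type;
};

/* Order entries by name, for both qsort and bsearch.  */
static int
demo_cmp (const void *a, const void *b)
{
  const struct demo_idx_entry *ea = a;
  const struct demo_idx_entry *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Sort the index only if it is not already marked sorted.  */
static void
demo_sort_index (struct demo_idx_entry *idx, size_t n, int *sorted)
{
  assert (idx != NULL && sorted != NULL);
  if (!*sorted)
    {
      qsort (idx, n, sizeof (*idx), demo_cmp);
      *sorted = 1;
    }
}

/* Binary-search a sorted index for NAME.  Returns 0 if it is absent.  */
static uint32_t
demo_lookup (const struct demo_idx_entry *idx, size_t n, const char *name)
{
  struct demo_idx_entry key = { name, 0 };
  const struct demo_idx_entry *hit;

  assert (idx != NULL && name != NULL);
  hit = bsearch (&key, idx, n, sizeof (*idx), demo_cmp);
  return hit != NULL ? hit->type : 0;
}
```

Sorting by name is what lets a sparse section omit pads entirely: only symbols that have types need an entry.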
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00

 oom:
  ctf_dynhash_destroy (fp->ctf_dynsyms);
  fp->ctf_dynsyms = NULL;
  ctf_set_errno (fp, ENOMEM);
  return -ENOMEM;
}
|
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external symtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
/* Impose an ordering on symbols. The ordering takes effect immediately, but
   since the ordering info does not include type IDs, lookups may return nothing
   until such IDs are added by calls to ctf_add_*_sym. Must be called after
   ctf_link_add_strtab and ctf_link_add_linker_symbol. */
bfd, include, ld, binutils, libctf: CTF should use the dynstr/sym
This is embarrassing.
The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names. So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized. The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.
Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy. The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.
Some bits might be contentious. The ctf_new_dynstr callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen. This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym. We rename the
'apply_strsym' hooks to 'acquire_strings' instead, because they no longer
have anything to do with symbols.
There are some API changes to pieces of API which are technically public
but actually totally unused by anything and/or unused by anything but ld
so they can change freely: the ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in. ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).
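The calling sequence described above can be sketched as a toy model. Nothing here is the libctf API: every ex_* name, the fixed-size tables and the error convention are hypothetical, chosen only to show symbols arriving one at a time with name offsets, the strtab arriving separately, and the final step resolving one against the other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; not the libctf types.  */
#define EX_MAX_SYMS 16

typedef struct ex_sym
{
  unsigned long symidx;		/* Index of the symbol in the symtab.  */
  size_t nameidx;		/* Offset of its name in the final strtab.  */
} ex_sym_t;

typedef struct ex_linker
{
  ex_sym_t in_flight[EX_MAX_SYMS];	/* Symbols fed in, names unresolved.  */
  size_t n_in_flight;
  const char *strtab;			/* Final string table, once known.  */
  size_t strtab_len;
  const char *names[EX_MAX_SYMS];	/* symidx -> name, set by the shuffle.  */
} ex_linker_t;

/* Step 1: the linker hands over symbols one at a time.  */
static int
ex_add_linker_symbol (ex_linker_t *l, unsigned long symidx, size_t nameidx)
{
  assert (l != NULL);
  if (l->n_in_flight >= EX_MAX_SYMS || symidx >= EX_MAX_SYMS)
    return -1;
  l->in_flight[l->n_in_flight].symidx = symidx;
  l->in_flight[l->n_in_flight].nameidx = nameidx;
  l->n_in_flight++;
  return 0;
}

/* Step 2: the linker hands over the finalized string table.  */
static void
ex_add_strtab (ex_linker_t *l, const char *strtab, size_t len)
{
  l->strtab = strtab;
  l->strtab_len = len;
}

/* Step 3: resolve every in-flight symbol to a name and empty the list.
   Fails if no strtab has been provided or an offset is out of range.  */
static int
ex_shuffle_syms (ex_linker_t *l)
{
  size_t i;

  if (l->strtab == NULL)
    return -1;
  for (i = 0; i < l->n_in_flight; i++)
    {
      if (l->in_flight[i].nameidx >= l->strtab_len)
	return -1;
      l->names[l->in_flight[i].symidx] = l->strtab + l->in_flight[i].nameidx;
    }
  l->n_in_flight = 0;
  return 0;
}
```

A caller would zero an ex_linker_t, call ex_add_linker_symbol once per symbol, then ex_add_strtab, then ex_shuffle_syms: calling the last step before the strtab is known fails in this model, which mirrors the ordering requirement in the text.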
Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)
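As a rough illustration of the flag check described above, the following sketch picks the associated ELF sections from a dict's header flags. EX_F_DYNSTR and ex_pick_sections are hypothetical names, and the flag value is an assumption made for the example: consult ctf.h for the real CTF_F_DYNSTR definition.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag value, for illustration only: see ctf.h for the real
   CTF_F_DYNSTR definition.  */
#define EX_F_DYNSTR 0x8u

/* Report which ELF sections a dict with these header flags is associated
   with: the dynamic tables if the flag is set, the old ones otherwise.  */
static void
ex_pick_sections (uint32_t flags, const char **symtab, const char **strtab)
{
  assert (symtab != NULL && strtab != NULL);
  if (flags & EX_F_DYNSTR)
    {
      *symtab = ".dynsym";
      *strtab = ".dynstr";
    }
  else
    {
      *symtab = ".symtab";
      *strtab = ".strtab";
    }
}
```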
Tests forthcoming in a later commit in this series.
bfd/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* elflink.c (elf_finalize_dynstr): Call examine_strtab after
dynstr finalization.
(elf_link_swap_symbols_out): Don't call it here. Call
ctf_new_symbol before swap_symbol_out.
(elf_link_output_extsym): Call ctf_new_dynsym before
swap_symbol_out.
(bfd_elf_final_link): Likewise.
* elf.c (swap_out_syms): Pass in bfd_link_info. Call
ctf_new_symbol before swap_symbol_out.
(_bfd_elf_compute_section_file_positions): Adjust.
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
.symtab and .strtab.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* bfdlink.h (struct elf_sym_strtab): Replace with...
(struct elf_internal_sym): ... this.
(struct bfd_link_callbacks) <examine_strtab>: Take only a
symstrtab argument.
<ctf_new_symbol>: New.
<ctf_new_dynsym>: Likewise.
* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
<st_nameidx>: Likewise.
<st_nameidx_set>: Likewise.
(ctf_link_iter_symbol_f): Removed.
(ctf_link_shuffle_syms): Remove most parameters, just takes a
ctf_dict_t now.
(ctf_link_add_linker_symbol): New, split from
ctf_link_shuffle_syms.
* ctf.h (CTF_F_DYNSTR): New.
(CTF_F_MAX): Adjust.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
<syms>: Remove.
<symcount>: Remove.
<symstrtab>: Rename to...
<strtab>: ... this.
(ldelf_ctf_strtab_iter_cb): Adjust.
(ldelf_ctf_symbols_iter_cb): Remove.
(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
symbol.
(ldelf_examine_strtab_for_ctf): Rename to...
(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
portion and not symbols.
* ldelfgen.h: Adjust declarations accordingly.
* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
(ldemul_acquire_strings_for_ctf): ... this.
(ldemul_new_dynsym_for_ctf): New.
* ldemul.h: Adjust declarations accordingly.
* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
(ldlang_ctf_acquire_strings): ... this.
(ldlang_ctf_new_dynsym): New.
(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
the actual symbol shuffle.
* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-link.c (ctf_link_shuffle_syms): Adjust.
(ctf_link_add_linker_symbol): New, unimplemented stub.
* libctf.ver: Add it.
* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
dicts.
* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
symtab/strtab if not present, dynsym/dynstr otherwise.
* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
some arbitrary member of a CTF archive.
* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
2020-11-20 21:34:04 +08:00
int
ctf_link_shuffle_syms (ctf_dict_t *fp)
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
{
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
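A minimal sketch of what this layout implies for a consumer, under stated assumptions: the kinds enum, the 1-based type IDs, the use of 0 as a pad and the ex_funcinfo_valid helper are all hypothetical and are not taken from libctf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type kinds, for illustration only.  */
enum ex_kind { EX_K_INTEGER, EX_K_POINTER, EX_K_FUNCTION };

/* Return 1 if every entry in FUNCINFO is either a pad (assumed to be 0) or
   the 1-based ID of a type of function kind, 0 otherwise.  KINDS[i] is the
   kind of type ID i + 1.  */
static int
ex_funcinfo_valid (const uint32_t *funcinfo, size_t nfuncs,
		   const enum ex_kind *kinds, size_t ntypes)
{
  size_t i;

  assert (funcinfo != NULL && kinds != NULL);
  for (i = 0; i < nfuncs; i++)
    {
      uint32_t id = funcinfo[i];

      if (id == 0)
	continue;		/* Pad: symbol with no known type.  */
      if (id > ntypes || kinds[id - 1] != EX_K_FUNCTION)
	return 0;
    }
  return 1;
}
```

Because each entry is only a type ID, two functions with the same prototype can share one entry in the type section, which is the deduplication the text refers to.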
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
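The removal step can be pictured as a simple filter. This is only a sketch with hypothetical names (ex_delete_vars_in_symtab and its plain string arrays): the real code works on libctf's own variable and symbol hashes, not on arrays of C strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Drop from VARS every name that also appears in SYMS, preserving the order
   of the survivors.  Returns the new number of variables.  */
static size_t
ex_delete_vars_in_symtab (const char **vars, size_t nvars,
			  const char *const *syms, size_t nsyms)
{
  size_t in, out = 0, j;

  assert (vars != NULL);
  for (in = 0; in < nvars; in++)
    {
      int is_sym = 0;

      for (j = 0; j < nsyms; j++)
	if (strcmp (vars[in], syms[j]) == 0)
	  {
	    is_sym = 1;
	    break;
	  }
      if (!is_sym)
	vars[out++] = vars[in];	/* Keep: not exposed as a symbol.  */
    }
  return out;
}
```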
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
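An indexed lookup of this kind might look roughly like the sketch below. It assumes two parallel arrays, one of names sorted with strcmp and one of type IDs, and returns 0 for "no type": the ex_* names and that layout are assumptions for illustration, not the on-disk format, where the index holds string table offsets rather than pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* bsearch comparator: KEY is the name sought, ELEM points at one slot of
   the sorted name table.  */
static int
ex_cmp_name (const void *key, const void *elem)
{
  return strcmp ((const char *) key, *(const char *const *) elem);
}

/* Look NAME up in a name-sorted index.  NAMES[i] names the symbol whose
   type ID is TYPES[i].  Returns 0 if the name is not present.  */
static uint32_t
ex_lookup_indexed (const char *const *names, const uint32_t *types,
		   size_t n, const char *name)
{
  const char *const *hit;

  assert (name != NULL);
  hit = bsearch (name, names, n, sizeof (names[0]), ex_cmp_name);
  if (hit == NULL)
    return 0;
  return types[hit - names];
}
```

The point of sorting by name is visible here: a sparse table needs no pads, and the cost of a lookup is a binary search over only those symbols that actually have types.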
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
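The sort-on-demand behaviour can be sketched as follows. This is a simplification under stated assumptions: entries are modelled as name/type pairs, the flag value is made up, and setting the flag after sorting is a choice made for this example only. All ex_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag value, for illustration only: see ctf.h for the real
   CTF_F_IDXSORTED definition.  */
#define EX_F_IDXSORTED 0x4u

typedef struct ex_entry
{
  const char *name;		/* Symbol name.  */
  uint32_t type;		/* Type ID of that symbol.  */
} ex_entry_t;

/* qsort comparator: order entries by symbol name.  */
static int
ex_cmp_entry (const void *a, const void *b)
{
  const ex_entry_t *ea = a;
  const ex_entry_t *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Sort the index by name unless the flags say it is already sorted.  */
static void
ex_ensure_sorted (ex_entry_t *entries, size_t n, uint32_t *flags)
{
  assert (flags != NULL);
  if (*flags & EX_F_IDXSORTED)
    return;			/* Producer already sorted it.  */
  qsort (entries, n, sizeof (entries[0]), ex_cmp_entry);
  *flags |= EX_F_IDXSORTED;
}
```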
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
  ctf_in_flight_dynsym_t *did, *nid;
  ctf_next_t *i = NULL;
  int err = ENOMEM;
  void *name_, *sym_;

libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of CTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
2023-12-20 00:58:19 +08:00
  if (fp->ctf_stypes > 0)
    return ctf_set_errno (fp, ECTF_RDONLY);

libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the data object section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.
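As a rough illustration of the new layout (a sketch only, not libctf's
reader): if the function info section is modelled as a flat array of
uint32_t type IDs, finding a function's prototype is a bounds-checked
array read. The names demo_symtypetab_t and demo_symtypetab_type, and the
use of 0 to mean "no type recorded", are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a function info (or data object) section:
   one type ID per slot.  Treating 0 as "no type recorded" is an
   assumption of this sketch.  */
typedef struct demo_symtypetab
{
  const uint32_t *types;   /* Type IDs, one per slot.  */
  size_t ntypes;           /* Number of slots.  */
} demo_symtypetab_t;

/* Return the type ID stored in SLOT, or 0 if SLOT is out of range.  */
static uint32_t
demo_symtypetab_type (const demo_symtypetab_t *tab, size_t slot)
{
  assert (tab != NULL);
  if (slot >= tab->ntypes)
    return 0;
  return tab->types[slot];
}
```

Because both sections share this shape in the model, the same helper
serves function and data object lookups; only the expected kind of the
returned type differs.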
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archives where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
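The indexed lookup described above can be sketched roughly as follows.
This is a simplified model, not libctf's implementation: it pairs each
name with its type ID in one array, whereas the text describes a separate
index section, and every demo_* name is invented for the example. The
sorted field stands in for the CTF_F_IDXSORTED header flag.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: a symbol name paired with its type ID.  */
typedef struct demo_idx_entry
{
  const char *name;
  uint32_t type;
} demo_idx_entry_t;

/* Hypothetical indexed section.  SORTED plays the role that the
   CTF_F_IDXSORTED header flag plays in the text.  */
typedef struct demo_index
{
  demo_idx_entry_t *entries;
  size_t nentries;
  int sorted;
} demo_index_t;

/* Order entries by name, for both qsort and bsearch.  */
static int
demo_idx_cmp (const void *a, const void *b)
{
  const demo_idx_entry_t *ea = a;
  const demo_idx_entry_t *eb = b;
  return strcmp (ea->name, eb->name);
}

/* Look NAME up in IDX, sorting the index first if it is not known to
   be sorted.  Returns the type ID, or 0 if NAME is absent.  */
static uint32_t
demo_index_lookup (demo_index_t *idx, const char *name)
{
  demo_idx_entry_t key = { name, 0 };
  demo_idx_entry_t *hit;

  assert (idx != NULL && name != NULL);
  if (idx->nentries == 0)
    return 0;
  if (!idx->sorted)
    {
      qsort (idx->entries, idx->nentries, sizeof (demo_idx_entry_t),
             demo_idx_cmp);
      idx->sorted = 1;
    }
  hit = bsearch (&key, idx->entries, idx->nentries,
                 sizeof (demo_idx_entry_t), demo_idx_cmp);
  return hit ? hit->type : 0;
}
```

Sorting lazily on first lookup means a producer that emits entries in
arbitrary order pays nothing at write time, and a consumer that never
looks anything up by name pays nothing either.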
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
  if (!fp->ctf_dynsyms)
    {
      fp->ctf_dynsyms = ctf_dynhash_create (ctf_hash_string,
                                            ctf_hash_eq_string,
                                            NULL, free);
      if (!fp->ctf_dynsyms)
        {
          ctf_set_errno (fp, ENOMEM);
          return -ENOMEM;
        }
    }

  /* Add all the symbols, excluding only those we already know are prohibited
     from appearing in symtypetabs. */

  for (did = ctf_list_next (&fp->ctf_in_flight_dynsyms); did != NULL; did = nid)
    {
      ctf_link_sym_t *new_sym;

      nid = ctf_list_next (did);
      ctf_list_delete (&fp->ctf_in_flight_dynsyms, did);

      /* We might get a name or an external strtab offset. The strtab offset is
         guaranteed resolvable at this point, so turn it into a string. */

      if (did->cid_sym.st_name == NULL)
        {
          uint32_t off = CTF_SET_STID (did->cid_sym.st_nameidx, CTF_STRTAB_1);

          did->cid_sym.st_name = ctf_strraw (fp, off);
          did->cid_sym.st_nameidx_set = 0;
          if (!ctf_assert (fp, did->cid_sym.st_name != NULL))
            return -ECTF_INTERNAL; /* errno is set for us. */
        }

      /* The symbol might have turned out to be nameless, so we have to recheck
         for skippability here. */
      if (!ctf_symtab_skippable (&did->cid_sym))
        {
bfd, ld, libctf: skip zero-refcount strings in CTF string reporting
This is a tricky one. BFD, on the linker's behalf, reports symbols to
libctf via the ctf_new_symbol and ctf_new_dynsym callbacks, which
ultimately call ctf_link_add_linker_symbol. But while this happens
after strtab offsets are finalized, it happens before the .dynstr is
actually laid out, so we can't iterate over it at this stage and
it is not clear what the reported symbols are actually called. So
a second callback, examine_strtab, is called after the .dynstr is
finalized, which calls ctf_link_add_strtab and ultimately leads
to ldelf_ctf_strtab_iter_cb being called back repeatedly until the
offset of every string in the .dynstr has been passed to libctf.
libctf can then use this to get symbol names out of the input (which
usually stores symbol types in the form of a name -> type mapping at
this stage) and extract the types of those symbols, feeding them back
into their final form as a 1:1 association with the real symtab's
STT_OBJECT and STT_FUNC symbols (with a few skipped, see
ctf_symtab_skippable).
This representation is compact, but has one problem: if libctf somehow
gets confused about the st_type of a symbol, it'll stick an entry into
the function symtypetab when it should put it into the object
symtypetab, or vice versa, and *every symbol from that one on* will have
the wrong CTF type because it's actually looking up the type for a
different symbol.
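A toy model makes the failure mode concrete. It is a sketch under
assumptions, not libctf's code: demo_kind and demo_func_slot are invented
names, and the slot is recomputed by counting rather than cached. The
kind values 1 and 2 match the "is of type" numbers in the warnings quoted
in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Symbol kinds as the reader sees them in the real symbol table.  */
enum demo_kind { DEMO_OBJECT = 1, DEMO_FUNC = 2 };

/* Slot in the function table for symbol SYMIDX: the number of function
   symbols that precede it in the symbol table.  */
static size_t
demo_func_slot (const enum demo_kind *kinds, size_t nsyms, size_t symidx)
{
  size_t i, slot = 0;

  assert (symidx < nsyms && kinds[symidx] == DEMO_FUNC);
  for (i = 0; i < symidx; i++)
    if (kinds[i] == DEMO_FUNC)
      slot++;
  return slot;
}
```

Suppose the symbol table holds func, func, object, func, and the function
table should hold the type IDs 10, 11, 13. If the writer mistakes the
second symbol for a data object it emits only 10, 13, so the reader, which
still computes slot 1 for the second symbol, gets 13 instead of 11, and
the last function's slot no longer exists in the table at all.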
And we have just such a bug. ctf_link_add_strtab was not taking the
refcounts of strings into consideration, so even strings that had been
eliminated from the strtab by virtue of being in objects eliminated via
--as-needed etc were being reported. This is harmful because it can
lead to multiple strings with the same apparent offset, and if the last
duplicate to be reported relates to an eliminated symbol, we look up the
wrong symbol from the input and get its type wrong: if it's unlucky and
the eliminated symbol is also of the wrong st_type, we will end up with
a corrupted symtypetab.
Thankfully the wrong-st_type case is already diagnosed by a
this-can-never-happen paranoid warning:
CTF warning: Symbol 61a added to CTF as a function but is of type 1
or the converse
CTF warning: Symbol a3 added to CTF as a data object but is of type 2
so at least we can tell when the corruption has spread to more than one
symbol's type.
Skipping zero-refcounted strings is easy: teach _bfd_elf_strtab_str to
skip them, and ldelf_ctf_strtab_iter_cb to loop over skipped strings
until it falls off the end or finds one that isn't skipped.
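The skipping loop can be sketched like this. It is a standalone model
with invented names (demo_str_t, demo_next_live_string), not the code in
_bfd_elf_strtab_str or ldelf_ctf_strtab_iter_cb, whose real signatures
are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string table entry.  */
typedef struct demo_str
{
  const char *str;        /* String content.  */
  unsigned int refcount;  /* Zero if the string was eliminated.  */
  size_t offset;          /* Offset in the final strtab.  */
} demo_str_t;

/* Return the next string at or after *POS whose refcount is nonzero,
   storing its offset in *OFFSET and advancing *POS past it.  Returns
   NULL once the table is exhausted.  */
static const char *
demo_next_live_string (const demo_str_t *tab, size_t ntab, size_t *pos,
                       size_t *offset)
{
  assert (pos != NULL && offset != NULL);
  while (*pos < ntab)
    {
      const demo_str_t *s = &tab[(*pos)++];

      if (s->refcount == 0)
        continue;               /* Eliminated: never report it.  */
      *offset = s->offset;
      return s->str;
    }
  return NULL;
}
```

With eliminated strings filtered out here, two entries that appear to
share an offset can no longer both reach the consumer, which is the
condition the message above identifies as the source of the corruption.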
bfd/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* elf-strtab.c (_bfd_elf_strtab_str): Skip strings with zero refcount.
ld/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ldelfgen.c (ldelf_ctf_strtab_iter_cb): Skip zero-refcount strings.
libctf/ChangeLog
2021-03-02 Nick Alcock <nick.alcock@oracle.com>
* ctf-create.c (symtypetab_density): Report the symbol name as
well as index in the name != object error; note the likely
consequences.
* ctf-link.c (ctf_link_shuffle_syms): Report the symbol index
as well as name.
2021-03-02 23:10:05 +08:00
          ctf_dprintf ("symbol from linker: %s (%x)\n", did->cid_sym.st_name,
                       did->cid_sym.st_symidx);
libctf: symbol type linking support
2020-11-20 21:34:04 +08:00
          if ((new_sym = malloc (sizeof (ctf_link_sym_t))) == NULL)
            goto local_oom;

          memcpy (new_sym, &did->cid_sym, sizeof (ctf_link_sym_t));
          if (ctf_dynhash_cinsert (fp->ctf_dynsyms, new_sym->st_name, new_sym) < 0)
            goto local_oom;

          if (fp->ctf_dynsymmax < new_sym->st_symidx)
            fp->ctf_dynsymmax = new_sym->st_symidx;
        }

      free (did);
      continue;

    local_oom:
      free (did);
      free (new_sym);
      goto err;
    }
libctf, ld: fix symtypetab and var section population under ld -r
The variable section in a CTF dict is meant to contain the types of
variables that do not appear in the symbol table (mostly file-scope
static declarations). We implement this by having the compiler emit
all potential data symbols into both sections, then delete those
symbols from the variable section that correspond to data symbols the
linker has reported.
Unfortunately, the check for this in ctf_serialize is wrong: rather than
checking the set of linker-reported symbols, we check the set of names
in the data object symtypetab section: if the linker has reported no
symbols at all (usually if ld -r has been run, or if a non-linker
program that does not use symbol tables is calling ctf_link) this will
include every single symbol, emptying the variable section completely.
Worse, when ld -r is in use, we want to force writeout of every
symtypetab entry on the inputs, in an indexed section, whether or not
the linker has reported them, since this isn't a final link yet and the
symbol table is not finalized (and may grow more symbols than the linker
has yet reported). But the check for this is flawed too: we were
relying on ctf_link_shuffle_syms not having been called if no symbols
exist, but that function is *always* called by ld even when ld -r is in
use: ctf_link_add_linker_symbol is the one that's not called when there
are no symbols.
We clearly need to rethink this. Using the emptiness of the set of
reported symbols as a test for ld -r is just ugly: the linker already
knows if ld -r is underway and can just tell us. So add a new linker
flag CTF_LINK_NO_FILTER_REPORTED_SYMS that is set to stop the linker
filtering the symbols in the symtypetab sections using the set that the
linker has reported: use the presence or absence of this flag to
determine whether to emit unindexed symtabs: we only remove entries from
the variable section when filtering symbols, and we only remove them if
they are in the reported symbol set, fixing the case where no symbols
are reported by the linker at all.
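The corrected rule for the variable section can be expressed as a small
predicate. This is a sketch of the rule as stated above, not the body of
symtypetab_delete_nonstatic_vars: the demo_* names are invented, and
DEMO_NO_FILTER_REPORTED_SYMS is a stand-in whose value is not the real
value of CTF_LINK_NO_FILTER_REPORTED_SYMS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for CTF_LINK_NO_FILTER_REPORTED_SYMS; the value is made up.  */
#define DEMO_NO_FILTER_REPORTED_SYMS 0x1

/* Return 1 if NAME is among the NREPORTED linker-reported symbols.  */
static int
demo_reported (const char *const *reported, size_t nreported,
               const char *name)
{
  size_t i;

  for (i = 0; i < nreported; i++)
    if (strcmp (reported[i], name) == 0)
      return 1;
  return 0;
}

/* Return 1 if the variable NAME should be dropped from the variable
   section: only when filtering is on and the linker reported NAME.  */
static int
demo_drop_variable (int link_flags, const char *const *reported,
                    size_t nreported, const char *name)
{
  assert (name != NULL);
  if (link_flags & DEMO_NO_FILTER_REPORTED_SYMS)
    return 0;
  return demo_reported (reported, nreported, name);
}
```

Note the third case this covers: with filtering on but nothing reported,
no variable is dropped, which is the behaviour the old check got wrong.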
(The negative sense of the new CTF_LINK flag is intentional: the common
case, both for ld and for simple tools that want to do a ctf_link with
no ELF symbol table in sight, is probably to filter out symbols that no
linker has reported: i.e., for the simple tools, all of them.)
There's another wrinkle, though. It is quite possible for a non-linker
to add symbols to a dict via ctf_add_*_sym and then write it out via the
ctf_write APIs: perhaps it's preparing a dict for a later linker
invocation. Right now this would not lead to anything terribly
meaningful happening: ctf_serialize just assumes it was called via
ctf_link if symbols are present. So add an (internal-to-libctf) flag
that indicates that a writeout is happening via ctf_link_write, and set
it there (propagating it to child dicts as needed). ctf_serialize can
then spot when it is not being called by a linker, and arrange to always
write out an indexed, sorted symtypetab for fastest possible future
symbol lookup by name in that case. (The writeouts done by ld -r are
unsorted, because the only thing likely to use those symtabs is the
linker, which doesn't benefit from symtypetab sorting.)
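Taken together, the three writeout cases amount to a small decision
table. The sketch below is one reading of the description above, not the
logic of ctf_serialize itself: demo_emit_mode_t and demo_pick_emit_mode
are invented names, and linking and no_filter stand in for the
LCTF_LINKING and CTF_LINK_NO_FILTER_REPORTED_SYMS flags.

```c
#include <assert.h>

/* Hypothetical summary of how the symtypetabs get written out.  */
typedef struct demo_emit_mode
{
  int force_indexed;  /* Always emit an index section.  */
  int sort_index;     /* Sort by name; only matters if an index exists.  */
  int filter_syms;    /* Keep only linker-reported symbols.  */
} demo_emit_mode_t;

/* LINKING is nonzero when the writeout comes from the link machinery;
   NO_FILTER is nonzero for a non-final link such as ld -r.  */
static demo_emit_mode_t
demo_pick_emit_mode (int linking, int no_filter)
{
  demo_emit_mode_t mode = { 0, 1, 0 };

  assert (linking == 0 || linking == 1);
  if (!linking)
    mode.force_indexed = 1;     /* Non-linker: indexed and sorted.  */
  else if (no_filter)
    {
      mode.force_indexed = 1;   /* ld -r: indexed, but unsorted.  */
      mode.sort_index = 0;
    }
  else
    mode.filter_syms = 1;       /* Final link: filter by reported syms.  */
  return mode;
}
```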
Tests added for all three linking cases (ld -r, ld -shared, ld), with a
bit of testsuite framework enhancement to stop it unconditionally
linking the CTF to be checked by the lookup program with -shared, so
tests can now examine CTF linked with -r or indeed with no flags at all,
though the output filename is still foo.so even in this case.
Another test added for the non-linker case that endeavours to determine
whether the symtypetab is sorted by examining the order of entries
returned from ctf_symbol_next: nobody outside libctf should rely on
this ordering, but this test is not outside libctf :)
include/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_NO_FILTER_REPORTED_SYMS): New.
ld/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_merge_ctf): Set CTF_LINK_NO_FILTER_REPORTED_SYMS
when appropriate.
libctf/ChangeLog
2021-01-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (_libctf_nonnull_): Add parameters.
(LCTF_LINKING): New flag.
(ctf_dict_t) <ctf_link_flags>: Mention it.
* ctf-link.c (ctf_link): Keep LCTF_LINKING set across call.
(ctf_link_write): Likewise, including in child dictionaries.
(ctf_link_shuffle_syms): Make sure ctf_dynsyms is NULL if there
are no reported symbols.
* ctf-create.c (symtypetab_delete_nonstatic_vars): Make sure
the variable has been reported as a symbol by the linker.
(symtypetab_skippable): Mention relationship between SYMFP and the
flags.
(symtypetab_density): Adjust nonnullity. Exit early if no symbols
were reported and force-indexing is off (i.e., we are doing a
final link).
(ctf_serialize): Handle the !LCTF_LINKING case by writing out an
indexed, sorted symtypetab (and allow SYMFP to be NULL in this
case). Turn sorting off if this is a non-final link. Only delete
nonstatic vars if we are filtering symbols and the linker has
reported some.
* testsuite/libctf-regression/nonstatic-var-section-ld-r*:
New test of variable and symtypetab section population when
ld -r is used.
* testsuite/libctf-regression/nonstatic-var-section-ld-executable.lk:
Likewise, when ld of an executable is used.
* testsuite/libctf-regression/nonstatic-var-section-ld.lk:
Likewise, when ld -shared alone is used.
* testsuite/libctf-regression/nonstatic-var-section-ld*.c:
Lookup programs for the above.
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.*: New
test, testing survival of symbols across ctf_write paths.
* testsuite/lib/ctf-lib.exp (run_lookup_test): New option,
nonshared, suppressing linking of the SOURCE with -shared.
2021-01-17 00:49:29 +08:00
  /* If no symbols are reported, unwind what we have done and return.  This
     makes it a bit easier for the serializer to tell that no symbols have been
     reported and that it should look elsewhere for reported symbols.  */
  if (!ctf_dynhash_elements (fp->ctf_dynsyms))
    {
      ctf_dprintf ("No symbols: not a final link.\n");
2021-03-02 23:10:05 +08:00
      ctf_dynhash_destroy (fp->ctf_dynsyms);
2021-01-17 00:49:29 +08:00
      fp->ctf_dynsyms = NULL;
      return 0;
    }
libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types. The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).
With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)
(Compatible) file format change:
The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf. But
conveniently the compiler has never emitted this! Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump. (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)
So the function info section is now specified as an array of uint32_t,
exactly like the data object section: each entry is a type ID in the
type section, which must be of kind CTF_K_FUNCTION and gives the
prototype of this function.
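As a rough illustration of the new layout, the sketch below models the function info section as a flat array of type IDs indexed by symbol index. It is not libctf code: sketch_func_type, the kinds table and the SKETCH_K_* values are hypothetical stand-ins, and treating type ID 0 as "no type" (a pad) is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kind values for this sketch only; not the values in ctf.h.  */
enum { SKETCH_K_INTEGER = 1, SKETCH_K_FUNCTION = 2 };

/* Return the type ID recorded for symbol index SYMIDX in a function info
   section modelled as an array of NSYMS type IDs.  KINDS[id] gives the kind
   of type ID, for NTYPES types.  Returns 0 (assumed to mean "no type") if
   the index is out of range, the slot is a pad, or the type recorded there
   is not a function type.  */
uint32_t
sketch_func_type (const uint32_t *funcinfo, size_t nsyms,
                  const int *kinds, size_t ntypes, size_t symidx)
{
  uint32_t type;

  if (symidx >= nsyms)
    return 0;

  type = funcinfo[symidx];
  if (type == 0 || type >= ntypes)
    return 0;

  return kinds[type] == SKETCH_K_FUNCTION ? type : 0;
}
```

Because each slot holds only a type ID, two symbols with the same prototype can share one entry in the type section, which is the deduplication benefit described above.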
This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types. (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)
We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use. A sufficiently new compiler will
always set this flag. New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set. If the flag
is not set on a dict being read in, new libctf will disregard the
function info section. Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).
New API:
Symbol addition:
ctf_add_func_sym: Add a symbol with a given name and type. The
type must be of kind CTF_K_FUNCTION (a function
pointer). Internally this adds a name -> type
mapping to the ctf_funchash in the ctf_dict.
ctf_add_objt_sym: Add a symbol with a given name and type. The type
kind can be anything, including function pointers.
This adds to ctf_objthash.
These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict. Repeated relinks can add more symbols.
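The step that turns reported symbols into an index-addressable table can be sketched as follows. This is a simplified model, not the libctf implementation: struct sketch_sym and sketch_build_symidx are hypothetical names, and a plain array stands in for the ctf_dynsyms hash.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for a linker-reported symbol.  */
struct sketch_sym
{
  const char *name;
  uint32_t symidx;
};

/* Build a table with SYMMAX + 1 slots in which slot N points at the symbol
   whose index is N, or is NULL if no such symbol was reported.  Returns
   NULL on allocation failure or if any symbol index exceeds SYMMAX.  The
   caller frees the result.  */
const struct sketch_sym **
sketch_build_symidx (const struct sketch_sym *syms, size_t nsyms,
                     uint32_t symmax)
{
  const struct sketch_sym **idx;
  size_t i;

  idx = calloc ((size_t) symmax + 1, sizeof (*idx));
  if (idx == NULL)
    return NULL;

  for (i = 0; i < nsyms; i++)
    {
      if (syms[i].symidx > symmax)
        {
          free (idx);
          return NULL;
        }
      idx[syms[i].symidx] = &syms[i];
    }
  return idx;
}
```

Sizing the table from the maximum index rather than the symbol count keeps lookups by index constant-time, at the cost of NULL slots for indexes that were never reported.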
Variables that are also exposed as symbols are removed from the variable
section at serialization time.
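A minimal sketch of that pruning step, assuming the variable section and the set of linker-reported symbols are both available as arrays of names. The function names are hypothetical and a linear scan stands in for the hash lookup a real implementation would use.

```c
#include <stddef.h>
#include <string.h>

/* Return nonzero if NAME appears among the NREPORTED names in REPORTED.
   A linear scan is used here for brevity.  */
int
sketch_is_reported (const char *name, const char *const *reported,
                    size_t nreported)
{
  size_t i;

  for (i = 0; i < nreported; i++)
    if (strcmp (name, reported[i]) == 0)
      return 1;
  return 0;
}

/* Compact VARS in place, dropping every variable whose name was reported
   as a symbol.  Returns the number of variables that remain.  */
size_t
sketch_prune_vars (const char **vars, size_t nvars,
                   const char *const *reported, size_t nreported)
{
  size_t in, out = 0;

  for (in = 0; in < nvars; in++)
    if (!sketch_is_reported (vars[in], reported, nreported))
      vars[out++] = vars[in];
  return out;
}
```

In this model, an empty reported set leaves the variable list untouched.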
CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in an archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically. (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily.)
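The name-based lookup in an indexed section can be pictured as a binary search over entries sorted by name. The sketch below is an assumption-laden model rather than the on-disk format: it merges the index and the type array into one array of hypothetical struct sketch_idx_entry records, and returns 0 to mean "not found".

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical merged view of one index entry and its type.  */
struct sketch_idx_entry
{
  const char *name;
  uint32_t type;
};

/* bsearch comparator: KEY is a symbol name, ENT an index entry.  */
int
sketch_idx_cmp (const void *key, const void *ent)
{
  const struct sketch_idx_entry *e = ent;

  return strcmp ((const char *) key, e->name);
}

/* Look NAME up in IDX, which must hold N entries sorted by name in strcmp
   order.  Returns the type ID, or 0 if the name is not present.  */
uint32_t
sketch_lookup_indexed (const struct sketch_idx_entry *idx, size_t n,
                       const char *name)
{
  const struct sketch_idx_entry *hit;

  hit = bsearch (name, idx, n, sizeof (*idx), sketch_idx_cmp);
  return hit != NULL ? hit->type : 0;
}
```

An indexed section stores entries only for symbols that actually have types, which is why it pays off once an unindexed section would be mostly pads.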
The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.
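The sort-on-demand behaviour might look roughly like this. The flag value, the structure layout and the function names are all hypothetical; only the idea (check a "sorted" flag, sort by name once if it is absent, then record that the index is sorted) comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bit value; not the value of CTF_F_IDXSORTED in ctf.h.  */
#define SKETCH_F_IDXSORTED 0x1u

struct sketch_ent
{
  const char *name;
  uint32_t type;
};

struct sketch_symidx
{
  unsigned int flags;
  struct sketch_ent *ents;
  size_t nents;
};

/* qsort comparator ordering entries by name.  */
int
sketch_ent_cmp (const void *a, const void *b)
{
  const struct sketch_ent *ea = a;
  const struct sketch_ent *eb = b;

  return strcmp (ea->name, eb->name);
}

/* Sort IX by name unless its flags say it is already sorted, then mark it
   sorted so that later calls do no work.  */
void
sketch_symidx_ensure_sorted (struct sketch_symidx *ix)
{
  if (ix->flags & SKETCH_F_IDXSORTED)
    return;

  qsort (ix->ents, ix->nents, sizeof (ix->ents[0]), sketch_ent_cmp);
  ix->flags |= SKETCH_F_IDXSORTED;
}
```

Deferring the sort to the reader means a producer such as the compiler can emit entries in whatever order is convenient.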
Iteration:
ctf_symbol_next: Iterator which returns the types and names of symbols
one by one, either for function or data symbols.
This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.
(Compatible) changes in API:
ctf_lookup_by_symbol: can now be called for object and function
symbols: never returns ECTF_NOTDATA (which is
now not thrown by anything, but is kept for
compatibility and because it is a plausible
error that we might start throwing again at some
later date).
Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out. This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external strtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_symbol_next): New.
(ctf_add_objt_sym): Likewise.
(ctf_add_func_sym): Likewise.
* ctf.h: Document new function info section format.
(CTF_F_NEWFUNCINFO): New.
(CTF_F_IDXSORTED): New.
(CTF_F_MAX): Adjust accordingly.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
(_libctf_nonnull_): Likewise.
(ctf_in_flight_dynsym_t): New.
(ctf_dict_t) <ctf_funcidx_names>: Likewise.
<ctf_objtidx_names>: Likewise.
<ctf_nfuncidx>: Likewise.
<ctf_nobjtidx>: Likewise.
<ctf_funcidx_sxlate>: Likewise.
<ctf_objtidx_sxlate>: Likewise.
<ctf_objthash>: Likewise.
<ctf_funchash>: Likewise.
<ctf_dynsyms>: Likewise.
<ctf_dynsymidx>: Likewise.
<ctf_dynsymmax>: Likewise.
<ctf_in_flight_dynsym>: Likewise.
(struct ctf_next) <u.ctn_next>: Likewise.
(ctf_symtab_skippable): New prototype.
(ctf_add_funcobjt_sym): Likewise.
(ctf_dynhash_sort_by_name): Likewise.
(ctf_sym_to_elf64): Rename to...
(ctf_elf32_to_link_sym): ... this, and...
(ctf_elf64_to_link_sym): ... this.
* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
flag, and presence of index sections. Refactor out
ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them. Use
ctf_link_sym_t, not Elf64_Sym. Skip initializing objt or func
sxlate sections if corresponding index section is present. Adjust
for new func info section format.
(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
handling. Report incorrect-length index sections. Always do an
init_symtab, even if there is no symtab section (there may be index
sections still).
(flip_objts): Adjust comment: func and objt sections are actually
identical in structure now, no need to caveat.
(ctf_dict_close): Free newly-added data structures.
* ctf-create.c (ctf_create): Initialize them.
(ctf_symtab_skippable): New, refactored out of
init_symtab, with st_nameidx_set check added.
(ctf_add_funcobjt_sym): New, add a function or object symbol to the
ctf_objthash or ctf_funchash, by name.
(ctf_add_objt_sym): Call it.
(ctf_add_func_sym): Likewise.
(symtypetab_delete_nonstatic_vars): New, delete vars also present as
data objects.
(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
this is a function emission, not a data object emission.
(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
pads for symbols with no type (only set for unindexed sections).
(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
always emit indexed.
(symtypetab_density): New, figure out section sizes.
(emit_symtypetab): New, emit a symtypetab.
(emit_symtypetab_index): New, emit a symtypetab index.
(ctf_serialize): Call them, emitting suitably sorted symtypetab
sections and indexes. Set suitable header flags. Copy over new
fields.
* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
order on symtypetab index sections.
* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
relating to code that was never committed.
(ctf_link_one_variable): Improve variable name.
(check_sym): New, symtypetab analogue of check_variable.
(ctf_link_deduplicating_one_symtypetab): New.
(ctf_link_deduplicating_syms): Likewise.
(ctf_link_deduplicating): Call them.
(ctf_link_deduplicating_per_cu): Note that we don't call them in
this case (yet).
(ctf_link_add_strtab): Set the error on the fp correctly.
(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
a linker symbol to the in-flight list.
(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
in-flight list into a mapping we can use, now its names are
resolvable in the external strtab.
* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
external strtab offsets.
(ctf_str_rollback): Adjust comment.
(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
writeout time...
(ctf_str_add_external): ... to string addition time.
* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
<clik_names>: New member, a name table.
(ctf_lookup_var): Adjust accordingly.
(ctf_lookup_variable): Likewise.
(ctf_lookup_by_id): Shuffle further up in the file.
(ctf_symidx_sort_arg_cb): New, callback for...
(sort_symidx_by_name): ... this new function to sort a symidx
found to be unsorted (likely originating from the compiler).
(ctf_symidx_sort): New, sort a symidx.
(ctf_lookup_symbol_name): Support dynamic symbols with indexes
provided by the linker. Use ctf_link_sym_t, not Elf64_Sym.
Check the parent if a child lookup fails.
(ctf_lookup_by_symbol): Likewise. Work for function symbols too.
(ctf_symbol_next): New, iterate over symbols with types (without
sorting).
(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
(ctf_try_lookup_indexed): New, attempt an indexed lookup.
(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
(ctf_func_args): Likewise.
(ctf_get_dict): Move...
* ctf-types.c (ctf_get_dict): ... here.
* ctf-util.c (ctf_sym_to_elf64): Re-express as...
(ctf_elf64_to_link_sym): ... this. Add new st_symidx field, and
st_nameidx_set (always 0, so st_nameidx can be ignored). Look in
the ELF strtab for names.
(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
ctf_add_func_sym.
2020-11-20 21:34:04 +08:00
  /* Construct a mapping from symbol index to the symbol info.  */
  free (fp->ctf_dynsymidx);
  if ((fp->ctf_dynsymidx = calloc (fp->ctf_dynsymmax + 1,
                                   sizeof (ctf_link_sym_t *))) == NULL)
    goto err;

  while ((err = ctf_dynhash_next (fp->ctf_dynsyms, &i, &name_, &sym_)) == 0)
    {
      const char *name = (const char *) name_;
      ctf_link_sym_t *symp = (ctf_link_sym_t *) sym_;

      if (!ctf_assert (fp, symp->st_symidx <= fp->ctf_dynsymmax))
        {
          ctf_next_destroy (i);
          err = ctf_errno (fp);
          goto err;
        }
      fp->ctf_dynsymidx[symp->st_symidx] = symp;
    }
  if (err != ECTF_NEXT_END)
    {
      ctf_err_warn (fp, 0, err, _("error iterating over shuffled symbols"));
      goto err;
    }
2019-07-14 04:06:55 +08:00
  return 0;
2020-11-20 21:34:04 +08:00
 err:
  /* Leave the in-flight symbols around: they'll be freed at
     dict close time regardless.  */
  ctf_dynhash_destroy (fp->ctf_dynsyms);
  fp->ctf_dynsyms = NULL;
  free (fp->ctf_dynsymidx);
  fp->ctf_dynsymidx = NULL;
  fp->ctf_dynsymmax = 0;
  ctf_set_errno (fp, err);
  return -err;
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
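
The string handling described for ctf_link_add_strtab above can be
sketched with a small self-contained model. This is not libctf code:
every toy_* name below is hypothetical, and the layout of the internal
strtab (offset 0 reserved for the null string, strings laid out back to
back) is an assumption made for illustration. It only shows the
partitioning idea: strings the ELF strtab already holds reuse its
offsets, and everything else lands in the internal CTF strtab.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one string reported by the linker's callback.  */
typedef struct toy_elf_str
{
  const char *str;
  unsigned long offset;		/* Offset in the ELF strtab.  */
} toy_elf_str_t;

/* Hypothetical record of where one CTF string ends up.  */
typedef struct toy_placement
{
  int external;			/* 1 if resolved in the ELF strtab.  */
  unsigned long offset;		/* Offset in whichever strtab holds it.  */
} toy_placement_t;

/* Place each of the NSTRS strings in STRS.  Strings also present in ELF
   reuse the ELF offset; the rest are appended to a toy internal strtab.
   Returns the size of the internal strtab in bytes.  */
static size_t
toy_place_strings (const char **strs, size_t nstrs,
		   const toy_elf_str_t *elf, size_t nelf,
		   toy_placement_t *out)
{
  size_t internal_len = 1;	/* The leading null string (assumption).  */
  size_t i, j;

  assert (strs != NULL && out != NULL);

  for (i = 0; i < nstrs; i++)
    {
      out[i].external = 0;
      for (j = 0; j < nelf; j++)
	if (strcmp (strs[i], elf[j].str) == 0)
	  {
	    out[i].external = 1;
	    out[i].offset = elf[j].offset;
	    break;
	  }

      if (!out[i].external)
	{
	  out[i].offset = internal_len;
	  internal_len += strlen (strs[i]) + 1;
	}
    }
  return internal_len;
}
```

A caller would pass the strings its CTF refers to plus the strings the
linker reported, then consult each toy_placement_t to see which table a
given name now lives in.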
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
}

typedef struct ctf_name_list_accum_cb_arg
{
  char **names;
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
  ctf_dict_t *fp;
  ctf_dict_t **files;
  size_t i;
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
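
As a rough illustration of the second mechanism, the sketch below models
a name-changer callback that returns a newly allocated name or NULL for
"no change". It does not use the real libctf types: toy_name_changer_f,
toy_suffix_namer and toy_apply_name_changer are hypothetical names, and
the ".ctf.CU" to "CU.ctf" rewrite is only an example of a one-to-one
mapping, chosen so that no two input names collide.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: return a malloc'd replacement for NAME, or
   NULL to keep NAME unchanged.  */
typedef char *toy_name_changer_f (const char *name, void *arg);

/* Example changer: rewrite ".ctf.<cu>" as "<cu>.ctf" and leave every
   other name (including a bare ".ctf") alone.  ARG is unused here.  */
static char *
toy_suffix_namer (const char *name, void *arg)
{
  const char *base;
  char *ret;

  (void) arg;
  if (strncmp (name, ".ctf.", 5) != 0)
    return NULL;

  base = name + 5;
  /* sizeof (".ctf") counts the terminating NUL.  */
  if ((ret = malloc (strlen (base) + sizeof (".ctf"))) == NULL)
    return NULL;		/* On OOM, fall back to the old name.  */

  strcpy (ret, base);
  strcat (ret, ".ctf");
  return ret;
}

/* Apply CHANGER to NAME and return the name to use.  *DYNAME receives the
   allocated name, which the caller must free, or NULL if NAME was kept.  */
static const char *
toy_apply_name_changer (const char *name, toy_name_changer_f *changer,
			void *arg, char **dyname)
{
  assert (name != NULL && dyname != NULL);

  *dyname = NULL;
  if (changer != NULL)
    *dyname = changer (name, arg);

  return *dyname != NULL ? *dyname : name;
}
```

Tracking the allocated names separately from the names in use mirrors
the reason the ChangeLog below adds dynames and ndynames: someone has to
free whatever the callback allocated once writeout is done.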
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
  char **dynames;
  size_t ndynames;
} ctf_name_list_accum_cb_arg_t;

libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently, a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
(say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when that happens *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *).
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtbyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
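
The centralized choice between the static and dynamic halves of a name
table can be pictured with the toy model below. It is a sketch under
stated assumptions, not the libctf implementation: the toy_* types, the
fixed 16-entry dynamic table, linear search in place of real hashes, and
the use of 0 as "not found" are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct toy_entry
{
  const char *name;
  unsigned long type;
} toy_entry_t;

/* Hypothetical stand-in for a name table: a fixed half filled once at
   open time, and a growable half used only by writable dicts.  */
typedef struct toy_names
{
  const toy_entry_t *fixed;
  size_t nfixed;
  toy_entry_t dynamic[16];
  size_t ndynamic;
} toy_names_t;

typedef struct toy_dict
{
  int writable;
  toy_names_t structs;
} toy_dict_t;

/* Look NAME up in exactly one half, chosen by writability: the two halves
   are never mixed.  Returns 0 (toy "not found") if NAME is absent.  */
static unsigned long
toy_lookup (const toy_dict_t *fp, const char *name)
{
  const toy_entry_t *tab;
  size_t n, i;

  assert (fp != NULL && name != NULL);

  if (fp->writable)
    {
      tab = fp->structs.dynamic;
      n = fp->structs.ndynamic;
    }
  else
    {
      tab = fp->structs.fixed;
      n = fp->structs.nfixed;
    }

  for (i = 0; i < n; i++)
    if (strcmp (tab[i].name, name) == 0)
      return tab[i].type;
  return 0;
}

/* Add a name to a writable dict.  Returns 0 on success, -1 if the dict is
   read-only or the toy table is full.  */
static int
toy_add (toy_dict_t *fp, const char *name, unsigned long type)
{
  toy_names_t *t = &fp->structs;

  if (!fp->writable || t->ndynamic == 16)
    return -1;

  t->dynamic[t->ndynamic].name = name;
  t->dynamic[t->ndynamic].type = type;
  t->ndynamic++;
  return 0;
}
```

Because every caller goes through one lookup function, a newly added
name is visible immediately on a writable dict, with no serialize and
reopen step in between.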
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
/* Accumulate the names and a count of the names in the link output hash. */
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
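
The ctf_link_add_strtab step above can be pictured with a small self-contained sketch. It does not use libctf at all: demo_atom_t, demo_elf_string_f, demo_add_strtab, demo_elf_iter_t and demo_next_elf_string are hypothetical names invented for illustration, and the real ctf_link_strtab_string_f callback may well have a different shape. The sketch only shows the idea that every string the linker reports from the ELF strtab is marked external (and remembers its ELF offset), while strings the linker never reports stay in the internal strtab.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a CTF string atom: not libctf's real layout.  */
typedef struct demo_atom
{
  const char *str;	/* The string itself.  */
  int external;		/* Nonzero once the string was found in the ELF strtab.  */
  uint32_t offset;	/* Offset in the ELF strtab, valid only if external.  */
} demo_atom_t;

/* Hypothetical callback type: return the next ELF strtab string and store
   its offset in *OFFSET, or return NULL when there are no more strings.  */
typedef const char *demo_elf_string_f (uint32_t *offset, void *arg);

/* Mark every atom that also appears in the ELF strtab as external.
   Returns the number of atoms moved out of the internal strtab.  */
size_t
demo_add_strtab (demo_atom_t *atoms, size_t natoms,
		 demo_elf_string_f *next, void *arg)
{
  const char *s;
  uint32_t off;
  size_t moved = 0;

  while ((s = next (&off, arg)) != NULL)
    for (size_t i = 0; i < natoms; i++)
      if (!atoms[i].external && strcmp (atoms[i].str, s) == 0)
	{
	  atoms[i].external = 1;
	  atoms[i].offset = off;
	  moved++;
	}
  return moved;
}

/* Hypothetical iterator state over a raw, NUL-separated ELF strtab.  */
typedef struct demo_elf_iter
{
  const char *tab;
  size_t len;
  size_t pos;
} demo_elf_iter_t;

/* A demo_elf_string_f that walks the raw strtab one string at a time.  */
const char *
demo_next_elf_string (uint32_t *offset, void *arg)
{
  demo_elf_iter_t *it = arg;
  const char *s;

  if (it->pos >= it->len)
    return NULL;
  s = it->tab + it->pos;
  *offset = (uint32_t) it->pos;
  it->pos += strlen (s) + 1;
  return s;
}
```

A caller would fill in a demo_elf_iter_t pointing at the finalized strtab, then call demo_add_strtab once. The sketch compares strings with a linear scan for brevity; a real implementation would presumably use a hash lookup.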
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
static void
ctf_accumulate_archive_names (void *key, void *value, void *arg_)
{
  const char *name = (const char *) key;
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
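
The compatibility approach described above (keep the old type name, and leave a trivial wrapper behind for each renamed function) can be sketched in miniature. Everything here is hypothetical: demo_dict_t, demo_file_t, demo_dict_new, demo_dict_close, demo_file_close and demo_closed are illustration-only names, and aliasing both type names to one struct is an assumption about how such an alias might be written, not a copy of ctf-api.h. In particular, this sketch does not try to reproduce the pointer-compatibility warnings mentioned above.

```c
#include <stdlib.h>

/* New name for the type (hypothetical, not libctf's declaration).  */
typedef struct demo_dict
{
  int refcnt;
} demo_dict_t;

/* Old name, kept so that existing callers keep compiling.  */
typedef struct demo_dict demo_file_t;

/* Counts real frees, so the demo can observe what happened.  */
int demo_closed;

/* Allocate a dict with a single reference; NULL on allocation failure.  */
demo_dict_t *
demo_dict_new (void)
{
  demo_dict_t *dp = malloc (sizeof (*dp));

  if (dp != NULL)
    dp->refcnt = 1;
  return dp;
}

/* The renamed function: drop a reference, freeing on the last one.  */
void
demo_dict_close (demo_dict_t *dp)
{
  if (dp == NULL)
    return;
  if (--dp->refcnt > 0)
    return;
  free (dp);
  demo_closed++;
}

/* The old entry point survives as a trivial wrapper, so the exported
   symbol still exists and old binaries keep working.  */
void
demo_file_close (demo_file_t *fp)
{
  demo_dict_close (fp);
}
```

Because the wrapper simply forwards, old and new callers can be mixed freely on the same object; the only cost is one extra exported symbol per renamed function.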
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
  ctf_dict_t *fp = (ctf_dict_t *) value;
  char **names;
  ctf_dict_t **files;
  ctf_name_list_accum_cb_arg_t *arg = (ctf_name_list_accum_cb_arg_t *) arg_;

  if ((names = realloc (arg->names, sizeof (char *) * ++(arg->i))) == NULL)
    {
      (arg->i)--;
      ctf_set_errno (arg->fp, ENOMEM);
      return;
    }

  if ((files = realloc (arg->files, sizeof (ctf_dict_t *) * arg->i)) == NULL)
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t, which will appear in the output and will
itself become the shared parent: it contains the types that all
TUs have in common (in all link modes) and, by default, all types that
do not have conflicting definitions. Input files that are themselves
products of ld -r are supported, though this is not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
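The string-table handoff described for ctf_link_add_strtab can be sketched in isolation. This is a minimal illustration, not libctf's implementation: every name here (strtab_iter_f, sketch_str_t, sketch_add_strtab, toy_iter) is hypothetical, and the real callback type is ctf_link_strtab_string_f, whose signature is not reproduced here. The sketch only records offsets and never keeps the pointers handed over by the callback, in line with the lifetime rule above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the linker-provided callback: returns each ELF
   strtab string in turn and stores its offset, or NULL when exhausted.  */
typedef const char *strtab_iter_f (uint32_t *offset, void *arg);

/* One string referenced by the CTF being linked (hypothetical layout).  */
typedef struct sketch_str
{
  const char *s;      /* String content.  */
  int external;       /* Nonzero once it is known to live in the ELF strtab.  */
  uint32_t offset;    /* ELF strtab offset, valid only if external.  */
} sketch_str_t;

/* Mark every CTF string that also appears in the ELF strtab as external,
   recording its ELF offset.  Strings never seen stay internal.  Returns the
   number of strings moved.  Only offsets are kept, never the callback's
   string pointers.  */
static size_t
sketch_add_strtab (sketch_str_t *strs, size_t nstrs,
                   strtab_iter_f *iter, void *arg)
{
  const char *elf_str;
  uint32_t elf_off;
  size_t moved = 0;

  assert (strs != NULL && iter != NULL);

  while ((elf_str = iter (&elf_off, arg)) != NULL)
    for (size_t i = 0; i < nstrs; i++)
      if (!strs[i].external && strcmp (strs[i].s, elf_str) == 0)
        {
          strs[i].external = 1;
          strs[i].offset = elf_off;
          moved++;
        }
  return moved;
}

/* Toy "linker side": iterates over a fixed list of strings and offsets.  */
typedef struct toy_elf_strtab
{
  const char **strs;
  const uint32_t *offs;
  size_t n;
  size_t pos;
} toy_elf_strtab_t;

static const char *
toy_iter (uint32_t *offset, void *arg)
{
  toy_elf_strtab_t *t = arg;

  if (t->pos >= t->n)
    return NULL;
  *offset = t->offs[t->pos];
  return t->strs[t->pos++];
}
```

A real implementation would use a hash lookup rather than the quadratic scan shown here; the scan just keeps the sketch short.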
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
    {
      (arg->i)--;
      ctf_set_errno (arg->fp, ENOMEM);
      return;
    }
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
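A member-name changer of the kind described above might look roughly like this. It is a sketch under stated assumptions: the function name and its two-argument signature are hypothetical (the real callback type is ctf_link_memb_name_changer_f and also receives the dictionary), and the default member name is assumed to be the string ".ctf". It renames only the shared parent to "{module name}.ctf" and returns NULL for every other member, meaning "no change".

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name changer.  ARG is the module name.  Returns a newly
   allocated replacement name that the caller must free, or NULL if the name
   should be left alone.  (Allocation failure also yields NULL here, so it is
   indistinguishable from "no change" in this sketch.)  */
static char *
sketch_memb_name_changer (const char *name, void *arg)
{
  const char *module = arg;
  size_t len;
  char *newname;

  assert (name != NULL && module != NULL);

  /* Only rename the default-named (shared parent) member.  */
  if (strcmp (name, ".ctf") != 0)
    return NULL;

  len = strlen (module) + sizeof (".ctf");   /* sizeof counts the NUL.  */
  if ((newname = malloc (len)) == NULL)
    return NULL;

  snprintf (newname, len, "%s.ctf", module);
  return newname;
}
```

Because each distinct input name maps to at most one output name here, the changer cannot produce two archive members with the same name.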
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
  /* Allow the caller to get in and modify the name at the last minute. If the
     caller *does* modify the name, we have to stash away the new name the
     caller returned so we can free it later on. (The original name is the key
     of the ctf_link_outputs hash and is freed by the dynhash machinery.) */

  if (fp->ctf_link_memb_name_changer)
    {
      char **dynames;
      char *dyname;
      void *nc_arg = fp->ctf_link_memb_name_changer_arg;

      dyname = fp->ctf_link_memb_name_changer (fp, name, nc_arg);

      if (dyname != NULL)
        {
          if ((dynames = realloc (arg->dynames,
                                  sizeof (char *) * ++(arg->ndynames))) == NULL)
            {
              (arg->ndynames)--;
              ctf_set_errno (arg->fp, ENOMEM);
              return;
            }
          arg->dynames = dynames;
          name = (const char *) dyname;
        }
    }
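The grow-by-one realloc pattern used in the code above, with the count rolled back on failure, can be shown on its own. This is a standalone sketch, not libctf code: sketch_name_list_t and sketch_name_list_push are hypothetical names, and the err field stands in for ctf_set_errno.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical accumulator, loosely modelled on the callback argument.  */
typedef struct sketch_name_list
{
  char **names;   /* Growable array of borrowed name pointers.  */
  size_t n;       /* Number of valid entries in NAMES.  */
  int err;        /* Last error, 0 if none.  */
} sketch_name_list_t;

/* Append NAME to LIST.  The count is bumped before the realloc and rolled
   back if the realloc fails, so LIST is left unchanged on error.  The result
   goes into a temporary so the old array is not lost on failure.  */
static int
sketch_name_list_push (sketch_name_list_t *list, char *name)
{
  char **names;

  assert (list != NULL);

  if ((names = realloc (list->names, sizeof (char *) * ++(list->n))) == NULL)
    {
      (list->n)--;
      list->err = ENOMEM;
      return -1;
    }

  list->names = names;
  list->names[list->n - 1] = name;
  return 0;
}
```

Assigning through a temporary matters: writing the realloc result straight into list->names would leak the old array whenever realloc returns NULL.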
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
  arg->names = names;
  arg->names[(arg->i) - 1] = (char *) name;
  arg->files = files;
  arg->files[(arg->i) - 1] = fp;
}
libctf: add CU-mapping machinery
2019-07-20 21:44:44 +08:00
/* Change the name of the parent CTF section, if the name transformer has got to
   it. */
static void
ctf_change_parent_name (void *key _libctf_unused_, void *value, void *arg)
{
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
  ctf_dict_t *fp = (ctf_dict_t *) value;
libctf: add CU-mapping machinery
2019-07-20 21:44:44 +08:00
  const char *name = (const char *) arg;

  ctf_parent_name_set (fp, name);
}
2021-01-05 21:25:56 +08:00
/* Warn if we may suffer information loss because the CTF input files are too
   old. Usually we provide complete backward compatibility, but compiler
   changes etc which never hit a release may have a flag in the header that
   simply prevents those changes from being used. */
static void
ctf_link_warn_outdated_inputs (ctf_dict_t *fp)
{
  ctf_next_t *i = NULL;
  void *name_;
libctf: ctf-link outdated input check faulty
This check has a pair of faults which, combined, can lead to memory
corruption. Firstly, it assumes that the values of the ctf_link_inputs
hash are ctf_dict_t's: they are not, they are ctf_link_input_t's, a much
shorter structure. So the flags check which is the core of this is
faulty (but happens, by chance, to give the right output on most
architectures, since usually we happen to get a 0 here, so the test that
checks this usually passes). Worse, the warning that is emitted when
the test fails is added to the wrong dict -- it's added to the input
dict, whose warning list is never consumed, rendering the whole check
useless. But the dict it adds to is still the wrong type, so we end up
overwriting something deep in memory (or, much more likely,
dereferencing a garbage pointer and crashing).
Fixing both reveals another problem: the link input is an *archive*
consisting of multiple members, so we have to consider whether to check
all of them for the outdated-func-info thing we are checking here.
However, no compiler exists that emits a mixture of members with this
flag on and members with it off, and the linker always reserializes (and
upgrades) such things when it sees them: so all members in a given
archive must have the same value of the flag, so we only need to check
one member per input archive.
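The resulting check (look at the first member of each input archive only, and warn if it has function info but lacks the new-format flag) can be sketched without libctf. All types and names below are hypothetical stand-ins, and the flag value is a placeholder rather than the real CTF_F_NEWFUNCINFO definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_F_NEWFUNCINFO 0x2   /* Placeholder value, for illustration.  */

/* Hypothetical cut-down header: just the fields the check needs.  */
typedef struct sketch_hdr
{
  uint32_t flags;     /* Header flags.  */
  uint32_t funcoff;   /* Offset of the function info section.  */
  uint32_t varoff;    /* Offset of the following (variable) section.  */
} sketch_hdr_t;

/* Hypothetical input archive: an array of member headers.  */
typedef struct sketch_archive
{
  const sketch_hdr_t *members;
  size_t nmembers;
} sketch_archive_t;

/* Return 1 if ARC holds old-format function info that would be dropped.
   All members are assumed to share the flag, so only the first member is
   inspected.  Missing or empty archives are simply skipped.  */
static int
sketch_input_is_outdated (const sketch_archive_t *arc)
{
  const sketch_hdr_t *h;

  if (arc == NULL || arc->nmembers == 0)
    return 0;

  assert (arc->members != NULL);
  h = &arc->members[0];

  return !(h->flags & SKETCH_F_NEWFUNCINFO)
         && (h->varoff - h->funcoff) > 0;
}
```

The key point from the fix is that the value being inspected is a member dictionary's header obtained from the archive, not the hash value itself reinterpreted as a dictionary.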
libctf/
PR libctf/29983
* ctf-link.c (ctf_link_warn_outdated_inputs): Get the types of
members of ctf_link_inputs right, fixing a possible spurious
test failure / wild pointer deref / overwrite. Emit the
warning message into the right dict.
2023-01-09 21:43:09 +08:00
  void *input_;
2021-01-05 21:25:56 +08:00
  int err;
libctf: ctf-link outdated input check faulty
2023-01-09 21:43:09 +08:00
  while ((err = ctf_dynhash_next (fp->ctf_link_inputs, &i, &name_, &input_)) == 0)
2021-01-05 21:25:56 +08:00
    {
      const char *name = (const char *) name_;
libctf: ctf-link outdated input check faulty
2023-01-09 21:43:09 +08:00
      ctf_link_input_t *input = (ctf_link_input_t *) input_;
      ctf_next_t *j = NULL;
      ctf_dict_t *ifp;
      int err;

      /* We only care about CTF archives by this point: lazy-opened archives
         have always been opened by this point, and short-circuited entries have
         a matching corresponding archive member. Entries with NULL clin_arc can
         exist, and constitute old entries renamed via a name changer: the
         renamed entries exist elsewhere in the list, so we can just skip
         those. */

      if (!input->clin_arc)
        continue;

      /* All entries in the archive will necessarily contain the same
         CTF_F_NEWFUNCINFO flag, so we only need to check the first. We don't
         even need to do that if we can't open it for any reason at all: the
         link will fail later on regardless, since an input can't be opened. */

      ifp = ctf_archive_next (input->clin_arc, &j, NULL, 0, &err);
      if (!ifp)
        continue;
      ctf_next_destroy (j);
2021-01-05 21:25:56 +08:00

      if (!(ifp->ctf_header->cth_flags & CTF_F_NEWFUNCINFO)
          && (ifp->ctf_header->cth_varoff - ifp->ctf_header->cth_funcoff) > 0)
libctf: ctf-link outdated input check faulty
This check has a pair of faults which, combined, can lead to memory
corruption. Firstly, it assumes that the values of the ctf_link_inputs
hash are ctf_dict_t's: they are not, they are ctf_link_input_t's, a much
shorter structure. So the flags check which is the core of this is
faulty (but happens, by chance, to give the right output on most
architectures, since usually we happen to get a 0 here, so the test that
checks this usually passes). Worse, the warning that is emitted when
the test fails is added to the wrong dict -- it's added to the input
dict, whose warning list is never consumed, rendering the whole check
useless. But the dict it adds to is still the wrong type, so we end up
overwriting something deep in memory (or, much more likely,
dereferencing a garbage pointer and crashing).
Fixing both reveals another problem: the link input is an *archive*
consisting of multiple members, so we have to consider whether to check
all of them for the outdated-func-info thing we are checking here.
However, no compiler exists that emits a mixture of members with this
flag on and members with it off, and the linker always reserializes (and
upgrades) such things when it sees them: so all members in a given
archive must have the same value of the flag, so we only need to check
one member per input archive.
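The shape of the corrected check can be sketched in isolation. This is a minimal illustration, not the libctf code: every `sketch_` type and the flag value are hypothetical stand-ins, and the offset test is written as `varoff > funcoff`, on the assumption that the variable section follows the func info section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the libctf types; the real structures are
   larger and live in ctf-impl.h.  */
#define SKETCH_F_NEWFUNCINFO 0x2   /* assumed flag value, for illustration */

typedef struct sketch_header
{
  uint32_t flags;
  uint32_t funcoff;
  uint32_t varoff;
} sketch_header_t;

typedef struct sketch_dict
{
  sketch_header_t header;
} sketch_dict_t;

/* A link input wraps an archive of member dicts; it is not itself a dict,
   so its fields must never be read as if it were one.  */
typedef struct sketch_link_input
{
  sketch_dict_t *members;   /* NULL for entries renamed away */
  size_t nmembers;
} sketch_link_input_t;

/* Return 1 if INPUT carries func info in the old format, else 0.  Only the
   first member is inspected, on the assumption that every member of one
   archive has the same flag value.  */
static int
sketch_input_is_outdated (const sketch_link_input_t *input)
{
  const sketch_dict_t *first;

  if (input->members == NULL || input->nmembers == 0)
    return 0;

  first = &input->members[0];
  return !(first->header.flags & SKETCH_F_NEWFUNCINFO)
         && first->header.varoff > first->header.funcoff;
}
```

The point of the sketch is the two-step access: cast the hash value to the input type first, then reach through it to one member dict before looking at any header field.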
libctf/
PR libctf/29983
* ctf-link.c (ctf_link_warn_outdated_inputs): Get the types of
members of ctf_link_inputs right, fixing a possible spurious
test failure / wild pointer deref / overwrite. Emit the
warning message into the right dict.
2023-01-09 21:43:09 +08:00
        ctf_err_warn (fp, 1, 0, _("linker input %s has CTF func info but uses "
                                  "an old, unreleased func info format: "
                                  "this func info section will be dropped."),
2021-01-05 21:25:56 +08:00
                      name);
    }
  if (err != ECTF_NEXT_END)
    ctf_err_warn (fp, 0, err, _("error checking for outdated inputs"));
}
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
/* Write out a CTF archive (if there are per-CU CTF files) or a CTF file
   (otherwise) into a new dynamically-allocated string, and return it.
   Members with sizes above THRESHOLD are compressed. */
unsigned char *
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
ctf_link_write (ctf_dict_t *fp, size_t *size, size_t threshold)
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
{
  ctf_name_list_accum_cb_arg_t arg;
  char **names;
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
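A name changer of the kind described above might look like the following sketch. It is illustrative only: `sketch_change_memb_name` is a hypothetical function, the callback argument is assumed to be a module name, and the dict parameter that the real callback type in ctf-api.h receives is omitted so the sketch stands alone. The default member name is assumed to be ".ctf" (the value of _CTF_SECTION).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical member-name changer: rename the default-named member to
   "{module name}.ctf" and leave every other member name unchanged.
   Returns a newly allocated name, or NULL if no change is desired.
   ARG is assumed to point to the module name.  An allocation failure is
   also reported as NULL here, purely to keep the sketch short.  */
static char *
sketch_change_memb_name (const char *name, void *arg)
{
  const char *module = (const char *) arg;
  size_t len;
  char *out;

  /* Only the shared parent dictionary is renamed, so no two input names
     can end up mapped to the same output name.  */
  if (name == NULL || strcmp (name, ".ctf") != 0)
    return NULL;

  len = strlen (module) + strlen (".ctf") + 1;
  if ((out = malloc (len)) == NULL)
    return NULL;
  snprintf (out, len, "%s.ctf", module);
  return out;
}
```

Renaming only one input name keeps the mapping one-to-one, which avoids the same-named-member problem noted in the list above.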
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
  char *transformed_name = NULL;
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
2020-11-20 21:34:04 +08:00
  ctf_dict_t **files;
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
  FILE *f = NULL;
libctf, ld: fix symtypetab and var section population under ld -r
The variable section in a CTF dict is meant to contain the types of
variables that do not appear in the symbol table (mostly file-scope
static declarations). We implement this by having the compiler emit
all potential data symbols into both sections, then delete those
symbols from the variable section that correspond to data symbols the
linker has reported.
Unfortunately, the check for this in ctf_serialize is wrong: rather than
checking the set of linker-reported symbols, we check the set of names
in the data object symtypetab section: if the linker has reported no
symbols at all (usually if ld -r has been run, or if a non-linker
program that does not use symbol tables is calling ctf_link) this will
include every single symbol, emptying the variable section completely.
Worse, when ld -r is in use, we want to force writeout of every
symtypetab entry on the inputs, in an indexed section, whether or not
the linker has reported them, since this isn't a final link yet and the
symbol table is not finalized (and may grow more symbols than the linker
has yet reported). But the check for this is flawed too: we were
relying on ctf_link_shuffle_syms not having been called if no symbols
exist, but that function is *always* called by ld even when ld -r is in
use: ctf_link_add_linker_symbol is the one that's not called when there
are no symbols.
We clearly need to rethink this. Using the emptiness of the set of
reported symbols as a test for ld -r is just ugly: the linker already
knows if ld -r is underway and can just tell us. So add a new linker
flag CTF_LINK_NO_FILTER_REPORTED_SYMS that is set to stop the linker
filtering the symbols in the symtypetab sections using the set that the
linker has reported: use the presence or absence of this flag to
determine whether to emit unindexed symtabs: we only remove entries from
the variable section when filtering symbols, and we only remove them if
they are in the reported symbol set, fixing the case where no symbols
are reported by the linker at all.
(The negative sense of the new CTF_LINK flag is intentional: the common
case, both for ld and for simple tools that want to do a ctf_link with
no ELF symbol table in sight, is probably to filter out symbols that no
linker has reported: i.e., for the simple tools, all of them.)
There's another wrinkle, though. It is quite possible for a non-linker
to add symbols to a dict via ctf_add_*_sym and then write it out via the
ctf_write APIs: perhaps it's preparing a dict for a later linker
invocation. Right now this would not lead to anything terribly
meaningful happening: ctf_serialize just assumes it was called via
ctf_link if symbols are present. So add an (internal-to-libctf) flag
that indicates that a writeout is happening via ctf_link_write, and set
it there (propagating it to child dicts as needed). ctf_serialize can
then spot when it is not being called by a linker, and arrange to always
write out an indexed, sorted symtypetab for fastest possible future
symbol lookup by name in that case. (The writeouts done by ld -r are
unsorted, because the only thing likely to use those symtabs is the
linker, which doesn't benefit from symtypetab sorting.)
Tests added for all three linking cases (ld -r, ld -shared, ld), with a
bit of testsuite framework enhancement to stop it unconditionally
linking the CTF to be checked by the lookup program with -shared, so
tests can now examine CTF linked with -r or indeed with no flags at all,
though the output filename is still foo.so even in this case.
Another test added for the non-linker case that endeavours to determine
whether the symtypetab is sorted by examining the order of entries
returned from ctf_symbol_next: nobody outside libctf should rely on
this ordering, but this test is not outside libctf :)
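The variable-section rule described in this message can be sketched on its own. The names below are hypothetical and the flag value is made up; only the decision itself follows the text: a variable is dropped from the variable section only when symbol filtering is on and the linker actually reported that symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag mirroring the role of CTF_LINK_NO_FILTER_REPORTED_SYMS;
   the value is invented for this sketch.  */
#define SKETCH_NO_FILTER_REPORTED_SYMS 0x1

/* Return 1 if NAME is among the NREPORTED names the linker reported.  */
static int
sketch_is_reported (const char *name, const char *const *reported,
                    size_t nreported)
{
  size_t i;

  for (i = 0; i < nreported; i++)
    if (strcmp (name, reported[i]) == 0)
      return 1;
  return 0;
}

/* Decide whether variable NAME should be removed from the variable section.
   With filtering off (ld -r), nothing is removed.  With filtering on, only
   variables the linker reported as symbols are removed, so an empty
   reported set removes nothing.  */
static int
sketch_drop_from_var_section (const char *name, int link_flags,
                              const char *const *reported, size_t nreported)
{
  if (link_flags & SKETCH_NO_FILTER_REPORTED_SYMS)
    return 0;
  return sketch_is_reported (name, reported, nreported);
}
```

The last case is the one the old check got wrong: with no reported symbols at all, the variable section must be left intact rather than emptied.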
include/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_NO_FILTER_REPORTED_SYMS): New.
ld/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_merge_ctf): Set CTF_LINK_NO_FILTER_REPORTED_SYMS
when appropriate.
libctf/ChangeLog
2021-01-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (_libctf_nonnull_): Add parameters.
(LCTF_LINKING): New flag.
(ctf_dict_t) <ctf_link_flags>: Mention it.
* ctf-link.c (ctf_link): Keep LCTF_LINKING set across call.
(ctf_write): Likewise, including in child dictionaries.
(ctf_link_shuffle_syms): Make sure ctf_dynsyms is NULL if there
are no reported symbols.
* ctf-create.c (symtypetab_delete_nonstatic_vars): Make sure
the variable has been reported as a symbol by the linker.
(symtypetab_skippable): Mention relationship between SYMFP and the
flags.
(symtypetab_density): Adjust nonnullity. Exit early if no symbols
were reported and force-indexing is off (i.e., we are doing a
final link).
(ctf_serialize): Handle the !LCTF_LINKING case by writing out an
indexed, sorted symtypetab (and allow SYMFP to be NULL in this
case). Turn sorting off if this is a non-final link. Only delete
nonstatic vars if we are filtering symbols and the linker has
reported some.
* testsuite/libctf-regression/nonstatic-var-section-ld-r*:
New test of variable and symtypetab section population when
ld -r is used.
* testsuite/libctf-regression/nonstatic-var-section-ld-executable.lk:
Likewise, when ld of an executable is used.
* testsuite/libctf-regression/nonstatic-var-section-ld.lk:
Likewise, when ld -shared alone is used.
* testsuite/libctf-regression/nonstatic-var-section-ld*.c:
Lookup programs for the above.
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.*: New
test, testing survival of symbols across ctf_write paths.
* testsuite/lib/ctf-lib.exp (run_lookup_test): New option,
nonshared, suppressing linking of the SOURCE with -shared.
2021-01-17 00:49:29 +08:00
  size_t i;
  int err;
  long fsize;
  const char *errloc;
  unsigned char *buf = NULL;

  memset (&arg, 0, sizeof (ctf_name_list_accum_cb_arg_t));
  arg.fp = fp;
  fp->ctf_flags |= LCTF_LINKING;
2021-01-05 21:25:56 +08:00
  ctf_link_warn_outdated_inputs (fp);
  if (fp->ctf_link_outputs)
    {
      ctf_dynhash_iter (fp->ctf_link_outputs, ctf_accumulate_archive_names, &arg);
      if (ctf_errno (fp) < 0)
        {
          errloc = "hash creation";
          goto err;
        }
    }
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
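The compatibility approach (new name does the work, old name survives
as a trivial wrapper) looks roughly like the sketch below. All demo_*
names are hypothetical and are not libctf API. One deliberate
simplification: here the old type name is a plain typedef alias of the
new one, so no warnings arise, whereas this commit notes that users of
ctf_file_t do see pointer-compatibility warnings.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted handle standing in for a dict.  */
typedef struct demo_dict
{
  int refcnt;
} demo_dict_t;

/* Compatibility alias: code spelling the type demo_file_t still compiles.  */
typedef demo_dict_t demo_file_t;

/* Allocate a handle with one reference.  Returns NULL on failure.  */
demo_dict_t *
demo_dict_open (void)
{
  demo_dict_t *dp = malloc (sizeof (demo_dict_t));

  if (dp != NULL)
    dp->refcnt = 1;
  return dp;
}

/* New name: drop one reference, free on the last one.
   Returns the number of references remaining.  */
int
demo_dict_close (demo_dict_t *dp)
{
  if (dp == NULL)
    return 0;

  assert (dp->refcnt > 0);
  if (--dp->refcnt > 0)
    return dp->refcnt;

  free (dp);
  return 0;
}

/* Old name: kept as an exported symbol so existing binaries and
   sources keep working, but all the logic lives in the new function.  */
int
demo_file_close (demo_file_t *fp)
{
  return demo_dict_close (fp);
}
```

Because C does not mangle names, keeping the old function as a real
symbol (rather than a macro) is what lets already-compiled callers
continue to link.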
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
  /* No extra outputs? Just write a simple ctf_dict_t. */
  if (arg.i == 0)
libctf, ld: fix symtypetab and var section population under ld -r
The variable section in a CTF dict is meant to contain the types of
variables that do not appear in the symbol table (mostly file-scope
static declarations). We implement this by having the compiler emit
all potential data symbols into both sections, then delete those
symbols from the variable section that correspond to data symbols the
linker has reported.
Unfortunately, the check for this in ctf_serialize is wrong: rather than
checking the set of linker-reported symbols, we check the set of names
in the data object symtypetab section: if the linker has reported no
symbols at all (usually if ld -r has been run, or if a non-linker
program that does not use symbol tables is calling ctf_link) this will
include every single symbol, emptying the variable section completely.
Worse, when ld -r is in use, we want to force writeout of every
symtypetab entry on the inputs, in an indexed section, whether or not
the linker has reported them, since this isn't a final link yet and the
symbol table is not finalized (and may grow more symbols than the linker
has yet reported). But the check for this is flawed too: we were
relying on ctf_link_shuffle_syms not having been called if no symbols
exist, but that function is *always* called by ld even when ld -r is in
use: ctf_link_add_linker_symbol is the one that's not called when there
are no symbols.
We clearly need to rethink this. Using the emptiness of the set of
reported symbols as a test for ld -r is just ugly: the linker already
knows if ld -r is underway and can just tell us. So add a new linker
flag CTF_LINK_NO_FILTER_REPORTED_SYMS that is set to stop the linker
filtering the symbols in the symtypetab sections using the set that the
linker has reported: use the presence or absence of this flag to
determine whether to emit unindexed symtabs: we only remove entries from
the variable section when filtering symbols, and we only remove them if
they are in the reported symbol set, fixing the case where no symbols
are reported by the linker at all.
(The negative sense of the new CTF_LINK flag is intentional: the common
case, both for ld and for simple tools that want to do a ctf_link with
no ELF symbol table in sight, is probably to filter out symbols that no
linker has reported: i.e., for the simple tools, all of them.)
There's another wrinkle, though. It is quite possible for a non-linker
to add symbols to a dict via ctf_add_*_sym and then write it out via the
ctf_write APIs: perhaps it's preparing a dict for a later linker
invocation. Right now this would not lead to anything terribly
meaningful happening: ctf_serialize just assumes it was called via
ctf_link if symbols are present. So add an (internal-to-libctf) flag
that indicates that a writeout is happening via ctf_link_write, and set
it there (propagating it to child dicts as needed). ctf_serialize can
then spot when it is not being called by a linker, and arrange to always
write out an indexed, sorted symtypetab for fastest possible future
symbol lookup by name in that case. (The writeouts done by ld -r are
unsorted, because the only thing likely to use those symtabs is the
linker, which doesn't benefit from symtypetab sorting.)
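The three writeout cases described above (non-linker writeout, non-final link such as ld -r, and final link) can be sketched as a small decision function. This is only an illustration of the reasoning in this message, not the code in ctf_serialize: the flag values, the struct and both function names are hypothetical, and the sketch merges into one word what libctf keeps in two separate flag fields.

```c
/* Hypothetical flag values, chosen for this sketch only.  */
#define SKETCH_LINKING 0x1            /* Writeout is happening via the link path.  */
#define SKETCH_NO_FILTER_REPORTED 0x2 /* Not a final link (e.g. ld -r).  */

struct writeout_plan
{
  int indexed;      /* Emit index sections mapping entries to names.  */
  int sorted;       /* Sort the entries by name.  */
  int filter_syms;  /* Keep only symbols the linker reported.  */
};

struct writeout_plan
plan_writeout (int flags)
{
  struct writeout_plan plan = { 0, 0, 0 };

  if (!(flags & SKETCH_LINKING))
    {
      /* Non-linker writeout: indexed and sorted for fast lookup by name.  */
      plan.indexed = 1;
      plan.sorted = 1;
    }
  else if (flags & SKETCH_NO_FILTER_REPORTED)
    {
      /* Non-final link: keep everything, indexed, but do not sort.  */
      plan.indexed = 1;
    }
  else
    {
      /* Final link: filter by the reported symbols, emit unindexed.  */
      plan.filter_syms = 1;
    }
  return plan;
}

/* A variable leaves the variable section only when filtering is on and
   the linker actually reported it as a symbol.  */
int
drop_from_var_section (const struct writeout_plan *plan, int reported)
{
  return plan->filter_syms && reported;
}
```

Note that with no reported symbols at all, drop_from_var_section never returns nonzero, which is the behaviour the fix is after: the variable section is no longer emptied.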
Tests added for all three linking cases (ld -r, ld -shared, ld), with a
bit of testsuite framework enhancement to stop it unconditionally
linking the CTF to be checked by the lookup program with -shared, so
tests can now examine CTF linked with -r or indeed with no flags at all,
though the output filename is still foo.so even in this case.
Another test added for the non-linker case that endeavours to determine
whether the symtypetab is sorted by examining the order of entries
returned from ctf_symbol_next: nobody outside libctf should rely on
this ordering, but this test is not outside libctf :)
include/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_NO_FILTER_REPORTED_SYMS): New.
ld/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_merge_ctf): Set CTF_LINK_NO_FILTER_REPORTED_SYMS
when appropriate.
libctf/ChangeLog
2021-01-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.c (_libctf_nonnull_): Add parameters.
(LCTF_LINKING): New flag.
(ctf_dict_t) <ctf_link_flags>: Mention it.
* ctf-link.c (ctf_link): Keep LCTF_LINKING set across call.
(ctf_write): Likewise, including in child dictionaries.
(ctf_link_shuffle_syms): Make sure ctf_dynsyms is NULL if there
are no reported symbols.
* ctf-create.c (symtypetab_delete_nonstatic_vars): Make sure
the variable has been reported as a symbol by the linker.
(symtypetab_skippable): Mention relationship between SYMFP and the
flags.
(symtypetab_density): Adjust nonnullity. Exit early if no symbols
were reported and force-indexing is off (i.e., we are doing a
final link).
(ctf_serialize): Handle the !LCTF_LINKING case by writing out an
indexed, sorted symtypetab (and allow SYMFP to be NULL in this
case). Turn sorting off if this is a non-final link. Only delete
nonstatic vars if we are filtering symbols and the linker has
reported some.
* testsuite/libctf-regression/nonstatic-var-section-ld-r*:
New test of variable and symtypetab section population when
ld -r is used.
* testsuite/libctf-regression/nonstatic-var-section-ld-executable.lk:
Likewise, when ld of an executable is used.
* testsuite/libctf-regression/nonstatic-var-section-ld.lk:
Likewise, when ld -shared alone is used.
* testsuite/libctf-regression/nonstatic-var-section-ld*.c:
Lookup programs for the above.
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.*: New
test, testing survival of symbols across ctf_write paths.
* testsuite/lib/ctf-lib.exp (run_lookup_test): New option,
nonshared, suppressing linking of the SOURCE with -shared.
2021-01-17 00:49:29 +08:00
    {
      unsigned char *ret = ctf_write_mem (fp, size, threshold);
      fp->ctf_flags &= ~LCTF_LINKING;
      return ret;
    }
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
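The string handling described for ctf_link_add_strtab amounts to a partition: a string the linker reports as present in the ELF strtab can be referenced by its ELF offset, and anything else has to stay in the CTF section's internal strtab. A minimal sketch of that test follows, assuming a plain array of (string, offset) pairs; struct elf_str and lookup_external are hypothetical names for illustration, and a linear scan stands in for whatever lookup structure libctf really uses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one (string, offset) pair reported by the
   linker's callback.  */
struct elf_str
{
  const char *str;
  unsigned long offset;
};

/* Look STR up among the N strings the linker reported.  Returns 1 and
   stores the ELF strtab offset in *OFFSET if found.  Returns 0 and leaves
   *OFFSET alone otherwise: such a string must remain in the internal CTF
   strtab of the output.  */
int
lookup_external (const struct elf_str *elf, size_t n, const char *str,
                 unsigned long *offset)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (strcmp (elf[i].str, str) == 0)
      {
        *offset = elf[i].offset;
        return 1;
      }
  return 0;
}
```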
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
  /* Writing an archive. Stick ourselves (the shared repository, parent of all
     other archives) on the front of it with the default name. */
  if ((names = realloc (arg.names, sizeof (char *) * (arg.i + 1))) == NULL)
    {
      errloc = "name reallocation";
      goto err_no;
    }
  arg.names = names;
  memmove (&(arg.names[1]), arg.names, sizeof (char *) * (arg.i));
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
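A member name changer in the spirit of the {module name}.ctf scheme mentioned above might look like the following sketch. It is not the kernel's code and does not use libctf's headers: sketch_dict_t is a hypothetical stand-in for the dictionary type, SKETCH_DEFAULT_MEMBER assumes the default member is named ".ctf", and the callback argument is assumed to be the module name. Returning a freshly allocated string (or NULL for "no change") follows the ChangeLog below, which says the resulting names are freed by ctf_link_write.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for libctf's dictionary type.  */
typedef struct sketch_dict sketch_dict_t;

/* Assumed name of the default archive member.  */
#define SKETCH_DEFAULT_MEMBER ".ctf"

/* Rename the default member to "<module>.ctf", where ARG is the module
   name.  Every other member keeps its name (NULL means no change).  The
   returned string is heap-allocated and owned by the caller.  */
char *
sketch_name_changer (sketch_dict_t *fp, const char *name, void *arg)
{
  const char *module = arg;
  size_t len;
  char *out;

  (void) fp;
  if (module == NULL || strcmp (name, SKETCH_DEFAULT_MEMBER) != 0)
    return NULL;

  len = strlen (module) + strlen (".ctf") + 1;
  if ((out = malloc (len)) == NULL)
    return NULL;
  snprintf (out, len, "%s.ctf", module);
  return out;
}
```

Because only one input name is ever renamed, this changer cannot map two input names to the same output name, which avoids the duplicate-member case warned about above.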
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
  arg.names[0] = (char *) _CTF_SECTION;
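The realloc, memmove and assignment to element 0 seen in ctf_link_write form a common "prepend to a heap array" idiom. A standalone sketch of it follows; prepend_name is a hypothetical helper written for illustration and is not part of libctf.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grow *NAMES (holding COUNT entries) by one slot and
   put FIRST at index 0.  Returns 0 on success, or -1 on allocation failure,
   in which case *NAMES is untouched and still valid.  */
int
prepend_name (char ***names, size_t count, char *first)
{
  char **grown = realloc (*names, sizeof (char *) * (count + 1));

  if (grown == NULL)
    return -1;

  /* memmove, not memcpy: the source and destination ranges overlap.  */
  memmove (&grown[1], grown, sizeof (char *) * count);
  grown[0] = first;
  *names = grown;
  return 0;
}
```

Assigning the realloc result to a temporary first, as the original code does with names, keeps the old pointer usable for cleanup if the allocation fails.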
libctf: add CU-mapping machinery
2019-07-20 21:44:44 +08:00
  if (fp->ctf_link_memb_name_changer)
    {
      void *nc_arg = fp->ctf_link_memb_name_changer_arg;

      transformed_name = fp->ctf_link_memb_name_changer (fp, _CTF_SECTION,
                                                         nc_arg);

      if (transformed_name != NULL)
        {
          arg.names[0] = transformed_name;
          ctf_dynhash_iter (fp->ctf_link_outputs, ctf_change_parent_name,
                            transformed_name);
        }
    }
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
libctf, ld: fix symtypetab and var section population under ld -r
2021-01-17 00:49:29 +08:00
  /* Propagate the link flags to all the dicts in this link. */
  for (i = 0; i < arg.i; i++)
    {
      arg.files[i]->ctf_link_flags = fp->ctf_link_flags;
      arg.files[i]->ctf_flags |= LCTF_LINKING;
    }
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default)). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
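
The code further down this listing builds that buffer by writing the archive to a temporary file, measuring it with fseek and ftell, rewinding, and reading it back into a malloc'd buffer. A minimal standalone sketch of the same pattern follows. The helper name slurp_stream is hypothetical and is not part of libctf; only standard C library calls are used, and error reporting is reduced to returning NULL.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read the whole of stream F into a freshly malloc'd buffer, storing its
   length in *SIZE.  Returns NULL on error; the caller frees the result.
   (slurp_stream is a hypothetical helper, not a libctf function.)  */
static unsigned char *
slurp_stream (FILE *f, size_t *size)
{
  long fsize;
  unsigned char *buf;

  assert (f != NULL && size != NULL);

  /* Find the length: seek to the end, ask for the offset, seek back.  */
  if (fseek (f, 0, SEEK_END) < 0)
    return NULL;
  if ((fsize = ftell (f)) < 0)
    return NULL;
  if (fseek (f, 0, SEEK_SET) < 0)
    return NULL;

  /* Allocate at least one byte so that an empty stream is not mistaken
     for an allocation failure.  */
  if ((buf = malloc (fsize > 0 ? (size_t) fsize : 1)) == NULL)
    return NULL;

  /* One fread of the whole length; anything short of that is an error.  */
  if (fsize > 0 && fread (buf, (size_t) fsize, 1, f) != 1)
    {
      free (buf);
      return NULL;
    }

  *size = (size_t) fsize;
  return buf;
}
```

A caller would typically create the stream with tmpfile (), write the serialized data to it, and then pass it to the helper. The intervening fseek also satisfies the C requirement to reposition between writing to and reading from an update stream.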
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
  if ((files = realloc (arg.files,
libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file". Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago. So the term "CTF file"
refers to something that is never a file! This is at best confusing.
The type has also historically been known as a "CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.
So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead. Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.
Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).
binutils/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
(dump_ctf_archive_member): Likewise.
(dump_section_as_ctf): Likewise. Use ctf_dict_close, not
ctf_file_close.
gdb/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.
include/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_file_t): Rename to...
(ctf_dict_t): ... this. Keep ctf_file_t around for compatibility.
(struct ctf_file): Likewise rename to...
(struct ctf_dict): ... this.
(ctf_file_close): Rename to...
(ctf_dict_close): ... this, keeping compatibility function.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this, keeping compatibility function.
All callers adjusted.
* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
(struct ctf_archive) <ctfa_nfiles>: Rename to...
<ctfa_ndicts>: ... this.
ld/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (ctf_output): This is a ctf_dict_t now.
(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
(ldlang_open_ctf): Adjust comment.
(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
ctf_dict_t. Change opaque declaration accordingly.
* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
* ldemul.h (examine_strtab_for_ctf): Likewise.
(ldemul_examine_strtab_for_ctf): Likewise.
* ldemul.c (ldemul_examine_strtab_for_ctf): Likewise.
libctf/ChangeLog
2020-11-20 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
adjusted.
(ctf_fileops): Rename to...
(ctf_dictops): ... this.
(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
<cd_id_to_dict_t>: ... this.
(ctf_file_t): Fix outdated comment.
<ctf_fileops>: Rename to...
<ctf_dictops>: ... this.
(struct ctf_archive_internal) <ctfi_file>: Rename to...
<ctfi_dict>: ... this.
* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
Rename ctf_file_close to ctf_dict_close. All users adjusted.
* ctf-create.c: Likewise. Refer to CTF dicts, not CTF containers.
(ctf_bundle_t) <ctb_file>: Rename to...
<ctb_dict>: ... this.
* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
* ctf-dedup.c: Likewise. Rename ctf_file_close to
ctf_dict_close. Refer to CTF dicts, not CTF containers.
* ctf-dump.c: Likewise.
* ctf-error.c: Likewise.
* ctf-hash.c: Likewise.
* ctf-inlines.h: Likewise.
* ctf-labels.c: Likewise.
* ctf-link.c: Likewise.
* ctf-lookup.c: Likewise.
* ctf-open-bfd.c: Likewise.
* ctf-string.c: Likewise.
* ctf-subr.c: Likewise.
* ctf-types.c: Likewise.
* ctf-util.c: Likewise.
* ctf-open.c: Likewise.
(ctf_file_close): Rename to...
(ctf_dict_close): ...this.
(ctf_file_close): New trivial wrapper around ctf_dict_close, for
compatibility.
(ctf_parent_file): Rename to...
(ctf_parent_dict): ... this.
(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
compatibility.
* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 21:34:04 +08:00
                        sizeof (struct ctf_dict *) * (arg.i + 1))) == NULL)
    {
      errloc = "ctf_dict reallocation";
      goto err_no;
    }
  arg.files = files;
  memmove (&(arg.files[1]), arg.files, sizeof (ctf_dict_t *) * (arg.i));
  arg.files[0] = fp;

  if ((f = tmpfile ()) == NULL)
    {
      errloc = "tempfile creation";
      goto err_no;
    }

  if ((err = ctf_arc_write_fd (fileno (f), arg.files, arg.i + 1,
                               (const char **) arg.names,
                               threshold)) < 0)
    {
      errloc = "archive writing";
      ctf_set_errno (fp, err);
      goto err;
    }

  if (fseek (f, 0, SEEK_END) < 0)
    {
      errloc = "seeking to end";
      goto err_no;
    }

  if ((fsize = ftell (f)) < 0)
    {
      errloc = "filesize determination";
      goto err_no;
    }

  if (fseek (f, 0, SEEK_SET) < 0)
    {
      errloc = "filepos resetting";
      goto err_no;
    }

  if ((buf = malloc (fsize)) == NULL)
    {
      errloc = "CTF archive buffer allocation";
      goto err_no;
    }

  while (!feof (f) && fread (buf, fsize, 1, f) == 0)
    if (ferror (f))
      {
        errloc = "reading archive from temporary file";
        goto err_no;
      }

  *size = fsize;
  free (arg.names);
  free (arg.files);
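The hunk above writes the archive to an anonymous temporary file and then
reads it back into a malloc'd buffer. A minimal standalone sketch of the same
pattern follows. It is not libctf code: roundtrip_via_tmpfile and its
parameters are hypothetical, and a plain fwrite stands in for
ctf_arc_write_fd.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libctf: write LEN bytes of DATA to an
   anonymous temporary file, then read the whole file back into a freshly
   allocated buffer.  Returns the buffer (caller frees) and stores its size
   in *SIZE, or returns NULL and names the failing step in *ERRLOC.  */
static unsigned char *
roundtrip_via_tmpfile (const void *data, size_t len, size_t *size,
                       const char **errloc)
{
  FILE *f;
  unsigned char *buf = NULL;
  long fsize;

  if ((f = tmpfile ()) == NULL)
    {
      *errloc = "tempfile creation";
      return NULL;
    }
  if (fwrite (data, 1, len, f) != len)
    {
      *errloc = "writing";
      goto err;
    }
  /* Seek to the end to learn the size, then rewind for reading.  */
  if (fseek (f, 0, SEEK_END) < 0 || (fsize = ftell (f)) < 0
      || fseek (f, 0, SEEK_SET) < 0)
    {
      *errloc = "filesize determination";
      goto err;
    }
  /* Allocate at least one byte so an empty file still yields a buffer.  */
  if ((buf = malloc (fsize ? (size_t) fsize : 1)) == NULL)
    {
      *errloc = "buffer allocation";
      goto err;
    }
  if (fsize > 0 && fread (buf, (size_t) fsize, 1, f) != 1)
    {
      *errloc = "reading from temporary file";
      goto err;
    }

  fclose (f);
  *size = (size_t) fsize;
  return buf;

 err:
  free (buf);
  fclose (f);
  return NULL;
}
```

As in the real function, every failure records which step failed before
jumping to a single cleanup label, so the caller can report one message of
the form "<step> failure".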
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are children
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
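A rough sketch of what a member name changer might look like, under the
contract described above (return a new name, or NULL for no change). This is
an illustration only: module_name_changer is hypothetical, the dictionary
parameter that the real ctf_link_memb_name_changer_f type takes is omitted so
that the sketch compiles without libctf, and the default member is assumed to
be named ".ctf".

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name changer: return a newly allocated replacement name,
   or NULL to keep NAME as it is.  ARG is assumed to be the module name.
   Note that an allocation failure is indistinguishable from "no change"
   in this simplified form.  */
static char *
module_name_changer (const char *name, void *arg)
{
  const char *module = arg;
  size_t len;
  char *out;

  /* Leave per-CU members alone; only rename the shared default member,
     assumed here to be called ".ctf".  */
  if (module == NULL || strcmp (name, ".ctf") != 0)
    return NULL;

  /* sizeof (".ctf") counts the terminating NUL.  */
  len = strlen (module) + sizeof (".ctf");
  if ((out = malloc (len)) == NULL)
    return NULL;
  snprintf (out, len, "%s.ctf", module);
  return out;
}
```

With a module name of "ext4", the default member would come out as
"ext4.ctf", and every other member name would be left untouched.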
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mapping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
  free (transformed_name);
  if (arg.ndynames)
    {
      size_t i;
      for (i = 0; i < arg.ndynames; i++)
        free (arg.dynames[i]);
      free (arg.dynames);
    }
2020-06-05 02:49:36 +08:00
  fclose (f);
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
  return buf;

 err_no:
  ctf_set_errno (fp, errno);
libctf, ld: fix symtypetab and var section population under ld -r
The variable section in a CTF dict is meant to contain the types of
variables that do not appear in the symbol table (mostly file-scope
static declarations). We implement this by having the compiler emit
all potential data symbols into both sections, then delete those
symbols from the variable section that correspond to data symbols the
linker has reported.
Unfortunately, the check for this in ctf_serialize is wrong: rather than
checking the set of linker-reported symbols, we check the set of names
in the data object symtypetab section: if the linker has reported no
symbols at all (usually if ld -r has been run, or if a non-linker
program that does not use symbol tables is calling ctf_link) this will
include every single symbol, emptying the variable section completely.
Worse, when ld -r is in use, we want to force writeout of every
symtypetab entry on the inputs, in an indexed section, whether or not
the linker has reported them, since this isn't a final link yet and the
symbol table is not finalized (and may grow more symbols than the linker
has yet reported). But the check for this is flawed too: we were
relying on ctf_link_shuffle_syms not having been called if no symbols
exist, but that function is *always* called by ld even when ld -r is in
use: ctf_link_add_linker_symbol is the one that's not called when there
are no symbols.
We clearly need to rethink this. Using the emptiness of the set of
reported symbols as a test for ld -r is just ugly: the linker already
knows if ld -r is underway and can just tell us. So add a new linker
flag CTF_LINK_NO_FILTER_REPORTED_SYMS that is set to stop the linker
filtering the symbols in the symtypetab sections using the set that the
linker has reported: use the presence or absence of this flag to
determine whether to emit unindexed symtabs: we only remove entries from
the variable section when filtering symbols, and we only remove them if
they are in the reported symbol set, fixing the case where no symbols
are reported by the linker at all.
(The negative sense of the new CTF_LINK flag is intentional: the common
case, both for ld and for simple tools that want to do a ctf_link with
no ELF symbol table in sight, is probably to filter out symbols that no
linker has reported: i.e., for the simple tools, all of them.)
There's another wrinkle, though. It is quite possible for a non-linker
to add symbols to a dict via ctf_add_*_sym and then write it out via the
ctf_write APIs: perhaps it's preparing a dict for a later linker
invocation. Right now this would not lead to anything terribly
meaningful happening: ctf_serialize just assumes it was called via
ctf_link if symbols are present. So add an (internal-to-libctf) flag
that indicates that a writeout is happening via ctf_link_write, and set
it there (propagating it to child dicts as needed). ctf_serialize can
then spot when it is not being called by a linker, and arrange to always
write out an indexed, sorted symtypetab for fastest possible future
symbol lookup by name in that case. (The writeouts done by ld -r are
unsorted, because the only thing likely to use those symtabs is the
linker, which doesn't benefit from symtypetab sorting.)
Tests added for all three linking cases (ld -r, ld -shared, ld), with a
bit of testsuite framework enhancement to stop it unconditionally
linking the CTF to be checked by the lookup program with -shared, so
tests can now examine CTF linked with -r or indeed with no flags at all,
though the output filename is still foo.so even in this case.
Another test added for the non-linker case that endeavours to determine
whether the symtypetab is sorted by examining the order of entries
returned from ctf_symbol_next: nobody outside libctf should rely on
this ordering, but this test is not outside libctf :)
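One way to read the writeout policy described above is as a three-way
decision. The sketch below models that reading only: the struct, the function
and all parameter names are hypothetical and do not correspond to anything in
ctf_serialize.

```c
#include <stdbool.h>

/* Hypothetical model of the writeout policy; the names here are
   illustrative and are not libctf's.  */
struct symtypetab_policy
{
  bool indexed;        /* Emit an index section mapping entries to names.  */
  bool sorted;         /* Sort entries for fast lookup by name.  */
  bool filter_syms;    /* Drop entries the linker has not reported.  */
  bool prune_vars;     /* Remove reported symbols from the variable section.  */
};

static struct symtypetab_policy
choose_policy (bool linking, bool no_filter_reported_syms,
               bool any_syms_reported)
{
  struct symtypetab_policy p = { false, false, false, false };

  if (!linking)
    {
      /* Non-linker writeout: indexed and sorted for fast name lookup.  */
      p.indexed = true;
      p.sorted = true;
    }
  else if (no_filter_reported_syms)
    {
      /* ld -r: keep every entry, indexed but unsorted.  */
      p.indexed = true;
    }
  else
    {
      /* Final link: filter by reported symbols, emit unindexed tables,
         and prune the variable section only if symbols were reported.  */
      p.filter_syms = true;
      p.prune_vars = any_syms_reported;
    }
  return p;
}
```

The last branch is where the original bug was: with no reported symbols,
nothing should be removed from the variable section.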
include/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (CTF_LINK_NO_FILTER_REPORTED_SYMS): New.
ld/ChangeLog
2021-01-26 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_merge_ctf): Set CTF_LINK_NO_FILTER_REPORTED_SYMS
when appropriate.
libctf/ChangeLog
2021-01-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (_libctf_nonnull_): Add parameters.
(LCTF_LINKING): New flag.
(ctf_dict_t) <ctf_link_flags>: Mention it.
* ctf-link.c (ctf_link): Keep LCTF_LINKING set across call.
(ctf_write): Likewise, including in child dictionaries.
(ctf_link_shuffle_syms): Make sure ctf_dynsyms is NULL if there
are no reported symbols.
* ctf-create.c (symtypetab_delete_nonstatic_vars): Make sure
the variable has been reported as a symbol by the linker.
(symtypetab_skippable): Mention relationship between SYMFP and the
flags.
(symtypetab_density): Adjust nonnullity. Exit early if no symbols
were reported and force-indexing is off (i.e., we are doing a
final link).
(ctf_serialize): Handle the !LCTF_LINKING case by writing out an
indexed, sorted symtypetab (and allow SYMFP to be NULL in this
case). Turn sorting off if this is a non-final link. Only delete
nonstatic vars if we are filtering symbols and the linker has
reported some.
* testsuite/libctf-regression/nonstatic-var-section-ld-r*:
New test of variable and symtypetab section population when
ld -r is used.
* testsuite/libctf-regression/nonstatic-var-section-ld-executable.lk:
Likewise, when ld of an executable is used.
* testsuite/libctf-regression/nonstatic-var-section-ld.lk:
Likewise, when ld -shared alone is used.
* testsuite/libctf-regression/nonstatic-var-section-ld*.c:
Lookup programs for the above.
* testsuite/libctf-writable/symtypetab-nonlinker-writeout.*: New
test, testing survival of symbols across ctf_write paths.
* testsuite/lib/ctf-lib.exp (run_lookup_test): New option,
nonshared, suppressing linking of the SOURCE with -shared.
2021-01-17 00:49:29 +08:00
  /* Turn off the is-linking flag on all the dicts in this link. */
  for (i = 0; i < arg.i; i++)
    arg.files[i]->ctf_flags &= ~LCTF_LINKING;
libctf: add the ctf_link machinery
2019-07-14 04:06:55 +08:00
 err:
  free (buf);
  if (f)
    fclose (f);
  free (arg.names);
  free (arg.files);
libctf: add CU-mapping machinery
2019-07-20 21:44:44 +08:00
  free (transformed_name);
  if (arg.ndynames)
    {
      size_t i;
      for (i = 0; i < arg.ndynames; i++)
        free (arg.dynames[i]);
      free (arg.dynames);
    }
libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.
This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.
We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.
We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors. (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)
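The rule for where the iteration status goes can be modelled in a few lines.
This is a toy, not the libctf implementation: toy_dict, toy_report and
TOY_NEXT_END are hypothetical stand-ins for a CTF dict, the reporting logic
inside ctf_errwarning_next, and ECTF_NEXT_END.

```c
#include <stddef.h>

enum { TOY_NEXT_END = 1 };           /* Stand-in for ECTF_NEXT_END.  */

struct toy_dict { int err; };        /* Stand-in for a CTF dict.  */

/* Report an iteration status the way the commit message describes:
   through ERRP if the caller passed one, else through FP's error code.
   With neither, there is nowhere to store it.  */
static void
toy_report (struct toy_dict *fp, int *errp, int status)
{
  if (errp != NULL)
    *errp = status;
  else if (fp != NULL)
    fp->err = status;
}
```

A caller that passes errp therefore keeps whatever error code was already
stored in the dict, which is the usability point made above.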
ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.
The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them. We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):
CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'
We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!
errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.
Debug-stream messages are not translated. If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.
binutils/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* objdump.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function.
(dump_ctf): Call it on open errors.
* readelf.c (dump_ctf_archive_member): Move error-
reporting...
(dump_ctf_errs): ... into this separate function. Support
calls with NULL fp. Adjust for new err parameter to
ctf_errwarning_next.
(dump_section_as_ctf): Call it on open errors.
include/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-api.h (ctf_errwarning_next): New err parameter.
ld/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
Adjust for new err parameter to ctf_errwarning_next. Only
check for assertion failures when fp is non-NULL.
(ldlang_open_ctf): Call it on open errors.
* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
breaking the diags tests.
libctf/ChangeLog
2020-08-27 Nick Alcock <nick.alcock@oracle.com>
* ctf-subr.c (open_errors): New list.
(ctf_err_warn): Calls with NULL fp append to open_errors. Add err
parameter, and use it to decorate the debug stream with errmsgs.
(ctf_err_warn_to_open): Splice errors from a CTF dict into the
open_errors.
(ctf_errwarning_next): Calls with NULL fp report from open_errors.
New err param to report iteration errors (including end-of-iteration)
when fp is NULL.
(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
parameter: gettextize.
* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
(LCTF_VBYTES): Adjust.
(ctf_err_warn_to_open): New.
(ctf_err_warn): Adjust.
(ctf_bundle): Used in only one place: move...
* ctf-create.c: ... here.
(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
down as needed. Don't emit the errmsg. Gettextize.
(membcmp): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_write_mem): Likewise.
(ctf_compress_write): Likewise. Report errors writing the header or
body.
(ctf_write): Likewise.
* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
ctf_dprintf, and gettextize, as above.
(ctf_arc_write): Likewise.
(ctf_arc_bufopen): Likewise.
(ctf_arc_open_internal): Likewise.
* ctf-labels.c (ctf_label_iter): Likewise.
* ctf-open-bfd.c (ctf_bfdclose): Likewise.
(ctf_bfdopen): Likewise.
(ctf_bfdopen_ctfsect): Likewise.
(ctf_fdopen): Likewise.
* ctf-string.c (ctf_str_write_strtab): Likewise.
* ctf-types.c (ctf_type_resolve): Likewise.
* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
(get_vbytes_v1): Pass down the ctf dict.
(get_vbytes_v2): Likewise.
(flip_ctf): Likewise.
(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
gettextize, as above.
(upgrade_types_v1): Adjust calls.
(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
emitted into individual dicts into the open errors if this turns
out to be a failed open in the end.
* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_dump_funcs): Likewise. Collapse err label into its only case.
(ctf_dump_type): Likewise.
* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
argument. Gettextize. Don't emit the errmsg.
(ctf_link_one_type): Likewise.
(ctf_link_lazy_open): Likewise.
(ctf_link_one_input_archive): Likewise.
(ctf_link_deduplicating_count_inputs): Likewise.
(ctf_link_deduplicating_open_inputs): Likewise.
(ctf_link_deduplicating_close_inputs): Likewise.
(ctf_link_deduplicating): Likewise.
(ctf_link): Likewise.
(ctf_link_deduplicating_per_cu): Likewise. Add some missed
ctf_set_errnos to obscure error cases.
* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
err argument. Gettextize. Don't emit the errmsg.
(ctf_dedup_populate_mappings): Likewise.
(ctf_dedup_detect_name_ambiguity): Likewise.
(ctf_dedup_init): Likewise.
(ctf_dedup_multiple_input_dicts): Likewise.
(ctf_dedup_conflictify_unshared): Likewise.
(ctf_dedup): Likewise.
(ctf_dedup_rwalk_one_output_mapping): Likewise.
(ctf_dedup_id_to_target): Likewise.
(ctf_dedup_emit_type): Likewise.
(ctf_dedup_emit_struct_members): Likewise.
(ctf_dedup_populate_type_mapping): Likewise.
(ctf_dedup_populate_type_mappings): Likewise.
(ctf_dedup_emit): Likewise.
(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
status setting.
(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
unknown-type-kind messages (which signify file corruption).
2020-07-27 23:45:15 +08:00
  ctf_err_warn (fp, 0, 0, _("cannot write archive in link: %s failure"),
                errloc);
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t, which will appear in the output and which
will itself become the shared parent. The parent contains the types
that all TUs have in common (in all link modes) and, by default, all
types that do not have conflicting definitions between TUs. Input
files that are themselves products of ld -r are supported, though this
is not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known. It calls a callback provided by the linker,
which returns the string content and offset of every string in the ELF
strtab in turn. Any of these strings that also appear in the input CTF
strtab are eliminated from it in favour of the ELF strtab; any strings
that appear only in the input strtab reappear in the internal CTF
strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized. It calls a callback provided by the linker, which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
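The string handoff described for ctf_link_add_strtab can be modelled
roughly as below. This is only a sketch of the idea, not the libctf
implementation: struct elf_str, struct elf_strtab_iter, strtab_string_f,
next_elf_string and partition_ctf_strings are all hypothetical names
invented for illustration, and the callback shape is an assumption
loosely modelled on the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one ELF strtab entry; not a libctf type.  */
struct elf_str
{
  const char *str;
  size_t offset;
};

/* Hypothetical iteration state, passed to the callback as its opaque arg.  */
struct elf_strtab_iter
{
  const struct elf_str *strs;
  size_t count;
  size_t next;
};

/* Callback shape assumed for illustration: yield each ELF string and its
   offset in turn, then NULL once the table is exhausted.  */
typedef const char *strtab_string_f (size_t *offset, void *arg);

static const char *
next_elf_string (size_t *offset, void *arg)
{
  struct elf_strtab_iter *it = arg;

  if (it->next >= it->count)
    return NULL;
  *offset = it->strs[it->next].offset;
  return it->strs[it->next++].str;
}

/* For each CTF string, record the ELF strtab offset if the ELF strtab
   also holds that string, or -1 if it must stay in the internal CTF
   strtab.  Returns the number of strings moved to the ELF strtab.  */
static size_t
partition_ctf_strings (const char **ctf_strs, size_t n_ctf,
                       strtab_string_f *next, void *arg, long *elf_offsets)
{
  size_t i, moved = 0, offset;
  const char *s;

  assert (ctf_strs != NULL && elf_offsets != NULL && next != NULL);

  for (i = 0; i < n_ctf; i++)
    elf_offsets[i] = -1;

  /* Walk the ELF strtab once; the first matching ELF entry wins.  */
  while ((s = next (&offset, arg)) != NULL)
    for (i = 0; i < n_ctf; i++)
      if (elf_offsets[i] == -1 && strcmp (ctf_strs[i], s) == 0)
        {
          elf_offsets[i] = (long) offset;
          moved++;
        }

  return moved;
}
```

In this model the strings left at -1 are the ones that would reappear
in the internal CTF strtab of the output. The quadratic scan is chosen
for brevity; a real implementation would presumably use a hash table.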
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
return NULL;
}