Commit Graph

166 Commits

Author SHA1 Message Date
Nick Alcock
b09ad6eae9 libctf: do not print array declarators backwards
The CTF declarator stack code (used by ctf_type_aname() and thus
ultimately by ctf-dump.c and objdump --ctf etc) contains careful
code to prepend array declarators to the stack it's building up
on the grounds that array declarators are ordered inside out: only
they're not, they're ordered outside in.

This has led to our (non-upstreamed) compiler emitting array declarators
backwards for years, because it looks backwards in the dumper unless
it's actually emitted backwards into the CTF so the dumper can wrongly
reverse it again: but

  int[5][6]

should be an array of 6 int[5]s, not an array of 5 int[6]'s, so even if
the dumper gets it right, actual users calling ctf_array_info are going
to see a completely wrong type graph with the wrong bounds in it.

Fix trivial.

libctf/ChangeLog
2021-01-05  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-decl.c (ctf_decl_push): Don't print array decls backwards.
2021-01-05 14:53:39 +00:00
Nicolas Boulenguez
a7c23ac931 In libctf, make AC_CONFIG_MACRO_DIR consistent with ACLOCAL_AMFLAGS
PR 27117
	* configure.ac: Make AC_CONFIG_MACRO_DIR consistent with
	ACLOCAL_AMFLAGS -I dirs.
	* configure: Regenerate.
2021-01-04 11:08:05 +10:30
Alan Modra
250d07de5c Update year range in copyright notice of binutils files 2021-01-01 10:31:05 +10:30
Alan Modra
c2795844e6 ChangeLog rotation 2021-01-01 10:31:02 +10:30
H.J. Lu
e8cda20905 libctf: Pass format argument to asprintf
libctf/ChangeLog
2020-09-23  H.J. Lu  <hongjiu.lu@intel.com>

	PR libctf/26934
	* ctf-dump.c (ctf_dump_objts): Pass format argument to asprintf.
2020-11-25 19:11:36 +00:00
Nick Alcock
53651de80f libctf, include: support foreign-endianness symtabs with CTF
The CTF symbol lookup machinery added recently has one deficit: it
assumes the symtab is in the machine's native endianness.  This is
always true when the linker is writing out symtabs (because cross
linkers byteswap symbols only after libctf has been called on them), but
may be untrue in the cross case when the linker or another tool
(objdump, etc) is reading them.

Unfortunately the easy way to model this to the caller, as an endianness
field in the ctf_sect_t, is precluded because doing so would change the
size of the ctf_sect_t, which would be an ABI break.  So, instead, allow
the endianness of the symtab to be set after open time, by calling one
of the two new API functions ctf_symsect_endianness (for ctf_dict_t's)
or ctf_arc_symsect_endianness (for entire ctf_archive_t's).  libctf
calls these functions automatically for objects opened via any of the
BFD-aware mechanisms (ctf_bfdopen, ctf_bfdopen_ctfsect, ctf_fdopen,
ctf_open, or ctf_arc_open), but the various mechanisms that just take
raw ctf_sect_t's will assume the symtab is in native endianness and need
a later call to ctf_*symsect_endianness to adjust it if needed.  (This
call is basically free if the endianness is actually native: it only
costs anything if the symtab endianness was previously guessed wrong,
and there is a symtab, and we are using it directly rather than using
symtab indexing.)

Obviously, calling ctf_lookup_by_symbol or ctf_symbol_next before the
symtab endianness is correctly set will probably give wrong answers --
but you can set it at any time as long as it is before then.

include/ChangeLog
2020-11-23  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h: Style nit: remove () on function names in comments.
	(ctf_sect_t): Mention endianness concerns.
	(ctf_symsect_endianness): New declaration.
	(ctf_arc_symsect_endianness): Likewise.

libctf/ChangeLog
2020-11-23  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (ctf_dict_t) <ctf_symtab_little_endian>: New.
	(struct ctf_archive_internal) <ctfi_symsect_little_endian>: Likewise.
	* ctf-create.c (ctf_serialize): Adjust for new field.
	* ctf-open.c (init_symtab): Note the semantics of repeated calls.
	(ctf_symsect_endianness): New.
	(ctf_bufopen_internal): Set ctf_symtab_little_endian suitably for
	the native endianness.
	(_Static_assert): Moved...
	(swap_thing): ... with this...
	* swap.h: ... to here.
	* ctf-util.c (ctf_elf32_to_link_sym): Use it, byteswapping the
	Elf32_Sym if the ctf_symtab_little_endian demands it.
	(ctf_elf64_to_link_sym): Likewise swap the Elf64_Sym if needed.
	* ctf-archive.c (ctf_arc_symsect_endianness): New, set the
	endianness of the symtab used by the dicts in an archive.
	(ctf_archive_iter_internal): Initialize to unknown (assumed native,
	do not call ctf_symsect_endianness).
	(ctf_dict_open_by_offset): Call ctf_symsect_endianness if need be.
	(ctf_dict_open_internal): Propagate the endianness down.
	(ctf_dict_open_sections): Likewise.
	* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Get the endianness from the
	struct bfd and pass it down to the archive.
	* libctf.ver: Add ctf_symsect_endianness and
	ctf_arc_symsect_endianness.
2020-11-25 19:11:35 +00:00
Nick Alcock
ef21dd3bcf libctf: do not crash when CTF symbol or variable linking fails
When linking fails, we delete all the generated outputs, but we fail to
remove them from the ctf_link_outputs hash we stuck them in before doing
symbol and variable section linking (which we had to do because that's
where ctf_create_per_cu, used by both, looks for them).  This leaves
stale pointers to freed memory behind, and crashes soon follow.

Fix obvious.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-link.c (ctf_link_deduplicating): Clean up the ctf_link_outputs
	hash on error.
2020-11-20 13:34:13 +00:00
Nick Alcock
8f235c90a2 libctf: error-handling fixes
libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-create.c (ctf_dtd_insert): Set ENOMEM on the dict if out of memory.
	(ctf_dvd_insert): Likewise.
	(ctf_add_function): Report ECTF_RDONLY if this dict is not writable.
	* ctf-subr.c (ctf_err_warn): Only debug-dump passed-in warnings if
	the passed-in error code is nonzero: the error on the dict for
	warnings may relate to a previous error.
2020-11-20 13:34:12 +00:00
Nick Alcock
97a2a623d0 libctf, include: add ctf_getsymsect and ctf_getstrsect
libctf has long provided ctf_getdatasect, which hands back a pointer to
the CTF section a (read-only) dict came from.  But it has no such
functions to return pointers to the ELF symbol table or string table
it's working from, which is unfortunate because several libctf functions
(ctf_open, ctf_fdopen, and ctf_bfdopen) figure out which string and
symbol table to use themselves, and don't tell the user what they
decided, so the caller can't agree on which symtab to use with libctf
even if it wanted to.

Add a pair of functions to return the symtab and strtab in use.  Like
ctf_getdatasect, these return ctf_sect_t structures by value, filled
with all-NULL/0 content if a symtab or strtab is not being used.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_getsymsect): New.
	(ctf_getstrsect): Likewise.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-open.c (ctf_getsymsect): New.
	(ctf_getstrsect): Likewise.
	* libctf.ver: Add them.
2020-11-20 13:34:12 +00:00
Nick Alcock
2c78e92523 libctf, include: CTF-archive-wide symbol lookup
CTF archives may contain multiple dicts, each of which contain many
types and possibly a bunch of symtypetab entries relating to those
types: each symtypetab entry is going to appear in exactly one dict,
with the corresponding entries in the other dicts empty (either pads, or
indexed symtypetabs that do not mention that symbol).  But users of
libctf usually want to get back the type associated with a symbol
without having to dig around to find out which dict that type might be
in.

This adds machinery to do that -- and since you probably want to do it
repeatedly, it adds internal caching to the ctf-archive machinery so
that iteration over archives via ctf_archive_next and repeated symbol
lookups do not have to repeatedly reopen the archive.  (Iteration using
ctf_archive_iter will gain caching soon.)

Two new API functions:

ctf_dict_t *
ctf_arc_lookup_symbol (ctf_archive_t *arc, unsigned long symidx,
		       ctf_id_t *typep, int *errp);

This looks up the symbol with index SYMIDX in the archive ARC, returning
the dictionary in which it resides and optionally the type index as
well.  Errors are returned in ERRP.  The dict should be
ctf_dict_close()d when done, but is also cached inside the ctf_archive
so that the open cost is only paid once.  The result of the symbol
lookup is also cached internally, so repeated lookups of the same symbol
are nearly free.

void ctf_arc_flush_caches (ctf_archive_t *arc);

Flush all the caches. Done at close time, but also available as an API
function if users want to do it by hand.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_arc_lookup_symbol): New.
	(ctf_arc_flush_caches): Likewise.
	* ctf.h: Document new auto-ctf_import behaviour.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (struct ctf_archive_internal) <ctfi_dicts>: New, dicts
	the archive machinery has opened and cached.
	<ctfi_symdicts>: New, cache of dicts containing symbols looked up.
	<ctfi_syms>: New, cache of types of symbols looked up.
	* ctf-archive.c (ctf_arc_close): Free them on close.
	(enosym): New, flag entry for 'symbol not present'.
	(ctf_arc_import_parent): New, automatically import the parent from
	".ctf" if this is a child in an archive and ".ctf" is present.
	(ctf_dict_open_sections): Use it.
	(ctf_archive_iter_internal): Likewise.
	(ctf_cached_dict_close): New, thunk around ctf_dict_close.
	(ctf_dict_open_cached): New, open and cache a dict.
	(ctf_arc_flush_caches): New, flush the caches.
	(ctf_arc_lookup_symbol): New, look up a symbol in (all members of)
	an archive, and cache the lookup.
	(ctf_archive_iter): Note the new caching behaviour.
	(ctf_archive_next): Use ctf_dict_open_cached.
	* libctf.ver: Add ctf_arc_lookup_symbol and ctf_arc_flush_caches.
2020-11-20 13:34:11 +00:00
Nick Alcock
0e28ade476 libctf, ld: properly deduplicate function types
Some type kinds in CTF (functions, arrays, pointers, slices, and
cvr-quals) are intrinsically nameless: the ctt_name field in the CTF
is always zero, and the libctf API provides no way to set a name.
But the compiler can and does sometimes set names for some of these
kinds: in particular, the name it sets on CTF_K_FUNCTION types is the
means it uses to force the name of the function into the string table
so that it can point at it from the function info section.

So null out the name at hashing time so that the deduplicator can
correctly detect that e.g. function types identical but for name should
be considered truly identical, since they will not have a name when the
deduplicator re-emits them into the output.

ld/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* testsuite/ld-ctf/data-func-conflicted.d: Shrink the expected
	size of the type section now that function types are being
	deduplicated properly.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-dedup.c (ctf_dedup_rhash_type): Null out the names of nameless
	type kinds, just in case the input has named them.
2020-11-20 13:34:10 +00:00
Nick Alcock
4665e895c3 libctf: adjust dumper for symtypetab changes
Now that we have a new format for the function info section, it's much
easier to dump it: we can use the same code we use for the object type
section, and that's got simpler too because we can use ctf_symbol_next.

Also dump the new stuff in the header: the new flags bits and the index
section lengths.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-dump.c (ctf_dump_header): Dump the new flags bits and the index
	section lengths.
	(ctf_dump_objts): Report indexed sections.  Also dump functions.  Use
	ctf_symbol_next, not manual looping.
	(ctf_dump_funcs): Delete.
	(ctf_dump): Use ctf_dump_objts, not ctf_dump_funcs.
2020-11-20 13:34:09 +00:00
Nick Alcock
1136c37971 libctf: symbol type linking support
This adds facilities to write out the function info and data object
sections, which efficiently map from entries in the symbol table to
types.  The write-side code is entirely new: the read-side code was
merely significantly changed and support for indexed tables added
(pointed to by the no-longer-unused cth_objtidxoff and cth_funcidxoff
header fields).

With this in place, you can use ctf_lookup_by_symbol to look up the
types of symbols of function and object type (and, as before, you can
use ctf_lookup_variable to look up types of file-scope variables not
present in the symbol table, as long as you know their name: but
variables that are also data objects are now found in the data object
section instead.)

(Compatible) file format change:

The CTF spec has always said that the function info section looks much
like the CTF_K_FUNCTIONs in the type section: an info word (including an
argument count) followed by a return type and N argument types. This
format is suboptimal: it means function symbols cannot be deduplicated
and it causes a lot of ugly code duplication in libctf.  But
conveniently the compiler has never emitted this!  Because it has always
emitted a rather different format that libctf has never accepted, we can
be sure that there are no instances of this function info section in the
wild, and can freely change its format without compatibility concerns or
a file format version bump.  (And since it has never been emitted in any
code that generated any older file format version, either, we need keep
no code to read the format as specified at all!)

So the function info section is now specified as an array of uint32_t,
exactly like the object data section: each entry is a type ID in the
type section which must be of kind CTF_K_FUNCTION, the prototype of
this function.

This allows function types to be deduplicated and also correctly encodes
the fact that all functions declared in C really are types available to
the program: so they should be stored in the type section like all other
types.  (In format v4, we will be able to represent the types of static
functions as well, but that really does require a file format change.)

We introduce a new header flag, CTF_F_NEWFUNCINFO, which is set if the
new function info format is in use.  A sufficiently new compiler will
always set this flag.  New libctf will always set this flag: old libctf
will refuse to open any CTF dicts that have this flag set.  If the flag
is not set on a dict being read in, new libctf will disregard the
function info section.  Format v4 will remove this flag (or, rather, the
flag has no meaning there and the bit position may be recycled for some
other purpose).

New API:

Symbol addition:
  ctf_add_func_sym: Add a symbol with a given name and type.  The
                    type must be of kind CTF_K_FUNCTION (a function
                    pointer).  Internally this adds a name -> type
                    mapping to the ctf_funchash in the ctf_dict.
  ctf_add_objt_sym: Add a symbol with a given name and type.  The type
                    kind can be anything, including function pointers.
		    This adds to ctf_objthash.

These both treat symbols as name -> type mappings: the linker associates
symbol names with symbol indexes via the ctf_link_shuffle_syms callback,
which sets up the ctf_dynsyms/ctf_dynsymidx/ctf_dynsymmax fields in the
ctf_dict.  Repeated relinks can add more symbols.

Variables that are also exposed as symbols are removed from the variable
section at serialization time.

CTF symbol type sections which have enough pads, defined by
CTF_INDEX_PAD_THRESHOLD (whether because they are in dicts with symbols
where most types are unknown, or in archive where most types are defined
in some child or parent dict, not in this specific dict) are sorted by
name rather than symidx and accompanied by an index which associates
each symbol type entry with a name: the existing ctf_lookup_by_symbol
will map symbol indexes to symbol names and look the names up in the
index automatically.  (This is currently ELF-symbol-table-dependent, but
there is almost nothing specific to ELF in here and we can add support
for other symbol table formats easily).

The compiler also uses index sections to communicate the contents of
object file symbol tables without relying on any specific ordering of
symbols: it doesn't need to sort them, and libctf will detect an
unsorted index section via the absence of the new CTF_F_IDXSORTED header
flag, and sort it if needed.

Iteration:
  ctf_symbol_next: Iterator which returns the types and names of symbols
                   one by one, either for function or data symbols.

This does not require any sorting: the ctf_link machinery uses it to
pull in all the compiler-provided symbols cheaply, but it is not
restricted to that use.

(Compatible) changes in API:
  ctf_lookup_by_symbol: can now be called for object and function
                        symbols: never returns ECTF_NOTDATA (which is
			now not thrown by anything, but is kept for
                        compatibility and because it is a plausible
                        error that we might start throwing again at some
                        later date).

Internally we also have changes to the ctf-string functionality so that
"external" strings (those where we track a string -> offset mapping, but
only write out an offset) can be consulted via the usual means
(ctf_strptr) before the strtab is written out.  This is important
because ctf_link_add_linker_symbol can now be handed symbols named via
strtab offsets, and ctf_link_shuffle_syms must figure out their actual
names by looking in the external symtab we have just been fed by the
ctf_link_add_strtab callback, long before that strtab is written out.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_symbol_next): New.
	(ctf_add_objt_sym): Likewise.
	(ctf_add_func_sym): Likewise.
	* ctf.h: Document new function info section format.
	(CTF_F_NEWFUNCINFO): New.
	(CTF_F_IDXSORTED): New.
	(CTF_F_MAX): Adjust accordingly.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h (CTF_INDEX_PAD_THRESHOLD): New.
	(_libctf_nonnull_): Likewise.
	(ctf_in_flight_dynsym_t): New.
	(ctf_dict_t) <ctf_funcidx_names>: Likewise.
	<ctf_objtidx_names>: Likewise.
	<ctf_nfuncidx>: Likewise.
	<ctf_nobjtidx>: Likewise.
	<ctf_funcidx_sxlate>: Likewise.
	<ctf_objtidx_sxlate>: Likewise.
	<ctf_objthash>: Likewise.
	<ctf_funchash>: Likewise.
	<ctf_dynsyms>: Likewise.
	<ctf_dynsymidx>: Likewise.
	<ctf_dynsymmax>: Likewise.
	<ctf_in_flight_dynsym>: Likewise.
	(struct ctf_next) <u.ctn_next>: Likewise.
	(ctf_symtab_skippable): New prototype.
	(ctf_add_funcobjt_sym): Likewise.
	(ctf_dynhash_sort_by_name): Likewise.
	(ctf_sym_to_elf64): Rename to...
	(ctf_elf32_to_link_sym): ... this, and...
	(ctf_elf64_to_link_sym): ... this.
	* ctf-open.c (init_symtab): Check for lack of CTF_F_NEWFUNCINFO
	flag, and presence of index sections.  Refactor out
	ctf_symtab_skippable and ctf_elf*_to_link_sym, and use them.  Use
	ctf_link_sym_t, not Elf64_Sym.  Skip initializing objt or func
	sxlate sections if corresponding index section is present.  Adjust
	for new func info section format.
	(ctf_bufopen_internal): Add ctf_err_warn to corrupt-file error
	handling.  Report incorrect-length index sections.  Always do an
	init_symtab, even if there is no symtab section (there may be index
	sections still).
	(flip_objts): Adjust comment: func and objt sections are actually
	identical in structure now, no need to caveat.
	(ctf_dict_close):  Free newly-added data structures.
	* ctf-create.c (ctf_create): Initialize them.
	(ctf_symtab_skippable): New, refactored out of
	init_symtab, with st_nameidx_set check added.
	(ctf_add_funcobjt_sym): New, add a function or object symbol to the
	ctf_objthash or ctf_funchash, by name.
	(ctf_add_objt_sym): Call it.
	(ctf_add_func_sym): Likewise.
	(symtypetab_delete_nonstatic_vars): New, delete vars also present as
	data objects.
	(CTF_SYMTYPETAB_EMIT_FUNCTION): New flag to symtypetab emitters:
	this is a function emission, not a data object emission.
	(CTF_SYMTYPETAB_EMIT_PAD): New flag to symtypetab emitters: emit
	pads for symbols with no type (only set for unindexed sections).
	(CTF_SYMTYPETAB_FORCE_INDEXED): New flag to symtypetab emitters:
	always emit indexed.
	(symtypetab_density): New, figure out section sizes.
	(emit_symtypetab): New, emit a symtypetab.
	(emit_symtypetab_index): New, emit a symtypetab index.
	(ctf_serialize): Call them, emitting suitably sorted symtypetab
	sections and indexes.  Set suitable header flags.  Copy over new
	fields.
	* ctf-hash.c (ctf_dynhash_sort_by_name): New, used to impose an
	order on symtypetab index sections.
	* ctf-link.c (ctf_add_type_mapping): Delete erroneous comment
	relating to code that was never committed.
	(ctf_link_one_variable): Improve variable name.
	(check_sym): New, symtypetab analogue of check_variable.
	(ctf_link_deduplicating_one_symtypetab): New.
	(ctf_link_deduplicating_syms): Likewise.
	(ctf_link_deduplicating): Call them.
	(ctf_link_deduplicating_per_cu): Note that we don't call them in
	this case (yet).
	(ctf_link_add_strtab): Set the error on the fp correctly.
	(ctf_link_add_linker_symbol): New (no longer a do-nothing stub), add
	a linker symbol to the in-flight list.
	(ctf_link_shuffle_syms): New (no longer a do-nothing stub), turn the
	in-flight list into a mapping we can use, now its names are
	resolvable in the external strtab.
	* ctf-string.c (ctf_str_rollback_atom): Don't roll back atoms with
	external strtab offsets.
	(ctf_str_rollback): Adjust comment.
	(ctf_str_write_strtab): Migrate ctf_syn_ext_strtab population from
	writeout time...
	(ctf_str_add_external): ... to string addition time.
	* ctf-lookup.c (ctf_lookup_var_key_t): Rename to...
	(ctf_lookup_idx_key_t): ... this, now we use it for syms too.
	<clik_names>: New member, a name table.
	(ctf_lookup_var): Adjust accordingly.
	(ctf_lookup_variable): Likewise.
	(ctf_lookup_by_id): Shuffle further up in the file.
	(ctf_symidx_sort_arg_cb): New, callback for...
	(sort_symidx_by_name): ... this new function to sort a symidx
	found to be unsorted (likely originating from the compiler).
	(ctf_symidx_sort): New, sort a symidx.
	(ctf_lookup_symbol_name): Support dynamic symbols with indexes
	provided by the linker.  Use ctf_link_sym_t, not Elf64_Sym.
	Check the parent if a child lookup fails.
	(ctf_lookup_by_symbol): Likewise.  Work for function symbols too.
	(ctf_symbol_next): New, iterate over symbols with types (without
	sorting).
	(ctf_lookup_idx_name): New, bsearch for symbol names in indexes.
	(ctf_try_lookup_indexed): New, attempt an indexed lookup.
	(ctf_func_info): Reimplement in terms of ctf_lookup_by_symbol.
	(ctf_func_args): Likewise.
	(ctf_get_dict): Move...
	* ctf-types.c (ctf_get_dict): ... here.
	* ctf-util.c (ctf_sym_to_elf64): Re-express as...
	(ctf_elf64_to_link_sym): ... this.  Add new st_symidx field, and
	st_nameidx_set (always 0, so st_nameidx can be ignored).  Look in
	the ELF strtab for names.
	(ctf_elf32_to_link_sym): Likewise, for Elf32_Sym.
	(ctf_next_destroy): Destroy ctf_next_t.u.ctn_next if need be.
	* libctf.ver: Add ctf_symbol_next, ctf_add_objt_sym and
	ctf_add_func_sym.
2020-11-20 13:34:08 +00:00
Nick Alcock
3d16b64e28 bfd, include, ld, binutils, libctf: CTF should use the dynstr/sym
This is embarrassing.

The whole point of CTF is that it remains intact even after a binary is
stripped, providing a compact mapping from symbols to types for
everything in the externally-visible interface of an ELF object: it has
connections to the symbol table for that purpose, and to the string
table to avoid duplicating symbol names.  So it's a shame that the hooks
I implemented last year served to hook it up to the .symtab and .strtab,
which obviously disappear on strip, leaving any accompanying the CTF
dict containing references to strings (and, soon, symbols) which don't
exist any more because their containing strtab has been vaporized.  The
original Solaris design used .dynsym and .dynstr (well, actually,
.ldynsym, which has more symbols) which do not disappear. So should we.

Thankfully the work we did before serves as guide rails, and adjusting
things to use the .dynstr and .dynsym was fast and easy.  The only
annoyance is that the dynsym is assembled inside elflink.c in a fairly
piecemeal fashion, so that the easiest way to get the symbols out was to
hook in before every call to swap_symbol_out (we also leave in a hook in
front of symbol additions to the .symtab because it seems plausible that
we might want to hook them in future too: for now that hook is unused).
We adjust things so that rather than being offered a whole hash table of
symbols at once, libctf is now given symbols one at a time, with st_name
indexes already resolved and pointing at their final .dynstr offsets:
it's now up to libctf to resolve these to names as needed using the
strtab info we pass it separately.

Some bits might be contentious.  The ctf_new_dynstr callback takes an
elf_internal_sym, and this remains an elf_internal_sym right down
through the generic emulation layers into ldelfgen.  This is no worse
than the elf_sym_strtab we used to pass down, but in the future when we
gain non-ELF CTF symtab support we might want to lower the
elf_internal_sym to some other representation (perhaps a
ctf_link_symbol) in bfd or in ldlang_ctf_new_dynsym.  We rename the
'apply_strsym' hooks to 'acquire_strings' instead, becuse they no longer
have anything to do with symbols.

There are some API changes to pieces of API which are technically public
but actually totally unused by anything and/or unused by anything but ld
so they can change freely: the ctf_link_symbol gains new fields to allow
symbol names to be given as strtab offsets as well as strings, and a
symidx so that the symbol index can be passed in.  ctf_link_shuffle_syms
loses its callback parameter: the idea now is that linkers call the new
ctf_link_add_linker_symbol for every symbol in .dynsym, feed in all the
strtab entries with ctf_link_add_strtab, and then a call to
ctf_link_shuffle_syms will apply both and arrange to use them to reorder
the CTF symtab at CTF serialization time (which is coming in the next
commit).

Inside libctf we have a new preamble flag CTF_F_DYNSTR which is always
set in v3-format CTF dicts from this commit forwards: CTF dicts without
this flag are associated with .strtab like they used to be, so that old
dicts' external strings don't turn to garbage when loaded by new libctf.
Dicts with this flag are associated with .dynstr and .dynsym instead.
(The flag is not the next in sequence because this commit was written
quite late: the missing flags will be filled in by the next commit.)

Tests forthcoming in a later commit in this series.

bfd/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* elflink.c (elf_finalize_dynstr): Call examine_strtab after
	dynstr finalization.
	(elf_link_swap_symbols_out): Don't call it here.  Call
	ctf_new_symbol before swap_symbol_out.
	(elf_link_output_extsym): Call ctf_new_dynsym before
	swap_symbol_out.
	(bfd_elf_final_link): Likewise.
	* elf.c (swap_out_syms): Pass in bfd_link_info.  Call
	ctf_new_symbol before swap_symbol_out.
	(_bfd_elf_compute_section_file_positions): Adjust.

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* readelf.c (dump_section_as_ctf): Use .dynsym and .dynstr, not
	.symtab and .strtab.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* bfdlink.h (struct elf_sym_strtab): Replace with...
	(struct elf_internal_sym): ... this.
	(struct bfd_link_callbacks) <examine_strtab>: Take only a
	symstrtab argument.
	<ctf_new_symbol>: New.
	<ctf_new_dynsym>: Likewise.
	* ctf-api.h (struct ctf_link_sym) <st_symidx>: New.
	<st_nameidx>: Likewise.
	<st_nameidx_set>: Likewise.
	(ctf_link_iter_symbol_f): Removed.
	(ctf_link_shuffle_syms): Remove most parameters, just takes a
	ctf_dict_t now.
	(ctf_link_add_linker_symbol): New, split from
	ctf_link_shuffle_syms.
	* ctf.h (CTF_F_DYNSTR): New.
	(CTF_F_MAX): Adjust.

ld/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ldelfgen.c (struct ctf_strsym_iter_cb_arg): Rename to...
	(struct ctf_strtab_iter_cb_arg): ... this, changing fields:
	<syms>: Remove.
	<symcount>: Remove.
	<symstrtab>: Rename to...
	<strtab>: ... this.
	(ldelf_ctf_strtab_iter_cb): Adjust.
	(ldelf_ctf_symbols_iter_cb): Remove.
	(ldelf_new_dynsym_for_ctf): New, tell libctf about a single
	symbol.
	(ldelf_examine_strtab_for_ctf): Rename to...
	(ldelf_acquire_strings_for_ctf): ... this, only doing the strtab
	portion and not symbols.
	* ldelfgen.h: Adjust declarations accordingly.
	* ldemul.c (ldemul_examine_strtab_for_ctf): Rename to...
	(ldemul_acquire_strings_for_ctf): ... this.
	(ldemul_new_dynsym_for_ctf): New.
	* ldemul.h: Adjust declarations accordingly.
	* ldlang.c (ldlang_ctf_apply_strsym): Rename to...
	(ldlang_ctf_acquire_strings): ... this.
	(ldlang_ctf_new_dynsym): New.
	(lang_write_ctf): Call ldemul_new_dynsym_for_ctf with NULL to do
	the actual symbol shuffle.
	* ldlang.h (struct elf_strtab_hash): Adjust accordingly.
	* ldmain.c (bfd_link_callbacks): Wire up new/renamed callbacks.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-link.c (ctf_link_shuffle_syms): Adjust.
	(ctf_link_add_linker_symbol): New, unimplemented stub.
	* libctf.ver: Add it.
	* ctf-create.c (ctf_serialize): Set CTF_F_DYNSTR on newly-serialized
	dicts.
	* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Check for the flag: open the
	symtab/strtab if not present, dynsym/dynstr otherwise.
	* ctf-archive.c (ctf_arc_bufpreamble): New, get the preamble from
	some arbitrary member of a CTF archive.
	* ctf-impl.h (ctf_arc_bufpreamble): Declare it.
2020-11-20 13:34:07 +00:00
Nick Alcock
ae41200ba8 libctf, include, binutils, gdb: rename CTF-opening functions
The functions that return ctf_dict_t's given a ctf_archive_t and a name
are very clumsily named.  It sounds like they return *archives*, not
dictionaries, and the names are very long and clunky.  Why do we
have a ctf_arc_open_by_name when it opens a dictionary, not an archive,
and when there is no way to open a dictionary in any other way?  The
answer is purely internal: the function is located in ctf-archive.c,
and everything in there was called ctf_arc_*, and there is another
way to open a dict (by offset in the archive), that is internal to
ctf-archive.c and that nothing else can call.

This is clearly bad naming. The internal organization of the source tree
should not dictate public API names!

So rename things (keeping the old, bad names for compatibility), and
adjust all users.  You now open a dict using ctf_dict_open, and
open it giving ELF sections via ctf_dict_open_sections.

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* objdump.c (dump_ctf): Use ctf_dict_open, not
	ctf_arc_open_by_name.
	* readelf.c (dump_section_as_ctf): Likewise.

gdb/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctfread.c (elfctf_build_psymtabs): Use ctf_dict_open, not
	ctf_arc_open_by_name.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_arc_open_by_name): Rename to...
	(ctf_dict_open): ... this, keeping compatibility function.
	(ctf_arc_open_by_name_sections): Rename to...
	(ctf_dict_open_sections): ... this, keeping compatibility function.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-archive.c (ctf_arc_open_by_offset): Rename to...
	(ctf_dict_open_by_offset): ... this.  Adjust callers.
	(ctf_arc_open_by_name_internal): Rename to...
	(ctf_dict_open_internal): ... this.  Adjust callers.
	(ctf_arc_open_by_name_sections): Rename to...
	(ctf_dict_open_sections): ... this, keeping compatibility function.
	(ctf_arc_open_by_name): Rename to...
	(ctf_dict_open): ... this, keeping compatibility function.
	* libctf.ver: New functions added.
	* ctf-link.c (ctf_link_one_input_archive): Adjusted accordingly.
	(ctf_link_deduplicating_open_inputs): Likewise.
2020-11-20 13:34:05 +00:00
Nick Alcock
139633c307 libctf, include, binutils, gdb, ld: rename ctf_file_t to ctf_dict_t
The naming of the ctf_file_t type in libctf is a historical curiosity.
Back in the Solaris days, CTF dictionaries were originally generated as
a separate file and then (sometimes) merged into objects: hence the
datatype was named ctf_file_t, and known as a "CTF file".  Nowadays, raw
CTF is essentially never written to a file on its own, and the datatype
changed name to a "CTF dictionary" years ago.  So the term "CTF file"
refers to something that is never a file!  This is at best confusing.

The type has also historically been known as a 'CTF container", which is
even more confusing now that we have CTF archives which are *also* a
sort of container (they contain CTF dictionaries), but which are never
referred to as containers in the source code.

So fix this by completing the renaming, renaming ctf_file_t to
ctf_dict_t throughout, and renaming those few functions that refer to
CTF files by name (keeping compatibility aliases) to refer to dicts
instead.  Old users who still refer to ctf_file_t will see (harmless)
pointer-compatibility warnings at compile time, but the ABI is unchanged
(since C doesn't mangle names, and ctf_file_t was always an opaque type)
and things will still compile fine as long as -Werror is not specified.
All references to CTF containers and CTF files in the source code are
fixed to refer to CTF dicts instead.

Further (smaller) renamings of annoyingly-named functions to come, as
part of the process of souping up queries across whole archives at once
(needed for the function info and data object sections).

binutils/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* objdump.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
	(dump_ctf_archive_member): Likewise.
	(dump_ctf): Likewise. Use ctf_dict_close, not ctf_file_close.
	* readelf.c (dump_ctf_errs): Rename ctf_file_t to ctf_dict_t.
	(dump_ctf_archive_member): Likewise.
	(dump_section_as_ctf): Likewise.  Use ctf_dict_close, not
	ctf_file_close.

gdb/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctfread.c: Change uses of ctf_file_t to ctf_dict_t.
	(ctf_fp_info::~ctf_fp_info): Call ctf_dict_close, not ctf_file_close.

include/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_file_t): Rename to...
	(ctf_dict_t): ... this.  Keep ctf_file_t around for compatibility.
	(struct ctf_file): Likewise rename to...
	(struct ctf_dict): ... this.
	(ctf_file_close): Rename to...
	(ctf_dict_close): ... this, keeping compatibility function.
	(ctf_parent_file): Rename to...
	(ctf_parent_dict): ... this, keeping compatibility function.
	All callers adjusted.
	* ctf.h: Rename references to ctf_file_t to ctf_dict_t.
	(struct ctf_archive) <ctfa_nfiles>: Rename to...
	<ctfa_ndicts>: ... this.

ld/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ldlang.c (ctf_output): This is a ctf_dict_t now.
	(lang_ctf_errs_warnings): Rename ctf_file_t to ctf_dict_t.
	(ldlang_open_ctf): Adjust comment.
	(lang_merge_ctf): Use ctf_dict_close, not ctf_file_close.
	* ldelfgen.h (ldelf_examine_strtab_for_ctf): Rename ctf_file_t to
	ctf_dict_t.  Change opaque declaration accordingly.
	* ldelfgen.c (ldelf_examine_strtab_for_ctf): Adjust.
	* ldemul.h (examine_strtab_for_ctf): Likewise.
	(ldemul_examine_strtab_for_ctf): Likewise.
	* ldeuml.c (ldemul_examine_strtab_for_ctf): Likewise.

libctf/ChangeLog
2020-11-20  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-impl.h: Rename ctf_file_t to ctf_dict_t: all declarations
	adjusted.
	(ctf_fileops): Rename to...
	(ctf_dictops): ... this.
	(ctf_dedup_t) <cd_id_to_file_t>: Rename to...
	<cd_id_to_dict_t>: ... this.
	(ctf_file_t): Fix outdated comment.
	<ctf_fileops>: Rename to...
	<ctf_dictops>: ... this.
	(struct ctf_archive_internal) <ctfi_file>: Rename to...
	<ctfi_dict>: ... this.
	* ctf-archive.c: Rename ctf_file_t to ctf_dict_t.
	Rename ctf_archive.ctfa_nfiles to ctfa_ndicts.
	Rename ctf_file_close to ctf_dict_close.  All users adjusted.
	* ctf-create.c: Likewise.  Refer to CTF dicts, not CTF containers.
	(ctf_bundle_t) <ctb_file>: Rename to...
	<ctb_dict): ... this.
	* ctf-decl.c: Rename ctf_file_t to ctf_dict_t.
	* ctf-dedup.c: Likewise.  Rename ctf_file_close to
	ctf_dict_close. Refer to CTF dicts, not CTF containers.
	* ctf-dump.c: Likewise.
	* ctf-error.c: Likewise.
	* ctf-hash.c: Likewise.
	* ctf-inlines.h: Likewise.
	* ctf-labels.c: Likewise.
	* ctf-link.c: Likewise.
	* ctf-lookup.c: Likewise.
	* ctf-open-bfd.c: Likewise.
	* ctf-string.c: Likewise.
	* ctf-subr.c: Likewise.
	* ctf-types.c: Likewise.
	* ctf-util.c: Likewise.
	* ctf-open.c: Likewise.
	(ctf_file_close): Rename to...
	(ctf_dict_close): ...this.
	(ctf_file_close): New trivial wrapper around ctf_dict_close, for
	compatibility.
	(ctf_parent_file): Rename to...
	(ctf_parent_dict): ... this.
	(ctf_parent_file): New trivial wrapper around ctf_parent_dict, for
	compatibility.
	* libctf.ver: Add ctf_dict_close and ctf_parent_dict.
2020-11-20 13:34:04 +00:00
Tom Tromey
0d01fbe64f Remove libctf/mkerrors.sed
This patch removes libctf/mkerrors.sed, replacing it with a macro in
ctf-api.h.  This simplifies the build and avoids possible unportable
code in the sed script.

2020-10-21  Tom Tromey  <tromey@adacore.com>

	* ctf-api.h (_CTF_ERRORS): New macro.

libctf/ChangeLog
2020-10-21  Tom Tromey  <tromey@adacore.com>

	* mkerrors.sed: Remove.
	* ctf-error.c (_CTF_FIRST): New define.
	(_CTF_ITEM): Define this, not _CTF_STR.
	(_ctf_errlist, _ctf_erridx): Use _CTF_ERRORS.
	(ERRSTRFIELD): Rewrite.
	(ERRSTRFIELD1): Remove.
	* Makefile.in: Rebuild.
	* Makefile.am (BUILT_SOURCES): Remove.
	(ctf-error.h): Remove.
2020-10-21 11:52:17 -06:00
Nick Alcock
926c9e7665 libctf, binutils, include, ld: gettextize and improve error handling
This commit follows on from the earlier commit "libctf, ld, binutils:
add textual error/warning reporting for libctf" and converts every error
in libctf that was reported using ctf_dprintf to use ctf_err_warn
instead, gettextizing them in the process, using N_() where necessary to
avoid doing gettext calls unless an error message is actually generated,
and rephrasing some error messages for ease of translation.

This requires a slight change in the ctf_errwarning_next API: this API
is public but has not been in a release yet, so can still change freely.
The problem is that many errors are emitted at open time (whether
opening of a CTF dict, or opening of a CTF archive): the former of these
throws away its incompletely-initialized ctf_file_t rather than return
it, and the latter has no ctf_file_t at all. So errors and warnings
emitted at open time cannot be stored in the ctf_file_t, and have to go
elsewhere.

We put them in a static local in ctf-subr.c (which is not very
thread-safe: a later commit will improve things here): ctf_err_warn with
a NULL fp adds to this list, and the public interface
ctf_errwarning_next with a NULL fp retrieves from it.

We need a slight exception from the usual iterator rules in this case:
with a NULL fp, there is nowhere to store the ECTF_NEXT_END "error"
which signifies the end of iteration, so we add a new err parameter to
ctf_errwarning_next which is used to report such iteration-related
errors.  (If an fp is provided -- i.e., if not reporting open errors --
this is optional, but even if it's optional it's still an API change.
This is actually useful from a usability POV as well, since
ctf_errwarning_next is usually called when there's been an error, so
overwriting the error code with ECTF_NEXT_END is not very helpful!
So, unusually, ctf_errwarning_next now uses the passed fp for its
error code *only* if no errp pointer is passed in, and leaves it
untouched otherwise.)

ld, objdump and readelf are adapted to call ctf_errwarning_next with a
NULL fp to report open errors where appropriate.

The ctf_err_warn API also has to change, gaining a new error-number
parameter which is used to add the error message corresponding to that
error number into the debug stream when LIBCTF_DEBUG is enabled:
changing this API is easy at this point since we are already touching
all existing calls to gettextize them.  We need this because the debug
stream should contain the errno's message, but the error reported in the
error/warning stream should *not*, because the caller will probably
report it themselves at failure time regardless, and reporting it in
every error message that leads up to it leads to a ridiculous chattering
on failure, which is likely to end up as ridiculous chattering on stderr
(trimmed a bit):

CTF error: `ld/testsuite/ld-ctf/A.c (0): lookup failure for type 3: flags 1: The parent CTF dictionary is unavailable'
CTF error: `ld/testsuite/ld-ctf/A.c (0): struct/union member type hashing error during type hashing for type 80000001, kind 6: The parent CTF dictionary is unavailable'
CTF error: `deduplicating link variable emission failed for ld/testsuite/ld-ctf/A.c: The parent CTF dictionary is unavailable'
ld/.libs/lt-ld-new: warning: CTF linking failed; output will have no CTF section: `The parent CTF dictionary is unavailable'

We only need to be told that the parent CTF dictionary is unavailable
*once*, not over and over again!

errmsgs are still emitted on warning generation, because warnings do not
usually lead to a failure propagated up to the caller and reported
there.

Debug-stream messages are not translated.  If translation is turned on,
there will be a mixture of English and translated messages in the debug
stream, but rather that than burden the translators with debug-only
output.

binutils/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* objdump.c (dump_ctf_archive_member): Move error-
	reporting...
	(dump_ctf_errs): ... into this separate function.
	(dump_ctf): Call it on open errors.
	* readelf.c (dump_ctf_archive_member): Move error-
	reporting...
	(dump_ctf_errs): ... into this separate function.  Support
	calls with NULL fp. Adjust for new err parameter to
	ctf_errwarning_next.
	(dump_section_as_ctf): Call it on open errors.

include/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-api.h (ctf_errwarning_next): New err parameter.

ld/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* ldlang.c (lang_ctf_errs_warnings): Support calls with NULL fp.
	Adjust for new err parameter to ctf_errwarning_next.  Only
	check for assertion failures when fp is non-NULL.
	(ldlang_open_ctf): Call it on open errors.
	* testsuite/ld-ctf/ctf.exp: Always use the C locale to avoid
	breaking the diags tests.

libctf/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* ctf-subr.c (open_errors): New list.
	(ctf_err_warn): Calls with NULL fp append to open_errors.  Add err
	parameter, and use it to decorate the debug stream with errmsgs.
	(ctf_err_warn_to_open): Splice errors from a CTF dict into the
	open_errors.
	(ctf_errwarning_next): Calls with NULL fp report from open_errors.
	New err param to report iteration errors (including end-of-iteration)
	when fp is NULL.
	(ctf_assert_fail_internal): Adjust ctf_err_warn call for new err
	parameter: gettextize.
	* ctf-impl.h (ctfo_get_vbytes): Add ctf_file_t parameter.
	(LCTF_VBYTES): Adjust.
	(ctf_err_warn_to_open): New.
	(ctf_err_warn): Adjust.
	(ctf_bundle): Used in only one place: move...
	* ctf-create.c: ... here.
	(enumcmp): Use ctf_err_warn, not ctf_dprintf, passing the err number
	down as needed.  Don't emit the errmsg.  Gettextize.
	(membcmp): Likewise.
	(ctf_add_type_internal): Likewise.
	(ctf_write_mem): Likewise.
	(ctf_compress_write): Likewise.  Report errors writing the header or
	body.
	(ctf_write): Likewise.
	* ctf-archive.c (ctf_arc_write_fd): Use ctf_err_warn, not
	ctf_dprintf, and gettextize, as above.
	(ctf_arc_write): Likewise.
	(ctf_arc_bufopen): Likewise.
	(ctf_arc_open_internal): Likewise.
	* ctf-labels.c (ctf_label_iter): Likewise.
	* ctf-open-bfd.c (ctf_bfdclose): Likewise.
	(ctf_bfdopen): Likewise.
	(ctf_bfdopen_ctfsect): Likewise.
	(ctf_fdopen): Likewise.
	* ctf-string.c (ctf_str_write_strtab): Likewise.
	* ctf-types.c (ctf_type_resolve): Likewise.
	* ctf-open.c (get_vbytes_common): Likewise. Pass down the ctf dict.
	(get_vbytes_v1): Pass down the ctf dict.
	(get_vbytes_v2): Likewise.
	(flip_ctf): Likewise.
	(flip_types): Likewise. Use ctf_err_warn, not ctf_dprintf, and
	gettextize, as above.
	(upgrade_types_v1): Adjust calls.
	(init_types): Use ctf_err_warn, not ctf_dprintf, as above.
	(ctf_bufopen_internal): Likewise. Adjust calls. Transplant errors
	emitted into individual dicts into the open errors if this turns
	out to be a failed open in the end.
	* ctf-dump.c (ctf_dump_format_type): Adjust ctf_err_warn for new err
	argument.  Gettextize.  Don't emit the errmsg.
	(ctf_dump_funcs): Likewise.  Collapse err label into its only case.
	(ctf_dump_type): Likewise.
	* ctf-link.c (ctf_create_per_cu): Adjust ctf_err_warn for new err
	argument.  Gettextize.  Don't emit the errmsg.
	(ctf_link_one_type): Likewise.
	(ctf_link_lazy_open): Likewise.
	(ctf_link_one_input_archive): Likewise.
	(ctf_link_deduplicating_count_inputs): Likewise.
	(ctf_link_deduplicating_open_inputs): Likewise.
	(ctf_link_deduplicating_close_inputs): Likewise.
	(ctf_link_deduplicating): Likewise.
	(ctf_link): Likewise.
	(ctf_link_deduplicating_per_cu): Likewise. Add some missed
	ctf_set_errnos to obscure error cases.
	* ctf-dedup.c (ctf_dedup_rhash_type): Adjust ctf_err_warn for new
	err argument.  Gettextize.  Don't emit the errmsg.
	(ctf_dedup_populate_mappings): Likewise.
	(ctf_dedup_detect_name_ambiguity): Likewise.
	(ctf_dedup_init): Likewise.
	(ctf_dedup_multiple_input_dicts): Likewise.
	(ctf_dedup_conflictify_unshared): Likewise.
	(ctf_dedup): Likewise.
	(ctf_dedup_rwalk_one_output_mapping): Likewise.
	(ctf_dedup_id_to_target): Likewise.
	(ctf_dedup_emit_type): Likewise.
	(ctf_dedup_emit_struct_members): Likewise.
	(ctf_dedup_populate_type_mapping): Likewise.
	(ctf_dedup_populate_type_mappings): Likewise.
	(ctf_dedup_emit): Likewise.
	(ctf_dedup_hash_type): Likewise. Fix a bit of messed-up error
	status setting.
	(ctf_dedup_rwalk_one_output_mapping): Likewise. Don't hide
	unknown-type-kind messages (which signify file corruption).
2020-08-27 13:15:43 +01:00
Nick Alcock
987cf30ad8 libctf, binutils: initial work towards libctf gettextization
We gettextize under our package name, which we change to a more
reasonable 'libctf'.  Our internationalization support is mostly
provided by ctf-intl.h, which is a copy of opcodes/opintl.h with
the non-gettext_noop N_() expansion debracketed to avoid pedantic
compiler warnings.

The libctf error strings returned by ctf_errmsg are marked up for
internationalization.

(We also adjust binutils's Makefile a tiny bit to allow for the
fact that libctf now uses functions from libintl.)

binutils/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* Makefile.am (readelf_LDADD): Move $(LIBINTL) after $(LIBCTF_NOBFD).
	* Makefile.in: Regenerated.

libctf/ChangeLog
2020-08-27  Nick Alcock  <nick.alcock@oracle.com>

	* configure.ac: Adjust package name to simply 'libctf': arbitrarily
	declare this to be version 1.2.0.
	* Makefile.am (AM_CPPFLAGS): Add @INCINTL@.
	* Makefile.in: Regenerated.
	* configure: Regenerated.
	* ctf-intl.h: New file, lightly modified from opcodes/opintl.h.
	* ctf-impl.h: Include it.
	* ctf-error.r (_ctf_errlist_t): Mark strings as noop-translatable.
	(ctf_errmsg): Actually translate them.
2020-08-27 13:14:10 +01:00
Eli Zaretskii
555adca2e3 libctf: compilation failure on MinGW due to missing errno values
This commit fixes a compilation failure in a couple of libctf files
due to the use of EOVERFLOW and ENOTSUP, which are not defined
when compiling on MinGW.

libctf/ChangeLog:

	PR binutils/25155:
	* ctf-create.c (EOVERFLOW): If not defined by system header,
	redirect to ERANGE as a poor man's substitute.
	* ctf-subr.c (ENOTSUP): If not defined, use ENOSYS instead.

(cherry picked from commit 50500ecfef)
2020-07-26 16:11:36 -07:00
Nick Alcock
8c419a91d7 libctf: fixes for systems on which sizeof (void *) > sizeof (long)
Systems like mingw64 have pointers that can only be represented by 'long
long'.  Consistently cast integers stored in pointers through uintptr_t
to cater for this.

libctf/
	* ctf-create.c (ctf_dtd_insert): Add uintptr_t casts.
	(ctf_dtd_delete): Likewise.
	(ctf_dtd_lookup): Likewise.
	(ctf_rollback): Likewise.
	* ctf-hash.c (ctf_hash_lookup_type): Likewise.
	* ctf-types.c (ctf_lookup_by_rawhash): Likewise.
2020-07-22 18:05:32 +01:00
Nick Alcock
734c894234 libctf: fix isspace casts
isspace() notoriously takes an int, not a char.  Cast uses
appropriately.

libctf/
	* ctf-lookup.c (ctf_lookup_by_name): Adjust.
2020-07-22 18:05:32 +01:00
Nick Alcock
4533ed564d libctf, binutils: fix big-endian libctf archive opening
The recent commit "libctf, binutils: support CTF archives like objdump"
broke opening of CTF archives on big-endian platforms.

This didn't affect anyone much before now because the linker never
emitted CTF archives because it wasn't detecting ambiguous types
properly: now it does, and this bug becomes obvious.

Fix trivial.

libctf/
	* ctf-archive.c (ctf_arc_bufopen): Endian-swap the archive magic
	number if needed.
2020-07-22 18:05:32 +01:00
Nick Alcock
662df3c3f1 libctf, link: tie in the deduplicating linker
This fairly intricate commit connects up the CTF linker machinery (which
operates in terms of ctf_archive_t's on ctf_link_inputs ->
ctf_link_outputs) to the deduplicator (which operates in terms of arrays
of ctf_file_t's, all the archives exploded).

The nondeduplicating linker is retained, but is not called unless the
CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the
environment variable LD_NO_CTF_DEDUP is set.  Eventually, once we have
confidence in the much-more-complex deduplicating linker, I hope the
nondeduplicating linker can be removed.

In brief, what this does is traverses each input archive in
ctf_link_inputs, opening every member (if not already open) and tying
child dicts to their parents, shoving them into an array and
constructing a corresponding parents array that tells the deduplicator
which dict is the parent of which child.  We then call ctf_dedup and
ctf_dedup_emit with that array of inputs, taking the outputs that result
and putting them into ctf_link_outputs where the rest of the CTF linker
expects to find them, then linking in the variables just as is done by
the nondeduplicating linker.

It also implements much of the CU-mapping side of things.  The problem
CU-mapping introduces is that if you map many input CUs into one output,
this is saying that you want many translation units to produce at most
one child dict if conflicting types are found in any of them.  This
means you can suddenly have multiple distinct types with the same name
in the same dict, which libctf cannot really represent because it's not
something you can do with C translation units.

The deduplicator machinery already committed does as best it can with
these, hiding types with conflicting names rather than making child
dicts out of them: but we still need to call it.  This is done similarly
to the main link, taking the inputs (one CU output at a time),
deduplicating them, taking the output and making it an input to the
final link.  Two (significant) optimizations are done: we share atoms
tables between all these links and the final link (so e.g. all type hash
values are shared, all decorated type names, etc); and any CU-mapped
links with only one input (and no child dicts) doesn't need to do
anything other than renaming the CU: the CU-mapped link phase can be
skipped for it.  Put together, large CU-mapped links can save 50% of
their memory usage and about as much time (and the memory usage for
CU-mapped links is significant, because all those output CUs have to
have all their types stored in memory all at once).

include/
	* ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the
	deduplicator.
libctf/
	* ctf-impl.h (ctf_list_splice): New.
	* ctf-util.h (ctf_list_splice): Likewise.
	* ctf-link.c (link_sort_inputs_cb_arg_t): Likewise.
	(ctf_link_sort_inputs): Likewise.
	(ctf_link_deduplicating_count_inputs): Likewise.
	(ctf_link_deduplicating_open_inputs): Likewise.
	(ctf_link_deduplicating_close_inputs): Likewise.
	(ctf_link_deduplicating_variables): Likewise.
	(ctf_link_deduplicating_per_cu): Likewise.
	(ctf_link_deduplicating): Likewise.
	(ctf_link): Call it.
2020-07-22 18:02:19 +01:00
Nick Alcock
e3e8411bec libctf, link: add CTF_LINK_OMIT_VARIABLES_SECTION
This flag (not used anywhere yet) causes the variables section to be
omitted from the output CTF dict.

include/
	* ctf-api.h (CTF_LINK_OMIT_VARIABLES_SECTION): New.
libctf/
	* ctf-link.c (ctf_link_one_input_archive_member): Check
	CTF_LINK_OMIT_VARIABLES_SECTION.
2020-07-22 18:02:19 +01:00
Nick Alcock
0f0c11f7fc libctf, dedup: add deduplicator
This adds the core deduplicator that the ctf_link machinery calls
(possibly repeatedly) to link the CTF sections: it takes an array
of input ctf_file_t's and another array that indicates which entries in
the input array are parents of which other entries, and returns an array
of outputs.  The first output is always the ctf_file_t on which
ctf_link/ctf_dedup/etc was called: the other outputs are child dicts
that have the first output as their parent.

include/
	* ctf-api.h (CTF_LINK_SHARE_DUPLICATED): No longer unimplemented.
libctf/
	* ctf-impl.h (ctf_type_id_key): New, the key in the
	cd_id_to_file_t.
	(ctf_dedup): New, core deduplicator state.
	(ctf_file_t) <ctf_dedup>: New.
	<ctf_dedup_atoms>: New.
	<ctf_dedup_atoms_alloc>: New.
	(ctf_hash_type_id_key): New prototype.
	(ctf_hash_eq_type_id_key): Likewise.
	(ctf_dedup_atoms_init): Likewise.
	* ctf-hash.c (ctf_hash_eq_type_id_key): New.
	(ctf_dedup_atoms_init): Likewise.
	* ctf-create.c (ctf_serialize): Adjusted.
	(ctf_add_encoded): No longer static.
	(ctf_add_reftype): Likewise.
	* ctf-open.c (ctf_file_close): Destroy the
	ctf_dedup_atoms_alloc.
	* ctf-dedup.c: New file.
        * ctf-decls.h [!HAVE_DECL_STPCPY]: Add prototype.
	* configure.ac: Check for stpcpy.
	* Makefile.am: Add it.
	* Makefile.in: Regenerate.
        * config.h.in: Regenerate.
        * configure: Regenerate.
2020-07-22 18:02:19 +01:00
Nick Alcock
a9b9870206 libctf, dedup: add new configure option --enable-libctf-hash-debugging
Add a new debugging configure option, --enable-libctf-hash-debugging,
off by default, which lets you configure in expensive internal
consistency checks and enable the printing of debugging output when
LIBCTF_DEBUG=t before type deduplication has happened.

In this commit we just add the option and cause it to turn ctf_assert
into a real, hard assert for easier debugging.

libctf/
	* configure.ac: Add --enable-libctf-hash-debugging.
	* aclocal.m4: Pull in enable.m4, for GCC_ENABLE.
	* Makefile.in: Regenerated.
	* configure: Likewise.
	* config.h.in: Likewise.
	* ctf-impl.h [ENABLE_LIBCTF_HASH_DEBUGGING]
	(ctf_assert): Define to assert.
2020-07-22 18:02:19 +01:00
Nick Alcock
1f2e8b5b87 libctf: add SHA-1 support for libctf
This very thin abstraction layer provides SHA-1ing facilities to all of
libctf, almost all inlined wrappers around the libiberty functionality
other than ctf_sha1_fini.

The deduplicator will use this to recursively hash types to prove their
identity.

libctf/
	* ctf-sha1.h: New, inline wrappers around sha1_init_ctx and
	sha1_process_bytes.
	* ctf-impl.h: Include it.
	(ctf_sha1_init): New.
	(ctf_sha1_add): Likewise.
	(ctf_sha1_fini): Likewise.
	* ctf-sha1.c: New, non-inline wrapper around sha1_finish_ctx
	producing strings.
	* Makefile.am: Add file.
	* Makefile.in: Regenerate.
2020-07-22 18:02:18 +01:00
Nick Alcock
6dd2819ffc libctf, link: add the ability to filter out variables from the link
The CTF variables section (containing variables that have no
corresponding symtab entries) can cause the string table to get very
voluminous if the names of variables are long.  Some callers want to
filter out particular variables they know they won't need.

So add a "variable filter" callback that does that: it's passed the name
of the variable and a corresponding ctf_file_t / ctf_id_t pair, and
should return 1 to filter it out.

ld doesn't use this machinery yet, but we could easily add it later if
desired.  (But see later for a commit that turns off CTF variable-
section linking in ld entirely by default.)

include/
	* ctf-api.h (ctf_link_variable_filter_t): New.
	(ctf_link_set_variable_filter): Likewise.

libctf/
	* libctf.ver (ctf_link_set_variable_filter): Add.
	* ctf-impl.h (ctf_file_t) <ctf_link_variable_filter>: New.
	<ctf_link_variable_filter_arg>: Likewise.
	* ctf-create.c (ctf_serialize): Adjust.
	* ctf-link.c (ctf_link_set_variable_filter): New, set it.
	(ctf_link_one_variable): Call it if set.
2020-07-22 18:02:18 +01:00
Nick Alcock
19d4b1addc libctf, link: fix spurious conflicts of variables in the variable section
When we link a CTF variable, we check to see if it already exists in the
parent dict first: if it does, and it has a type the same as the type we
would populate it with, we assume we don't need to do anything:
otherwise, we populate it in a per-CU child.

Or that's what we should be doing.  Instead, we check if the type is the
same as the type in *source dict*, which is going to be a completely
different value!  So we end up concluding all variables are conflicting,
bloating up output possibly quite a lot (variables aren't big in and of
themselves, but each drags around a strtab entry, and CTF dicts in a CTF
archive do not share their strtabs -- one of many problems with CTF
archives as presently constituted.)

Fix trivial: check the right type.

libctf/
	* ctf-link.c (ctf_link_one_variable): Check the dst_type for
	conflicts, not the source type.
2020-07-22 18:02:18 +01:00
Nick Alcock
5f54462c6a libctf, link: redo cu-mapping handling
Now a bunch of stuff that doesn't apply to ld or any normal use of
libctf, piled into one commit so that it's easier to ignore.

The cu-mapping machinery associates incoming compilation unit names with
outgoing names of CTF dictionaries that should correspond to them, for
non-gdb CTF consumers that would like to group multiple TUs into a
single child dict if conflicting types are found in it (the existing use
case is one kernel module, one child CTF dict, even if the kernel module
is composed of multiple CUs).

The upcoming deduplicator needs to track not only the mapping from
incoming CU name to outgoing dict name, but the inverse mapping from
outgoing dict name to incoming CU name, so it can work over every CTF
dict we might see in the output and link into it.

So rejig the ctf-link machinery to do that.  Simultaneously (because
they are closely associated and were written at the same time), we add a
new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the
ctf_link machinery to create empty child dicts for each outgoing CU
mapping even if no CUs that correspond to it exist in the link.  This is
a bit (OK, quite a lot) of a waste of space, but some existing consumers
require it.  (Nobody else should use it.)

Its value is not consecutive with existing CTF_LINK flag values because
we're about to add more flags that are conceptually closer to the
existing ones than this one is.

include/
	* ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New.

libctf/
	* ctf-impl.h (ctf_file_t): Improve comments.
	<ctf_link_cu_mapping>: Split into...
	<ctf_link_in_cu_mapping>: ... this...
	<ctf_link_out_cu_mapping>: ... and this.
	* ctf-create.c (ctf_serialize): Adjust.
	* ctf-open.c (ctf_file_close): Likewise.
	* ctf-link.c (ctf_create_per_cu): Look things up in the
	in_cu_mapping instead of the cu_mapping.
	(ctf_link_add_cu_mapping): The deduplicating link will define
	what happens if many FROMs share a TO.
	(ctf_link_add_cu_mapping): Create in_cu_mapping and
	out_cu_mapping. Do not create ctf_link_outputs here any more, or
	create per-CU dicts here: they are already created when needed.
	(ctf_link_one_variable): Log a debug message if we skip a
	variable due to its type being concealed in a CU-mapped link.
	(This is probably too common a case to make into a warning.)
	(ctf_link): Create empty per-CU dicts if requested.
2020-07-22 18:02:18 +01:00
Nick Alcock
e3f17159e2 libctf, link: fix ctf_link_write fd leak
We were leaking the fd on every invocation.

libctf/
	* ctf-link.c (ctf_link_write): Close the fd.
2020-07-22 18:02:18 +01:00
Nick Alcock
8d2229ad1e libctf, link: add lazy linking: clean up input members: err/warn cleanup
This rather large and intertwined pile of changes does three things:

First, it transitions from dprintf to ctf_err_warn for things the user might
care about: this one file is the major impetus for the ctf_err_warn
infrastructure, because things like file names are crucial in linker
error messages, and errno values are utterly incapable of
communicating them

Second, it stabilizes the ctf_link APIs: you can now call
ctf_link_add_ctf without a CTF argument (only a NAME), to lazily
ctf_open the file with the given NAME when needed, and close it as soon
as possible, to save memory.  This is not an API change because a null
CTF argument was prohibited before now.

Since getting CTF directly from files uses ctf_open, passing in only a
NAME requires use of libctf, not libctf-nobfd.  The linker's behaviour
is unchanged, as it still passes in a ctf_archive_t as before.

This also let us fix a leak: we were opening ctf_archives and their
containing ctf_files, then only closing the files and leaving the
archives open.

Third, this commit restructures the ctf_link_in_member argument used by
the CTF linking machinery and adjusts its users accordingly.

We drop two members:

- arcname, which is difficult to construct and then only used in error
  messages (that were only dprintf()ed, so never seen!)
- share_mode, since we store the flags passed to ctf_link (including the
  share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold
  of it

We rename others whose existing names were fairly dreadful:

- done_main_member -> done_parent, using consistent terminology for .ctf
  as the parent of all archive members
- main_input_fp -> in_fp_parent, likewise
- file_name -> in_file_name, likewise

We add one new member, cu_mapped.

Finally, we move the various frees of things like mapping table data to
the top-level ctf_link, since deduplicating links will want to do that
too.

include/
	* ctf-api.h (ECTF_NEEDSBFD): New.
	(ECTF_NERR): Adjust.
	(ctf_link): Rename share_mode arg to flags.
libctf/
	* Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere.
	* Makefile.in: Regenerated.
	* ctf-impl.h (ctf_link_input_name): New.
	(ctf_file_t) <ctf_link_flags>: New.
	* ctf-create.c (ctf_serialize): Adjust accordingly.
	* ctf-link.c: Define ctf_open as weak when PIC.
	(ctf_arc_close_thunk): Remove unnecessary thunk.
	(ctf_file_close_thunk): Likewise.
	(ctf_link_input_name): New.
	(ctf_link_input_t): New value of the ctf_file_t.ctf_link_input.
	(ctf_link_input_close): Adjust accordingly.
	(ctf_link_add_ctf_internal): New, split from...
	(ctf_link_add_ctf): ... here.  Return error if lazy loading of
	CTF is not possible.  Change to just call...
	(ctf_link_add): ... this new function.
	(ctf_link_add_cu_mapping): Transition to ctf_err_warn.  Drop the
	ctf_file_close_thunk.
	(ctf_link_in_member_cb_arg_t) <file_name> Rename to...
	<in_file_name>: ... this.
	<arcname>: Drop.
	<share_mode>: Likewise (migrated to ctf_link_flags).
	<done_main_member>: Rename to...
	<done_parent>: ... this.
	<main_input_fp>: Rename to...
	<in_fp_parent>: ... this.
	<cu_mapped>: New.
	(ctf_link_one_type): Adjuwt accordingly.  Transition to
	ctf_err_warn, removing a TODO.
	(ctf_link_one_variable): Note a case too common to warn about.
	Report in the debug stream if a cu-mapped link prevents addition
	of a conflicting variable.
	(ctf_link_one_input_archive_member): Adjust.
	(ctf_link_lazy_open): New, open a CTF archive for linking when
	needed.
	(ctf_link_close_one_input_archive): New, close it again.
	(ctf_link_one_input_archive): Adjust for lazy opening, member
	renames, and ctf_err_warn transition.  Move the
	empty_link_type_mapping call to...
	(ctf_link): ... here.  Adjut for renamings and thunk removal.
	Don't spuriously fail if some input contains no CTF data.
	(ctf_link_write): ctf_err_warn transition.
	* libctf.ver: Remove not-yet-stable comment.
2020-07-22 18:02:18 +01:00
Nick Alcock
e148b73013 libctf: drop error-prone ctf_strerror
This utility function is almost useless (all it does is casts the result
of a strerror) but has a seriously confusing name.  Over and over again
I have accidentally called it instead of ctf_errmsg, and hidden a
time-bomb for myself in a hard-to-test error-handling path: since
ctf_strerror is just a strerror wrapper, it cannot handle CTF errnos,
unlike ctf_errmsg.  It's astonishingly lucky that none of these errors
have crept into any commits to date.

Fuse it into ctf_errmsg and drop it.

libctf/
	* ctf-impl.h (ctf_strerror): Delete.
	* ctf-subr.c (ctf_strerror): Likewise.
	* ctf-error.c (ctf_errmsg): Stop using ctf_strerror: just use
	strerror directly.
2020-07-22 18:02:18 +01:00
Nick Alcock
1fa7a0c24e libctf: sort out potential refcount loops
When you link TUs that contain conflicting types together, the resulting
CTF section is an archive containing many CTF dicts.  These dicts appear
in ctf_link_outputs of the shared dict, with each ctf_import'ing that
shared dict.  ctf_importing a dict bumps its refcount to stop it going
away while it's in use -- but if the shared dict (whose refcount is
bumped) has the child dict (doing the bumping) in its ctf_link_outputs,
we have a refcount loop, since the child dict only un-ctf_imports and
drops the parent's refcount when it is freed, but the child is only
freed when the parent's refcount falls to zero.

(In the future, this will be able to go wrong on the inputs too, when an
ld -r'ed deduplicated output with conflicts is relinked.  Right now this
cannot happen because we don't ctf_import such dicts at all.  This will
be fixed in a later commit in this series.)

Fix this by introducing an internal-use-only ctf_import_unref function
that imports a parent dict *witthout* bumping the parent's refcount, and
using it when we create per-CU outputs.  This function is only safe to
use if you know the parent cannot go away while the child exists: but if
the parent *owns* the child, as here, this is necessarily true.

Record in the ctf_file_t whether a parent was imported via ctf_import or
ctf_import_unref, so that if you do another ctf_import later on (or a
ctf_import_unref) it can decide whether to drop the refcount of the
existing parent being replaced depending on which function you used to
import that one.  Adjust ctf_serialize so that rather than doing a
ctf_import (which is wrong if the original import was
ctf_import_unref'fed), we just copy the parent field and refcount over
and forcibly flip the unref flag on on the old copy we are going to
discard.

ctf_file_close also needs a bit of tweaking to only close the parent if
it was not imported with ctf_import_unref: while we're at it, guard
against repeated closes with a refcount of zero and stop them causing
double-frees, even if destruction of things freed *inside*
ctf_file_close cause such recursion.

Verified no leaks or accesses to freed memory after all of this with
valgrind.  (It was leak-happy before.)

libctf/
	* ctf-impl.c (ctf_file_t) <ctf_parent_unreffed>: New.
	(ctf_import_unref): New.
	* ctf-open.c (ctf_file_close) Drop the refcount all the way to
	zero.  Don't recurse back in if the refcount is already zero.
	(ctf_import): Check ctf_parent_unreffed before deciding whether
	to close a pre-existing parent.  Set it to zero.
	(ctf_import_unreffed): New, as above, setting
	ctf_parent_unreffed to 1.
	* ctf-create.c (ctf_serialize): Do not ctf_import into the new
	child: use direct assignment, and set unreffed on the new and
	old children.
	* ctf-link.c (ctf_create_per_cu): Import the parent using
	ctf_import_unreffed.
2020-07-22 18:02:18 +01:00
Nick Alcock
3166467b00 libctf: rename the type_mapping_key to type_key
The name was just annoyingly long and I kept misspelling it.
It's also a bad name: it's not a mapping the type might be *used* in a
type mapping, but it is itself a representation of a type (a ctf_file_t
/ ctf_id_t pair), not of a mapping at all.

libctf/
	* ctf-impl.h (ctf_link_type_mapping_key): Rename to...
	(ctf_link_type_key): ... this, adjusting member prefixes to
	match.
	(ctf_hash_type_mapping_key): Rename to...
	(ctf_hash_type_key): ... this.
	(ctf_hash_eq_type_mapping_key): Rename to...
	(ctf_hash_eq_type_key): ... this.
	* ctf-hash.c (ctf_hash_type_mapping_key): Rename to...
	(ctf_hash_type_key): ... this, and adjust for member name
	changes.
	(ctf_hash_eq_type_mapping_key): Rename to...
	(ctf_hash_eq_type_key): ... this, and adjust for member name
	changes.
	* ctf-link.c (ctf_add_type_mapping): Adjust.  Note the lack of
	need for out-of-memory checking in this code.
	(ctf_type_mapping): Adjust.
2020-07-22 18:02:18 +01:00
Nick Alcock
43a61d7d3e libctf: check for vasprintf
We've been using this for all of libctf's history in binutils: we should
check for it in configure.

libctf/
	configure.ac: Check for vasprintf.
	configure: Regenerated.
	config.h.in: Likewise.
2020-07-22 18:02:18 +01:00
Nick Alcock
ac2ff76030 libctf, archive: fix bad error message
Get the function name right.

libctf/
	* ctf-archive.c (ctf_arc_bufopen): Fix message.
2020-07-22 18:02:18 +01:00
Nick Alcock
d50c08025d libctf, open: fix opening CTF in binaries with no symtab
This is a perfectly possible case, and half of ctf_bfdopen_ctfsect
handled it fine.  The other half hit a divide by zero or two before we
got that far, and had no code path to load the strtab from anywhere
in the absence of a symtab to point at it in any case.

So, as a fallback, if there is no symtab, try loading ".strtab"
explicitly by name, like we used to before we started looking for the
strtab the symtab used.

Of course, such a strtab is not kept hold of by BFD, so this means we
have to bring back the code to possibly explicitly free the strtab that
we read in.

libctf/
	* ctf-impl.h (struct ctf_archive_internal) <ctfi_free_strsect>
	New.
	* ctf-open-bfd.c (ctf_bfdopen_ctfsect): Explicitly open a strtab
	if the input has no symtab, rather than dividing by
	zero. Arrange to free it later via ctfi_free_ctfsect.
	* ctf-archive.c (ctf_new_archive_internal): Do not
	ctfi_free_strsect by default.
	(ctf_arc_close): Possibly free it here.
2020-07-22 18:02:18 +01:00
Nick Alcock
7044740174 libctf, dump: fix slice dumping
Now that we can have slices of anything terminating in an int, we must
dump things accordingly, or slices of typedefs appear as

  c5b: __u8 -> 16c: __u8 -> 78: short unsigned int (size 0x2)

which is unhelpful.  If things *are* printed as slices, the name is
missing:

  a15: [slice 0x8:0x4]-> 16c: __u8 -> 78: short unsigned int (size 0x2)

And struct members give no clue they're a slice at all, which is a shame
since bitfields are the major use of this type kind:

       [0x8] (ID 0xa15) (kind 10) __u8  dst_reg

Fix things so that everything slicelike or integral gets its encoding
printed, and everything with a name gets the name printed:

  a15: __u8  [slice 0x8:0x4] (size 0x1) -> 1ff: __u8 (size 0x1) -> 37: unsigned char [0x0:0x8] (size 0x1)
     [0x0] (ID 0xa15) (kind 10) __u8:4 (aligned at 0x1, format 0x2, offset:bits 0x8:0x4)

Bitfield struct members get a technically redundant but much
easier-to-understand dumping now:

    [0x0] (ID 0x80000005) (kind 6) struct bpf_insn (aligned at 0x1)
        [0x0] (ID 0x222) (kind 10) __u8 code (aligned at 0x1)
        [0x8] (ID 0x1e9e) (kind 10) __u8  dst_reg:4 (aligned at 0x1, format 0x2, offset:bits 0x8:0x4)
        [0xc] (ID 0x1e46) (kind 10) __u8  src_reg:4 (aligned at 0x1, format 0x2, offset:bits 0xc:0x4)
        [0x10] (ID 0xf35) (kind 10) __s16 off (aligned at 0x2)
        [0x20] (ID 0x1718) (kind 10) __s32 imm (aligned at 0x4)

This also fixes one place where a failure to format a type would be
erroneously considered an out-of-memory condition.

libctf/
	* ctf-dump.c (ctf_is_slice): Delete, unnecessary.
	(ctf_dump_format_type): improve slice formatting.  Always print
	the type size, even of slices.
	(ctf_dump_member): Print slices (-> bitfields) differently from
	non-slices.  Failure to format a type is not an OOM.
2020-07-22 18:02:18 +01:00
Nick Alcock
8e795b46f5 libctf, dump: migrate towards dumping errors rather than truncation
If we get an error emitting a single type, variable, or label, right now
we emit the error into the ctf_dprintf stream and propagate the error
all the way up the stack, causing the entire output to be silently
truncated (unless libctf debugging is on).

Instead, emit an error and keep going.  (This makes sense for this use
case: if you're dumping types and a type is corrupted, you want to
know!)

Not all instances of this are fixed in this commit, only ones associated
with type formatting: more fixes will come.

libctf/
	* ctf-dump.c (ctf_dump_format_type): Emit a warning.
	(ctf_dump_label): Swallow errors from ctf_dump_format_type.
	(ctf_dump_objts): Likewise.
	(ctf_dump_var): Likewise.
	(ctf_dump_type): Do not emit a duplicate message.  Move to
	ctf_err_warning, and swallow all errors.
2020-07-22 18:02:17 +01:00
Nick Alcock
b255b35feb libctf, decl: avoid leaks of the formatted string on error
ctf_decl_sprintf builds up a formatted string in the ctf_decl_t's
cd_buf, but then on error this is hardly ever freed: we assume that
ctf_decl_fini frees it, but it leaks it instead.

Make it free it like any decent ADT should.

libctf/
	* ctf-decl.c (ctf_decl_fini): Free the cd_buf.
	(ctf_decl_buf): Once it escapes, don't try to free it later.
2020-07-22 18:02:17 +01:00
Nick Alcock
c6e9a1e576 libctf, types: enhance ctf_type_aname to print function arg types
Somehow this never got implemented, which makes debugging any kind of
bug that has to do with argument types fantastically confusing, because
it *looks* like the func type takes no arguments though in fact it does.

This also lets us simplify the dumper slightly (and introduces our first
uses of ctf_assert and ctf_err_warn: there will be many more).

ctf_type_aname dumps function types without including the function
pointer name itself: ctf_dump search-and-replaces it in.  This seems to
give the nicest-looking results for existing users of both, even if it
is a bit fiddly.

libctf/
	* ctf-types.c (ctf_type_aname): Print arg types here...
	* ctf-dump.c (ctf_dump_funcs): ... not here: but do substitute
	in the type name here.
2020-07-22 18:02:17 +01:00
Nick Alcock
8b37e7b63e libctf, ld, binutils: add textual error/warning reporting for libctf
This commit adds a long-missing piece of infrastructure to libctf: the
ability to report errors and warnings using all the power of printf,
rather than being restricted to one errno value.  Internally, libctf
calls ctf_err_warn() to add errors and warnings to a list: a new
iterator ctf_errwarning_next() then consumes this list one by one and
hands it to the caller, which can free it.  New errors and warnings are
added until the list is consumed by the caller or the ctf_file_t is
closed, so you can dump them at intervals.  The caller can of course
choose to print only those warnings it wants.  (I am not sure whether we
want objdump, readelf or ld to print warnings or not: right now I'm
printing them, but maybe we only want to print errors?  This entirely
depends on whether warnings are voluminous things describing e.g. the
inability to emit single types because of name clashes or something.
There are no users of this infrastructure yet, so it's hard to say.)

There is no internationalization here yet, but this at least adds a
place where internationalization can be added, to one of
ctf_errwarning_next or ctf_err_warn.

We also provide a new ctf_assert() function which uses this
infrastructure to provide non-fatal assertion failures while emitting an
assert-like string to the caller: to save space and avoid needlessly
duplicating unchanging strings, the assertion test is inlined but the
print-things-out failure case is not.  All assertions in libctf will be
converted to use this machinery in future commits and propagate
assertion-failure errors up, so that the linker in particular cannot be
killed by libctf assertion failures when it could perfectly well just
print warnings and drop the CTF section.

include/
	* ctf-api.h (ECTF_INTERNAL): Adjust error text.
	(ctf_errwarning_next): New.
libctf/
	* ctf-impl.h (ctf_assert): New.
	(ctf_err_warning_t): Likewise.
	(ctf_file_t) <ctf_errs_warnings>: Likewise.
	(ctf_err_warn): New prototype.
	(ctf_assert_fail_internal): Likewise.
	* ctf-inlines.h (ctf_assert_internal): Likewise.
	* ctf-open.c (ctf_file_close): Free ctf_errs_warnings.
	* ctf-create.c (ctf_serialize): Copy it on serialization.
	* ctf-subr.c (ctf_err_warn): New, add an error/warning.
	(ctf_errwarning_next): New iterator, free and pass back
	errors/warnings in succession.
	* libctf.ver (ctf_errwarning_next): Add.
ld/
	* ldlang.c (lang_ctf_errs_warnings): New, print CTF errors
	and warnings.  Assert when libctf asserts.
	(lang_merge_ctf): Call it.
	(land_write_ctf): Likewise.
binutils/
	* objdump.c (ctf_archive_member): Print CTF errors and warnings.
	* readelf.c (dump_ctf_archive_member): Likewise.
2020-07-22 18:02:17 +01:00
Egeyar Bagcioglu
b7190c821e libctf, types: ensure the emission of ECTF_NOPARENT
ctf_variable_iter was returning a (positive!) error code rather than
setting the error in the passed-in ctf_file_t.

Reviewed-by: Nick Alcock <nick.alcock@oracle.com>

libctf/
	* ctf-types.c (ctf_variable_iter): Fix error return.
2020-07-22 18:01:51 +01:00
Nick Alcock
ec388c16cd libctf: error out on corrupt CTF with invalid header flags
If corrupt CTF with invalid header flags is passed in, return the new
error ECTF_FLAGS.

include/
	* ctf-api.h (ECTF_FLAGS): New.
	(ECTF_NERR): Adjust.
	* ctf.h (CTF_F_MAX): New.
libctf/
	* ctf-open.c (ctf_bufopen_internal): Diagnose invalid flags.
2020-07-22 17:57:54 +01:00
Nick Alcock
67d4cc671b libctf: pass the thunk down properly when wrapping qsort_r
When wrapping qsort_r on a system like FreeBSD on which the compar
argument comes first, we wrap the passed arg in a thunk so we can pass
down both the caller-supplied comparator function and its argument.  We
should pass the *argument* down to the comparator, not the thunk, which
is basically random nonsense on the stack from the point of view of the
caller of qsort_r.

libctf/
	ctf-decls.h (ctf_qsort_compar_thunk): Fix arg passing.
2020-07-22 17:57:52 +01:00
Nick Alcock
e28591b3df libctf, next, hash: add dynhash and dynset _next iteration
This lets you iterate over dynhashes and dynsets using the _next API.
dynhashes can be iterated over in sorted order, which works by
populating an array of key/value pairs using ctf_dynhash_next itself,
then sorting it with qsort.

Convenience inline functions named ctf_dyn{hash,set}_cnext are also
provided that take (-> return) const keys and values.

libctf/
	* ctf-impl.h (ctf_next_hkv_t): New, kv-pairs passed to
	sorting functions.
	(ctf_next_t) <u.ctn_sorted_hkv>: New, sorted kv-pairs for
	ctf_dynhash_next_sorted.
	<cu.ctn_h>: New, pointer to the dynhash under iteration.
	<cu.ctn_s>: New, pointer to the dynset under iteration.
	(ctf_hash_sort_f): Sorting function passed to...
	(ctf_dynhash_next_sorted): ... this new function.
	(ctf_dynhash_next): New.
	(ctf_dynset_next): New.
	* ctf-inlines.h (ctf_dynhash_cnext_sorted): New.
	(ctf_dynhash_cnext): New.
	(ctf_dynset_cnext): New.
	* ctf-hash.c (ctf_dynhash_next_sorted): New.
	(ctf_dynhash_next): New.
	(ctf_dynset_next): New.
	* ctf-util.c (ctf_next_destroy): Free the u.ctn_sorted_hkv if
	needed.
	(ctf_next_copy): Alloc-and-copy the u.ctn_sorted_hkv if needed.
2020-07-22 17:57:51 +01:00
Nick Alcock
688d28f621 libctf, next: introduce new class of easier-to-use iterators
The libctf machinery currently only provides one way to iterate over its
data structures: ctf_*_iter functions that take a callback and an arg
and repeatedly call it.

This *works*, but if you are doing a lot of iteration it is really quite
inconvenient: you have to package up your local variables into
structures over and over again and spawn lots of little functions even
if it would be clearer in a single run of code.  Look at ctf-string.c
for an extreme example of how unreadable this can get, with
three-line-long functions proliferating wildly.

The deduplicator takes this to the Nth level. It iterates over a whole
bunch of things: if we'd had to use _iter-class iterators for all of
them there would be twenty additional functions in the deduplicator
alone, for no other reason than that the iterator API requires it.

Let's do something better. strtok_r gives us half the design: generators
in a number of other languages give us the other half.

The *_next API allows you to iterate over CTF-like entities in a single
function using a normal while loop. e.g. here we are iterating over all
the types in a dict:

ctf_next_t *i = NULL;
int *hidden;
ctf_id_t id;

while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR)
  {
    /* do something with 'hidden' and 'id' */
  }
if (ctf_errno (fp) != ECTF_NEXT_END)
    /* iteration error */

Here we are walking through the members of a struct with CTF ID
'struct_type':

ctf_next_t *i = NULL;
ssize_t offset;
const char *name;
ctf_id_t membtype;

while ((offset = ctf_member_next (fp, struct_type, &i, &name,
                                  &membtype)) >= 0
  {
    /* do something with offset, name, and membtype */
  }
if (ctf_errno (fp) != ECTF_NEXT_END)
    /* iteration error */

Like every other while loop, this means you have access to all the local
variables outside the loop while inside it, with no need to tiresomely
package things up in structures, move the body of the loop into a
separate function, etc, as you would with an iterator taking a callback.

ctf_*_next allocates 'i' for you on first entry (when it must be NULL),
and frees and NULLs it and returns a _next-dependent flag value when the
iteration is over: the fp errno is set to ECTF_NEXT_END when the
iteartion ends normally.  If you want to exit early, call
ctf_next_destroy on the iterator.  You can copy iterators using
ctf_next_copy, which copies their current iteration position so you can
remember loop positions and go back to them later (or ctf_next_destroy
them if you don't need them after all).

Each _next function returns an always-likely-to-be-useful property of
the thing being iterated over, and takes pointers to parameters for the
others: with very few exceptions all those parameters can be NULLs if
you're not interested in them, so e.g. you can iterate over only the
offsets of members of a structure this way:

while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0)

If you pass an iterator in use by one iteration function to another one,
you get the new error ECTF_NEXT_WRONGFUN back; if you try to change
ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back.

Internally the ctf_next_t remembers the iteration function in use,
various sizes and increments useful for almost all iterations, then
uses unions to overlap the actual entities being iterated over to keep
ctf_next_t size down.

Iterators available in the public API so far (all tested in actual use
in the deduplicator):

/* Iterate over the members of a STRUCT or UNION, returning each member's
   offset and optionally name and member type in turn.  On end-of-iteration,
   returns -1.  */
ssize_t
ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
                 const char **name, ctf_id_t *membtype);

/* Iterate over the members of an enum TYPE, returning each enumerand's
   NAME or NULL at end of iteration or error, and optionally passing
   back the enumerand's integer VALue.  */
const char *
ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it,
              int *val);

/* Iterate over every type in the given CTF container (not including
   parents), optionally including non-user-visible types, returning
   each type ID and optionally the hidden flag in turn. Returns CTF_ERR
   on end of iteration or error.  */
ctf_id_t
ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag,
               int want_hidden);

/* Iterate over every variable in the given CTF container, in arbitrary
   order, returning the name and type of each variable in turn.  The
   NAME argument is not optional.  Returns CTF_ERR on end of iteration
   or error.  */
ctf_id_t
ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name);

/* Iterate over all CTF files in an archive, returning each dict in turn as a
   ctf_file_t, and NULL on error or end of iteration.  It is the caller's
   responsibility to close it.  Parent dicts may be skipped.  Regardless of
   whether they are skipped or not, the caller must ctf_import the parent if
   need be.  */
ctf_file_t *
ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it,
                  const char **name, int skip_parent, int *errp);

ctf_label_next is prototyped but not implemented yet.

include/
	* ctf-api.h (ECTF_NEXT_END): New error.
	(ECTF_NEXT_WRONGFUN): Likewise.
	(ECTF_NEXT_WRONGFP): Likewise.
	(ECTF_NERR): Adjust.
	(ctf_next_t): New.
	(ctf_next_create): New prototype.
	(ctf_next_destroy): Likewise.
	(ctf_next_copy): Likewise.
	(ctf_member_next): Likewise.
	(ctf_enum_next): Likewise.
	(ctf_type_next): Likewise.
	(ctf_label_next): Likewise.
	(ctf_variable_next): Likewise.

libctf/
	* ctf-impl.h (ctf_next): New.
	(ctf_get_dict): New prototype.
	* ctf-lookup.c (ctf_get_dict): New, split out of...
	(ctf_lookup_by_id): ... here.
	* ctf-util.c (ctf_next_create): New.
	(ctf_next_destroy): New.
	(ctf_next_copy): New.
	* ctf-types.c (includes): Add <assert.h>.
	(ctf_member_next): New.
	(ctf_enum_next): New.
	(ctf_type_iter): Document the lack of iteration over parent
	types.
	(ctf_type_next): New.
	(ctf_variable_next): New.
	* ctf-archive.c (ctf_archive_next): New.
	* libctf.ver: Add new public functions.
2020-07-22 17:57:50 +01:00
Nick Alcock
2399827bfa libctf: add ctf_ref
This allows you to bump the refcount on a ctf_file_t, so that you can
smuggle it out of iterators which open and close the ctf_file_t for you
around the loop body (like ctf_archive_iter).

You still can't use this to preserve a ctf_file_t for longer than the
lifetime of its containing entity (e.g. ctf_archive).

include/
	* ctf-api.h (ctf_ref): New.
libctf/
	* libctf.ver (ctf_ref): New.
	* ctf-open.c (ctf_ref): Implement it.
2020-07-22 17:57:49 +01:00