This flag was meant as an optimization to avoid reserializing dicts
unnecessarily. It was critically necessary back when serialization was
done by ctf_update() and you had to call that every time you wanted any
new modifications to the type table to be usable by other types, but
that has been unnecessary for years now, and serialization is only done
once when writing out, which one would naturally assume would always
serialize the dict. Worse, it never really worked: it only tracked
newly-added types, not things like added symbols which might equally
well require reserialization. It also gets in the way of an upcoming
change. Delete it entirely.
libctf/
* ctf-create.c (ctf_create): Drop LCTF_DIRTY.
(ctf_discard): Likewise.
(ctf_rollback): Likewise.
(ctf_add_generic): Likewise.
(ctf_set_array): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable_forced): Likewise.
* ctf-link.c (ctf_link_intern_extern_string): Likewise.
(ctf_link_add_strtab): Likewise.
* ctf-serialize.c (ctf_serialize): Likewise.
* ctf-impl.h (LCTF_DIRTY): Likewise.
(LCTF_LINKING): Renumber.
A mistaken "not" in ctf_err_warn made it seem like we only extracted
error messages if this was not an error.
libctf/
* ctf-subr.c (ctf_err_warn): Fix comment.
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of the two has been identical for a few years
now, the only difference being that types read in via ctf_open are
stored in one big buffer, while newly-added types live in per-type
hashtables.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
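A minimal sketch of the newly-permitted usage, using only public API
calls (the function name is made up): look up an existing type in a
ctf_open()ed/ctf_dict_open()ed dict and add a new pointer to it.

  #include <ctf-api.h>

  /* The addition lands in the dynamic portion of the opened dict.
     Modifying the pre-existing type itself (say, ctf_set_array on it)
     still fails with ECTF_RDONLY.  */
  static ctf_id_t
  add_pointer_to (ctf_dict_t *fp, const char *name)
  {
    ctf_id_t base, ptr;

    if ((base = ctf_lookup_by_name (fp, name)) == CTF_ERR)
      return CTF_ERR;			/* Not present at all.  */

    if ((ptr = ctf_add_pointer (fp, CTF_ADD_ROOT, base)) == CTF_ERR)
      return CTF_ERR;			/* Was ECTF_RDONLY before this change.  */

    return ptr;
  }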
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared (see the sketch after this list).
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
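A sketch of the ctf_lookup_variable fix from the list above (hypothetical
variable name; error handling abbreviated):

  /* A variable added with ctf_add_variable is now visible to
     ctf_lookup_variable at once, without a write-out/re-open cycle.  */
  static int
  add_and_find_var (ctf_dict_t *fp, ctf_id_t type)
  {
    if (ctf_add_variable (fp, "added_var", type) < 0)
      return -1;			/* ctf_errno (fp) is set.  */

    if (ctf_lookup_variable (fp, "added_var") == CTF_ERR)
      return -1;			/* Failed unconditionally before.  */

    return 0;
  }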
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of LCTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
The intent of the name lookup code was for lookups to yield non-bitfield
basic types except if none existed with a given name, and only then
return bitfield types with that name. Unfortunately, the code as
written only does this if the base type has a type ID higher than all
bitfield types, which is most unlikely (the opposite is almost always
the case).
Adjust it so that what ends up in the name table is the highest-width
zero-offset type with a given name, if any such exist, and failing that
the first type with that name we see, no matter its offset. (We don't
define *which* bitfield type you get, after all, so we might as well
just stuff in the first we find.)
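The expected post-fix behaviour, sketched with public API (names
hypothetical; the width check assumes host and target int widths match):

  /* Name lookup should return the full-width, zero-offset variant, not
     whichever bitfield variant happened to have the highest type ID.  */
  static int
  check_lookup (ctf_dict_t *fp)
  {
    ctf_id_t id;
    ctf_encoding_t en;

    if ((id = ctf_lookup_by_name (fp, "unsigned int")) == CTF_ERR)
      return -1;
    if (ctf_type_encoding (fp, id, &en) < 0)
      return -1;

    /* Expect the non-bitfield encoding: zero offset, full width.  */
    return (en.cte_offset == 0
	    && en.cte_bits == sizeof (unsigned int) * 8) ? 0 : -1;
  }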
Reported by Stephen Brennan <stephen.brennan@oracle.com>.
libctf/
* ctf-open.c (init_types): Modify to allow some lookups during open;
detect bitfield name reuse and prefer less bitfieldy types.
* testsuite/libctf-writable/libctf-bitfield-name-lookup.*: New test.
libctf internally maintains a set of hash tables for type name lookups,
one for each valid C type namespace (struct, union, enum, and everything
else).
Or, rather, it maintains *two* sets of hash tables: one, a ctf_hash *,
is meant for lookups in ctf_(buf)open()ed dicts with fixed content; the
other, a ctf_dynhash *, is meant for lookups in ctf_create()d dicts.
This distinction was somewhat valuable in the far pre-binutils past when
two different hashtable implementations were used (one expanding, the
other fixed-size), but those days are long gone: the hash table
implementations are almost identical, both wrappers around the libiberty
hashtab. The ctf_dynhash has many more capabilities than the ctf_hash
(iteration, deletion, etc etc) and has no downsides other than starting
at a fixed, arbitrary small size.
That limitation is easy to lift (via a new ctf_dynhash_create_sized()),
following which we can throw away nearly all the ctf_hash
implementation, and all the code to choose between readable and writable
hashtabs; the few convenience functions that are still useful (for
insertion of name -> type mappings) can also be generalized a bit so
that the extra string verification they do is potentially available to
other string lookups as well.
(libctf still has two hashtable implementations, ctf_dynhash, above,
and ctf_dynset, which is a key-only hashtab that can avoid a great many
malloc()s, used for high-volume applications in the deduplicator.)
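For libctf developers, the end state looks roughly like this (a sketch
against the internal ctf-impl.h API, so details may differ): each name
table is one expanding ctf_dynhash mapping name to type ID, regardless
of whether the dict was opened or created.

  /* Assumes the internal ctf-impl.h declarations.  Keys are strings
     interned elsewhere and values are type IDs stashed as pointer-sized
     integers, so no per-entry freeing is needed.  */
  static ctf_dynhash_t *
  make_name_table (void)
  {
    return ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
			       NULL, NULL);
  }

  static ctf_id_t
  name_to_type (ctf_dynhash_t *tab, const char *name)
  {
    return (ctf_id_t) (uintptr_t) ctf_dynhash_lookup (tab, name);
  }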
libctf/
* ctf-create.c (ctf_create): Eliminate ctn_writable.
(ctf_dtd_insert): Likewise.
(ctf_dtd_delete): Likewise.
(ctf_rollback): Likewise.
(ctf_name_table): Eliminate ctf_names_t.
* ctf-hash.c (ctf_dynhash_create): Comment update.
Reimplement in terms of...
(ctf_dynhash_create_sized): ... this new function.
(ctf_hash_create): Remove.
(ctf_hash_size): Remove.
(ctf_hash_define_type): Remove.
(ctf_hash_destroy): Remove.
(ctf_hash_lookup_type): Rename to...
(ctf_dynhash_lookup_type): ... this.
(ctf_hash_insert_type): Rename to...
(ctf_dynhash_insert_type): ... this, moving validation to...
* ctf-string.c (ctf_strptr_validate): ... this new function.
* ctf-impl.h (struct ctf_names): Extirpate.
(struct ctf_lookup.ctl_hash): Now a ctf_dynhash_t.
(struct ctf_dict): All ctf_names_t fields are now ctf_dynhash_t.
(ctf_name_table): Now returns a ctf_dynhash_t.
(ctf_lookup_by_rawhash): Remove.
(ctf_hash_create): Likewise.
(ctf_hash_insert_type): Likewise.
(ctf_hash_define_type): Likewise.
(ctf_hash_lookup_type): Likewise.
(ctf_hash_size): Likewise.
(ctf_hash_destroy): Likewise.
(ctf_dynhash_create_sized): New.
(ctf_dynhash_insert_type): New.
(ctf_dynhash_lookup_type): New.
(ctf_strptr_validate): New.
* ctf-lookup.c (ctf_lookup_by_name_internal): Adapt.
* ctf-open.c (init_types): Adapt.
(ctf_set_ctl_hashes): Adapt.
(ctf_dict_close): Adapt.
* ctf-serialize.c (ctf_serialize): Adapt.
* ctf-types.c (ctf_lookup_by_rawhash): Remove.
This cache replaced a cache of symbol index->ctf_id_t. That cache was
just an array, so it could get away with just being free()d, but the
ctfi_symnamedicts cache that replaced it is a full dynhash with a
dynamically-allocated string as the key. As such, it needs freeing with
ctf_dynhash_destroy(), not just free(), or we leak parts of the
underlying hashtab, and all the keys.
libctf/ChangeLog:
* ctf-archive.c (ctf_arc_flush_caches): Fix leak.
Seen with every compiler I have if using -fno-inline:
/home/alan/src/binutils-gdb/libctf/ctf-create.c: In function ‘ctf_add_encoded’:
/home/alan/src/binutils-gdb/libctf/ctf-create.c:555:3: warning: ‘encoding’ may be used uninitialized [-Wmaybe-uninitialized]
555 | memcpy (dtd->dtd_vlen, &encoding, sizeof (encoding));
Seen with gcc-4.9 and probably others at lower optimisation levels:
/home/alan/src/binutils-gdb/libctf/ctf-serialize.c: In function 'symtypetab_density':
/home/alan/src/binutils-gdb/libctf/ctf-serialize.c:211:18: warning: 'sym' may be used uninitialized in this function [-Wmaybe-uninitialized]
if (*max < sym->st_symidx)
Seen with gcc-4.5 and probably others at lower optimisation levels:
/home/alan/src/binutils-gdb/libctf/ctf-types.c:1649:21: warning: 'tp' may be used uninitialized in this function
/home/alan/src/binutils-gdb/libctf/ctf-link.c:765:16: warning: 'parent_i' may be used uninitialized in this function
Also with gcc-4.5:
In file included from /home/alan/src/binutils-gdb/libctf/ctf-endian.h:25:0,
from /home/alan/src/binutils-gdb/libctf/ctf-archive.c:24:
/home/alan/src/binutils-gdb/libctf/swap.h:70:0: warning: "_Static_assert" redefined
/usr/include/sys/cdefs.h:568:0: note: this is the location of the previous definition
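The swap.h fix is just a guard like the following (sketched with a
placeholder fallback body; the real definition in swap.h differs and
also has to cope with compilers that provide _Static_assert natively):

  #ifndef _Static_assert
  /* Pre-C11 fallback: declare an array with a negative size on failure.
     The identifier here is a placeholder, not the one used in swap.h.  */
  # define _Static_assert(cond, msg) \
      extern int static_assert_dummy_[(cond) ? 1 : -1]
  #endif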
* swap.h (_Static_assert): Don't define if already defined.
* ctf-serialize.c (symtypetab_density): Merge two
CTF_SYMTYPETAB_FORCE_INDEXED blocks.
* ctf-create.c (ctf_add_encoded): Avoid "encoding" may be used
uninitialized warning.
* ctf-link.c (ctf_link_deduplicating_open_inputs): Avoid
"parent_i" may be used uninitialized warning.
* ctf-types.c (ctf_type_rvisit): Avoid "tp" may be used
uninitialized warning.
Just because a path is an error path doesn't mean the program terminates
there if you don't ask it to. And we don't want it to -- but that means
the variables that would otherwise be left unset when an error happens
need to be initialized to *something*. Type ID 0 (unimplemented) will
do: it'll induce further
ECTF_BADID errors, but that's no bad thing.
libctf/ChangeLog:
* testsuite/libctf-writable/libctf-errors.c: Initialize variables.
Adds two new external authors to etc/update-copyright.py to cover
bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then
updates copyright messages as follows:
1) Update cgen/utils.scm emitted copyrights.
2) Run "etc/update-copyright.py --this-year" with an extra external
author I haven't committed, 'Kalray SA.', to cover gas testsuite
files (which should have their copyright message removed).
3) Build with --enable-maintainer-mode --enable-cgen-maint=yes.
4) Check out */po/*.pot which we don't update frequently.
When CTF finds conflicting types, it usually shoves each definition
into a CTF dictionary named after the compilation unit.
The intent of the obscure "cu-mapped link" feature is to allow you to
implement custom linkers that shove the definitions into other, more
coarse-grained units (say, one per kernel module, even if a module consists
of more than one compilation unit): all but (an arbitrary) one of the
conflicting types within one of these larger components are hidden from
name lookup, but they can still be found by chasing type graph links and
are still fully deduplicated.
You do this by calling
ctf_link_add_cu_mapping (fp, "CU name", "bigger lump name"), repeatedly,
with different "CU name"s: the ctf_link() following that will put all
conflicting types found in "CU name"s sharing a "bigger lump name" into a
child dict in an archive member named "bigger lump name".
It's clear enough what happens if you call it with the same "bigger lump
name" more than once, because that's the whole point of the feature: but
what if you call it with the same "CU name" repeatedly?
ctf_link_add_cu_mapping (fp, "CU name", "bigger lump name");
ctf_link_add_cu_mapping (fp, "CU name", "other name");
This is meant to be the same as just doing the second of these, as if the
first was never called. Alas, this isn't what happens, and what you get is
instead a bit of an inconsistent mess: more or less, the first takes
precedence, which is the exact opposite of what we wanted.
Fix this to work the right way round.
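In terms of the public API, the intended net effect is (names
hypothetical; error handling trimmed):

  /* After both calls, conflicting types from "cu.c" must land under
     "other name", exactly as if the first call had never been made.  */
  static int
  remap (ctf_dict_t *out)
  {
    if (ctf_link_add_cu_mapping (out, "cu.c", "bigger lump name") < 0
	|| ctf_link_add_cu_mapping (out, "cu.c", "other name") < 0)
      return -1;

    return ctf_link (out, CTF_LINK_SHARE_UNCONFLICTED);
  }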
(I plan to add support for CU-mapped links to GNU ld, mainly so that we can
properly *test* this machinery.)
libctf/ChangeLog:
* ctf-link.c (ctf_create_per_cu): Note the behaviour of
repeatedly adding FROMs.
(ctf_link_add_cu_mapping): Implement that behaviour.
The fixes applied a few years ago to resolve confusions between parent and
child dicts at lookup time also apply in various forms to creation. In
general, if you have a type in a parent dict ctf_imported into a child and
you do something to it, and the parent dict is writable (created via
ctf_create, not opened via ctf_open*) it should work just the same to make
changes to that type via a child dict as it does to make the change
to the parent dict directly -- and nothing you're prohibited from doing
to the parent dict when done directly should be allowed just because
you're doing it via a child.
Specifically, the following don't work when doing things from the child, but
should:
- adding a member of a type in the parent to a struct or union in the
parent via ctf_add_member or ctf_add_member_offset: this yields
ECTF_BADID
- adding a member of a type in the parent to a struct or union in the
parent via ctf_add_member_encoded: this dumps core (!).
- adding an enumerand to an enum in the parent: this yields
ECTF_BADID
- setting the properties of an array in the parent via ctf_set_array;
this yields ECTF_BADID
Relatedly, some things work when doing things via a child that should fail,
yielding a CTF dictionary with invalid content (readable, but meaningless):
in particular, you can add a child type to a struct in the parent via
any of the ctf_add_member* family and nothing complains at all, even though
you should never be able to add references to children to parents (since any
given parent can be associated with many different children).
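A condensed sketch of the cases involved, using only public API (error
handling mostly elided; this is not the testcase itself):

  #include <ctf-api.h>

  static int
  exercise (void)
  {
    ctf_dict_t *parent, *child;
    ctf_encoding_t e = { CTF_INT_SIGNED, 0, 32 };
    ctf_id_t pint, pstruct, cint;
    int err;

    if ((parent = ctf_create (&err)) == NULL
	|| (child = ctf_create (&err)) == NULL)
      return -1;
    if (ctf_import (child, parent) < 0)
      return -1;

    pint = ctf_add_integer (parent, CTF_ADD_ROOT, "int", &e);
    pstruct = ctf_add_struct (parent, CTF_ADD_ROOT, "s");
    cint = ctf_add_integer (child, CTF_ADD_ROOT, "child_int", &e);
    if (pint == CTF_ERR || pstruct == CTF_ERR || cint == CTF_ERR)
      return -1;

    /* Parent-typed member added to a parent struct via the child:
       previously ECTF_BADID, now expected to work.  */
    if (ctf_add_member (child, pstruct, "a", pint) < 0)
      return -1;

    /* Child-typed member added to a parent struct: previously "worked"
       and yielded an invalid dict, now expected to fail cleanly.  */
    if (ctf_add_member (child, pstruct, "b", cint) == 0)
      return -1;

    ctf_dict_close (child);
    ctf_dict_close (parent);
    return 0;
  }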
A family of tests is added to check each of these cases independently, since
some can result in coredumps and it would be nice to test the other cases
even if some dump core. They use a common library to do all the actual
work. The set of affected API calls was determined by code inspection
(auditing all calls to ctf_dtd_lookup): it's possible that I missed a few,
but I doubt it, since other cases use ctf_lookup* functions, which already
climb to the parent where appropriate.
libctf/ChangeLog:
PR libctf/30985
* ctf-create.c (ctf_dtd_lookup): Traverse to parents if necessary.
(ctf_set_array): Likewise. Report errors on the child; require
both parent and child to be writable.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise. Prohibit addition of child types
to structs in the parent.
(ctf_add_member_encoded): Do not dereference a NULL dtd: report
ECTF_BADID instead.
* ctf-string.c (ctf_str_add_ref_internal): Report ENOMEM on the
dict if addition of a string ref fails.
* testsuite/libctf-writable/parent-child-dtd-crash-lib.c: New library.
* testsuite/libctf-writable/parent-child-dtd-enum.*: New test.
* testsuite/libctf-writable/parent-child-dtd-enumerator.*: New test.
* testsuite/libctf-writable/parent-child-dtd-member-encoded.*: New test.
* testsuite/libctf-writable/parent-child-dtd-member-offset.*: New test.
* testsuite/libctf-writable/parent-child-dtd-set-array.*: New test.
* testsuite/libctf-writable/parent-child-dtd-struct.*: New test.
* testsuite/libctf-writable/parent-child-dtd-union.*: New test.
We do this as a writable test because the only known-affected platforms
(with ssize_t longer than unsigned long) use PE, and we do not have support
for CTF linkage in the PE linker yet.
PR libctf/30836
* libctf/testsuite/libctf-writable/libctf-errors.*: New test.
In commit 998a4f589d, all but one return
statement was updated to return the proper error value. This commit
rectifies that missed return statement.
libctf/
* ctf-types.c (ctf_type_resolve_unsliced): Return CTF_ERR on error.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Made sure there is no implicit conversion between signed and unsigned
return value for functions setting the ctf_errno value.
An example of the problem is that in ctf_member_next, the "offset" value
is either 0L or (ctf_id_t)-1L, but it should have been 0L or -1L.
The issue was discovered while building a 64 bit ld binary to be
executed on the Windows platform.
Example object file that demonstrates the issue is attached in the PR.
libctf/
Affected functions adjusted.
Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
Co-Authored-By: Yvan ROUX <yvan.roux@foss.st.com>
This regenerates config files changed by the previous 44 commits.
Note that subject lines in these commits mostly match the gcc git
originating commit.
This is a bug in the intersection of two obscure options (which cannot
even be invoked from ld) with a feature added to stop repeated ld'ing of
the same input file from crashing the linker.
The latter fix involved tracking input files (internally to libctf) not
just with their input CU name but with a version of their input CU name
that was augmented with a numeric suffix if their linker input file name
clashed, to prevent distinct CTF dicts with the same cuname from
overwriting each other. (We can't use just the linker input file name
because one linker input can contain many CU dicts, particularly under
ld -r). If these inputs then produced conflicting types, those types
were emitted into similarly-named output dicts, so we needed similar
machinery to detect clashing output dicts and add a numeric suffix to
them as well.
This works fine, except that if you used the cu-mapping feature to force
double-linking of CTF (so that your CTF can be grouped into output dicts
larger than a single translation unit) and then also used
CTF_LINK_EMPTY_CU_MAPPINGS to force every possible output dict in the
mapping to be created (even if empty), we did the creation of empty dicts
first, and then all the actual content got considered to be a clash. So
you ended up with a pile of useless empty dicts and then all the content
was in full dicts with the same names suffixed with a #0. This seems
likely to confuse consumers that use this facility.
Fixed by generating all the EMPTY_CU_MAPPINGS empty dicts after linking
is complete, not before it runs.
No impact on ld, which does not do cu-mapped links or pass
CTF_LINK_EMPTY_CU_MAPPINGS to ctf_link().
libctf/
* ctf-link.c (ctf_create_per_cu): Don't create new dicts if one
already exists and we are making one for no input in particular.
(ctf_link): Emit empty CTF dicts corresponding to no input in
particular only after linking is complete.
CTF dicts have per-dict errno values: as with other errno values these
are set on error and left unchanged on success. This means that all
errors *must* set the CTF errno: if a call leaves it unchanged, the
caller is apt to find a previous, lingering error and misinterpret
it as the real error.
There are many places in libctf where we carry out operations on parent
dicts as a result of carrying out other user-requested operations on
child dicts (e.g. looking up information on a pointer to a type will
look up the type as well: the pointer might well be in a child and the
type it's a pointer to in the parent). Those operations on the parent
might fail; if they do, the error must be correctly reflected on the
child that the user-visible operation was carried out on. In many
places this was not happening.
So, audit and fix all those places. Add tests for as many of those
cases as possible so they don't regress.
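The contract being enforced, roughly (hypothetical names, sketch only):

  #include <stdio.h>
  #include <ctf-api.h>

  /* If a lookup through CHILD fails because of something in its parent,
     ctf_errno on CHILD -- the dict the caller actually used -- must
     report the error.  */
  static void
  report (ctf_dict_t *child, ctf_id_t type)
  {
    ctf_membinfo_t mi;

    if (ctf_member_info (child, type, "no_such_member", &mi) < 0)
      fprintf (stderr, "lookup failed: %s\n",
	       ctf_errmsg (ctf_errno (child)));
  }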
libctf/
* ctf-create.c (ctf_add_slice): Use the original dict.
* ctf-lookup.c (ctf_lookup_variable): Propagate errors.
(ctf_lookup_symbol_idx): Likewise.
* ctf-types.c (ctf_member_next): Likewise.
(ctf_type_resolve_unsliced): Likewise.
(ctf_type_aname): Likewise.
(ctf_member_info): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_func_type_info): Set the error on the right dict.
(ctf_type_encoding): Use the original dict.
* testsuite/libctf-writable/error-propagation.*: New test.
The newly-introduced libctf-lookup unnamed-field-info test checks
C compiler-observed field offsets against libctf-computed ones
by #including the testcase in the lookup runner as well as
generating CTF for it. This only works if the host, on which
the lookup runner is compiled and executed, is the same architecture as
the target, for which the CTF is generated: when crossing, the trick
may fail.
So pass down an indication of whether this is a cross into the
testsuite, and add a new no_cross flag to .lk files that is used to
suppress test execution when a cross-compiler is being tested.
libctf/
* Makefile.am (check_DEJAGNU): Pass down TEST_CROSS.
* Makefile.in: Regenerated.
* testsuite/lib/ctf-lib.exp (run_lookup_test): Use it to
implement the new no_cross option.
* testsuite/libctf-lookup/unnamed-field-info.lk: Mark as
no_cross.
We were failing to add the offsets of the containing struct/union when
looking up fields nested inside unnamed struct/union members, leading to
all such offsets being relative to the unnamed struct/union itself.
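The shape of source construct affected looks like this (illustrative
only; the member offset assumes a typical 32-bit int):

  struct outer
  {
    int a;
    struct
    {
      int b;	/* ctf_member_info (fp, outer_id, "b", &mi) should now
		   report mi.ctm_offset as 32 bits (the offset within
		   struct outer), not 0 (the offset within the unnamed
		   member).  */
    };
  };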
libctf/
PR libctf/30264
* ctf-types.c (ctf_member_info): Add the offset of the unnamed
member of the current struct as necessary.
* testsuite/libctf-lookup/unnamed-field-info*: New test.
ctf_dedup's intern() function does not return a dynamically allocated
string, so I just spent ten minutes auditing for obvious memory leaks
that couldn't actually happen. Update the comment to note what it
actually returns (a pointer into an atoms table: i.e. possibly not
a new string, and not so easily leakable).
libctf/
* ctf-dedup.c (intern): Update comment.
GCC 11+ complains that sym is uninitialized in ctf_symbol_next. It
isn't, but it's not quite smart enough to figure that out (it requires
domain-specific knowledge of the state of the ctf_next_t iterator
over multiple calls).
libctf/
* ctf-lookup.c (ctf_symbol_next): Initialize sym to a suitable
value for returning if never reset during the function.
If no suitable qsort_r is found in libc, we fall back to an
implementation in ctf-qsort.c. But this implementation routinely calls
the comparison function with two identical arguments. The comparison
function that ensures that the order of output types is stable is not
ready for this: it misinterprets the call as a type appearing more than
once (a can-never-happen condition) and fails with an assertion failure.
Fixed, audited for further instances of the same failure (none found)
and added a no-qsort test to my regular testsuite run.
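The property the comparison function needs, in sketch form (the element
type here is hypothetical):

  typedef struct gid_elem { unsigned long gid; } gid_elem_t;

  static int
  cmp_gid (const void *a_, const void *b_, void *unused)
  {
    const gid_elem_t *a = a_;
    const gid_elem_t *b = b_;

    (void) unused;
    if (a == b)
      return 0;		/* Same element handed in twice: equal, not a
			   duplicate type, and not an assertion failure.  */
    return (a->gid > b->gid) - (a->gid < b->gid);
  }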
libctf/:
PR libctf/30013
* ctf-dedup.c (sort_output_mapping): Inputs are always equal to
themselves.
While trying to build gdb on latest openSUSE Tumbleweed, I noticed the
following warning,
checking for makeinfo... makeinfo --split-size=5000000
configure: WARNING:
*** Makeinfo is too old. Info documentation will not be built.
then I checked the version of makeinfo, it said,
======
$ makeinfo --version
texi2any (GNU texinfo) 7.0.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
======
After digging a little bit, it became quite obvious that a dot is
missing in the regexp, which makes it impossible to match versions
higher than 7.0; here's the solution:
- | egrep 'texinfo[^0-9]*(6\.[3-9]|[7-9][0-9])' >/dev/null 2>&1; then
+ | egrep 'texinfo[^0-9]*(6\.[3-9]|[7-9]\.[0-9])' >/dev/null 2>&1; then
However, Eli pointed out that the solution above has another problem: it
will stop working when Texinfo 10.1 is released. He suggested solving
the problem permanently instead: for Texinfo versions above 6.9 we don't
care about the minor version, only the major version.
In this way, the problem is resolved permanently, thanks to Eli.
libctf/ChangeLog:
* configure: Regenerated.
* configure.ac: Update regexp to match versions higher than 7.0.
This check has a pair of faults which, combined, can lead to memory
corruption. Firstly, it assumes that the values of the ctf_link_inputs
hash are ctf_dict_t's: they are not, they are ctf_link_input_t's, a much
shorter structure. So the flags check which is the core of this is
faulty (but happens, by chance, to give the right output on most
architectures, since usually we happen to get a 0 here, so the test that
checks this usually passes). Worse, the warning that is emitted when
the test fails is added to the wrong dict -- it's added to the input
dict, whose warning list is never consumed, rendering the whole check
useless. But the dict it adds to is still the wrong type, so we end up
overwriting something deep in memory (or, much more likely,
dereferencing a garbage pointer and crashing).
Fixing both reveals another problem: the link input is an *archive*
consisting of multiple members, so we have to consider whether to check
all of them for the outdated-func-info thing we are checking here.
However, no compiler exists that emits a mixture of members with this
flag on and members with it off, and the linker always reserializes (and
upgrades) such things when it sees them: so all members in a given
archive must have the same value of the flag, so we only need to check
one member per input archive.
libctf/
PR libctf/29983
* ctf-link.c (ctf_link_warn_outdated_inputs): Get the types of
members of ctf_link_inputs right, fixing a possible spurious
test failure / wild pointer deref / overwrite. Emit the
warning message into the right dict.
The libctf testsuite uses Tcl try/catch to trap run_output errors. This
is only supported in reasonably recent Tcls, so we detect the lack of
try/catch and suppress the testsuite via an Automake conditional in its
absence.
But this turns out not to work: Automake produces a check-DEJAGNU target
regardless of the value of this conditional and sticks it in an
unconditionally-executed part of the makefile, so the testsuite gets
executed anyway, and fails with a nasty-looking syntax error. We can't
disable it by taking "dejagnu" out of AUTOMAKE_OPTIONS, because if you
do that Automake stops you using RUNTEST, RUNTESTFLAGS and other
variables users would expect to work.
So move to disabling the testsuite from inside the testsuite itself,
importing the value of the former Automake conditional as a Tcl variable
and exiting very early in default.exp if it's false.
* configure.ac (TCL_TRY): No longer an Automake conditional.
Rename to...
(HAVE_TCL_TRY): ... this.
* Makefile.am: Drop TCL_TRY.
(development.exp): Set have_tcl_try.
* testsuite/config/default.exp: Exit if have_tcl_try is false.
* configure: Regenerated.
* Makefile.in: Likewise.
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
This patch is essentially a revert of
commit-id: 8818c80cbd
(libctf: Add ZSTD_LIBS to LIBS so that ac_cv_libctf_bfd_elf can be true)
As the specific configure check now uses libtool, this explicit mention of the
dependency $ZSTD_LIBS is not needed anymore.
ChangeLog:
* libctf/Makefile.in: Regenerated.
* libctf/aclocal.m4: Likewise.
* libctf/config.h.in: Likewise.
* libctf/configure: Likewise.
* libctf/configure.ac: Remove ZSTD_LIBS from LIBS. Cleanup
unused AC_ZSTD.
ACLOCAL_AMFLAGS is being set already. So using AC_CONFIG_MACRO_DIR is
unnecessary.
ChangeLog:
* libctf/configure: Regenerated.
* libctf/configure.ac: Remove AC_CONFIG_MACRO_DIR usage.
This dependency is managed via libtool. So explicit addition to LDFLAGS
and LIBS is not necessary anymore.
ChangeLog:
* libctf/configure: Regenerated.
* libctf/configure.ac: Remove zlib from LDFLAGS and LIBS.
The configure check for ELF support in BFD uses the AC_TRY_LINK macro. If
libbfd's dependencies change, this macro will need to be updated
manually with explicit additions to LDFLAGS and LIBS.
This patch updates the check to use libtool instead.
ChangeLog:
* libctf/configure.ac: Use libtool instead.
* libctf/configure: Regenerated.
gas uses ZSTD_compressStream2 which is only available with libzstd >=
1.4.0, leading to build errors when an older version is installed.
This patch updates the libzstd presence check to also verify that its
version is >= 1.4.0. However, since gas seems to be the only component
requiring
such a recent version this may imply that we disable ZSTD support for
all components although some would still benefit from an older
version.
I ran 'autoreconf -f' in all directories containing a configure.ac
file, using vanilla autoconf-2.69 and automake-1.15.1. I noticed
several errors from autoheader in readline, as well as warnings in
intl, but they are unrelated to this patch.
This should fix some of the buildbots.
We were failing to call prune_warnings appropriately, leading to
false-positive test failures on some platforms (observed on
sparclinux).
libctf/ChangeLog:
* testsuite/lib/ctf-lib.exp: Prune warnings from compiler and
linker output.
* testsuite/libctf-regression/libctf-repeat-cu.exp: Likewise,
and ar output too.
A missing paren meant that a cast intended to avoid dependence on the
size of size_t in one argument of ctf_err_warn was applied to the wrong
type by mistake.
libctf/ChangeLog:
* ctf-serialize.c (ctf_write_mem): Fix cast.
Right now, if you compile the same .c input repeatedly with CTF enabled
and different compilation flags, then arrange to link all of these
together, then things misbehave in various ways. libctf may conflate
either inputs (if the .o files have the same name, say if they are
stored in different .a archives), or per-CU outputs when conflicting
types are found: the latter can lead to entirely spurious errors when
it tries to produce multiple per-CU outputs with the same name
(discarding all but the last, but then looking for types in the earlier
ones which have just been thrown away).
Fixing this is multi-pronged. Both inputs and outputs need to be
differentiated in the hashtables libctf keeps them in: inputs with the
same cuname and filename need to be considered distinct as long as they
have different associated CTF dicts, and per-CU outputs need to be
considered distinct as long as they have different associated input
dicts. Right now there is nothing tying the two together other than the
CU name: fix this by introducing a new field in the ctf_dict_t named
ctf_link_in_out, which (for input dicts) points to the associated per-CU
output dict (if any), and for output dicts points to the associated
input dict. At creation time the name used is completely arbitrary:
it's only important that it be distinct if CTF dicts are distinct. So,
when a clash is found, adjust the CU name by sticking the number of
elements in the input on the end. At output time, the CU name will
appear in the linked object, so it matters a little more that it look
slightly less ugly: in conflicting cases, append an incrementing
integer, starting at 0.
This naming scheme is not very helpful, but it's hard to see what else
we can do. The input .o name may be the same. The input .a name is not
even visible to ctf_link, and even *that* might be the same, because
.a's can contain many members with the same name, all of which
participate in the link. All we really know is that the two have
distinct dictionaries with distinct types in them, and at least this way
they are all represented, and any symbols, variables, etc. referring to
those types are accurately stored.
(As a side-effect this also fixes a use-after-free and double-free when
errors are found during variable or symbol emission.)
Use the opportunity to head off a couple of sources of problems: we now
prohibit changing the active CU mappings when a link has already been
done (no effect on ld, which doesn't use CU mappings at all), and make
multiple consecutive ctf_link's have the same net effect as just
doing the last one (no effect on ld, which only ever does one
ctf_link) rather than having the links be a sort of half-incremental
not-really-intended mess.
libctf/ChangeLog:
PR libctf/29242
* ctf-impl.h (struct ctf_dict) [ctf_link_in_out]: New.
* ctf-dedup.c (ctf_dedup_emit_type): Set it.
* ctf-link.c (ctf_link_add_ctf_internal): Set the input
CU name uniquely when clashes are found.
(ctf_link_add): Document what repeated additions do.
(ctf_new_per_cu_name): New, come up with a consistent
name for a new per-CU dict.
(ctf_link_deduplicating): Use it.
(ctf_create_per_cu): Use it, and ctf_link_in_out, and set
ctf_link_in_out properly. Don't overwrite per-CU dicts with
per-CU dicts relating to different inputs.
(ctf_link_add_cu_mapping): Prevent per-CU mappings being set up
if we already have per-CU outputs.
(ctf_link_one_variable): Adjust ctf_link_per_cu call.
(ctf_link_deduplicating_one_symtypetab): Likewise.
(ctf_link_empty_outputs): New, delete all the ctf_link_outputs
and blank out ctf_link_in_out on the corresponding inputs.
(ctf_link): Clarify the effect of multiple ctf_link calls.
Empty ctf_link_outputs if it already exists rather than
having the old output leak into the new link. Fix a variable
name.
* testsuite/config/default.exp (AR): Add.
(OBJDUMP): Likewise.
* testsuite/libctf-regression/libctf-repeat-cu.exp: New test.
* testsuite/libctf-regression/libctf-repeat-cu*: Main program,
library, and expected results for the test.
When two types conflict and they are not types which can have forwards
(say, two arrays of different sizes with the same name in two different
TUs) the CTF deduplicator uses a popularity contest to decide what to
do: the type cited by the most other types ends up put into the shared
dict, while the others are relegated to per-CU child dicts.
This works well as long as one type *is* most popular -- but what if
there is a tie? If several types have the same popularity count,
we end up picking the first we run across and promoting it, and
unfortunately since we are working over a dynhash in essentially
arbitrary order, this means we promote a random one. So multiple
runs of ld with the same inputs can produce different outputs!
All the outputs are valid, but this is still undesirable.
Adjust things to use the same strategy used to sort types on the output:
when there is a tie, always put the type that appears in a CU that
appeared earlier on the link line (and if there is somehow still a tie,
which should be impossible, pick the type with the lowest type ID).
Add a testcase -- and since this emerged when trying out extern arrays,
check that those work as well (this requires a newer GCC, but since all
GCCs that can emit CTF at all are unreleased this is probably OK as
well).
Fix up one testcase that has slight type ordering changes as a result
of this change.
libctf/ChangeLog:
* ctf-dedup.c (ctf_dedup_detect_name_ambiguity): Use
cd_output_first_gid to break ties.
ld/ChangeLog:
* testsuite/ld-ctf/array-conflicted-ordering.d: New test, using...
* testsuite/ld-ctf/array-char-conflicting-1.c: ... this...
* testsuite/ld-ctf/array-char-conflicting-2.c: ... and this.
* testsuite/ld-ctf/array-extern.d: New test, using...
* testsuite/ld-ctf/array-extern.c: ... this.
* testsuite/ld-ctf/conflicting-typedefs.d: Adjust for ordering
changes.
My previous nm patch handled all cases but one -- if the user set NM in
the environment to a path which contained an option, libtool's nm
detection tries to run nm against a copy of nm with the options in it:
e.g. if NM was set to "nm --blargle", and nm was found in /usr/bin, the
test would try to run "/usr/bin/nm --blargle /usr/bin/nm --blargle".
This is unlikely to be desirable: in this case we should run
"/usr/bin/nm --blargle /usr/bin/nm".
Furthermore, as part of this, the nm detection has to notice when the
passed-in $NM contains a path, and in that case avoid doing a path
search itself.
This too was thrown off if an option contained something that looked
like a path, e.g. NM="nm -B../prev-gcc"; libtool then tries to run
"nm -B../prev-gcc nm" which rarely works well (and indeed it looks
to see whether that nm exists, finds it doesn't, and wrongly concludes
that nm -p or whatever does not work).
Fix all of these by clipping all options (defined as everything
including and after the first " -") before deciding whether nm
contains a path (but not using the clipped value for anything else),
and then removing all options from the path-modified nm before
looking to see whether that nm existed.
NM=my-nm now does a path search and runs e.g.
/usr/bin/my-nm -B /usr/bin/my-nm
NM=/usr/bin/my-nm now avoids a path search and runs e.g.
/usr/bin/my-nm -B /usr/bin/my-nm
NM="my-nm -p../wombat" now does a path search and runs e.g.
/usr/bin/my-nm -p../wombat -B /usr/bin/my-nm
NM="../prev-binutils/new-nm -B../prev-gcc" now avoids a path search:
../prev-binutils/new-nm -B../prev-gcc -B ../prev-binutils/new-nm
This seems to cover all combinations, including those used by GCC bootstrap
(which, before this commit, fails to bootstrap when configured
--with-build-config=bootstrap-lto, because the lto plugin is now using
--export-symbols-regex, which requires libtool to find a working nm,
while also using -B../prev-gcc to point at the lto plugin associated
with the GCC just built.)
Regenerate all affected configure scripts.
* libtool.m4 (LT_PATH_NM): Handle user-specified NM with
options, including options containing paths.
libctf has always handled endianness differences by detecting
foreign-endian CTF dicts on the input and endian-flipping them: dicts
are always written in native endianness. This makes endian-awareness
very low overhead, but it means that the foreign-endian code paths
almost never get routinely tested, since "make check" usually reads in
dicts ld has just written out: only a few corrupted-CTF tests are
actually in fixed endianness, and even they only test the foreign-
endian code paths when you run make check on a big-endian machine.
(And the fix is surely not to add more .s-based tests like that, because
they are a nightmare to maintain compared to the C-code-based ones.)
To improve on this, add a new environment variable,
LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally
endian-flip at ctf_write time, so the output is always in the wrong
endianness. This then tests the foreign-endian read paths properly
at open time.
Make this easier by restructuring the writeout code in ctf-serialize.c,
which duplicates the maybe-gzip-and-write-out code three times (once
for ctf_write_mem, with thresholding, and once each for
ctf_compress_write and ctf_write just so those can avoid thresholding
and/or compression). Instead, have the latter two call the former
with thresholds of 0 or (size_t) -1, respectively.
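A sketch of the resulting structure (the helper name is made up; short
writes and error handling are only lightly handled):

  #include <unistd.h>
  #include <stdlib.h>
  #include <ctf-api.h>

  static int
  write_thresholded (ctf_dict_t *fp, int fd, size_t threshold)
  {
    unsigned char *buf, *p;
    size_t len;

    if ((buf = ctf_write_mem (fp, &len, threshold)) == NULL)
      return -1;			/* ctf_errno (fp) is set.  */

    for (p = buf; len > 0; )
      {
	ssize_t ret = write (fd, p, len);
	if (ret < 0)
	  {
	    free (buf);
	    return -1;
	  }
	p += ret;
	len -= ret;
      }
    free (buf);
    return 0;
  }

  /* ctf_compress_write: threshold 0 (always compress);
     ctf_write: threshold (size_t) -1 (never compress).  */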
The endian-flipping code itself gains a bit of complexity, because
one single endian-flipper (flip_types) was assuming the input to be
in foreign-endian form and assuming it could pull things out of the
input once they had been flipped and make sense of them. At the
cost of a few lines of duplicated initializations, teach it to
read before flipping if we're flipping to foreign-endianness instead
of away from it.
libctf/
* ctf-impl.h (ctf_flip_header): No longer static.
(ctf_flip): Likewise.
* ctf-open.c (flip_header): Rename to...
(ctf_flip_header): ... this, now it is not private to one file.
(flip_ctf): Rename...
(ctf_flip): ... this too. Add FOREIGN_ENDIAN arg.
(flip_types): Likewise. Use it.
(ctf_bufopen_internal): Adjust calls.
* ctf-serialize.c (ctf_write_mem): Add flip_endian path via
a newly-allocated bounce buffer.
(ctf_compress_write): Move below ctf_write_mem and reimplement
in terms of it.
(ctf_write): Likewise.
(ctf_gzwrite): Note that this obscure writeout function does not
support endian-flipping.
The last section in a CTF dict is the string table, at an offset
represented by the cth_stroff header field. Its length is recorded in
the next field, cth_strlen, and the two added together are taken as the
size of the CTF dict. Upon opening a dict, we check that none of the
header offsets exceed this size, and we check when uncompressing a
compressed dict that the result of the uncompression is the same length:
but CTF dicts need not be compressed, and short ones are not.
Uncompressed dicts just use the ctf_size without checking it. This
field is thankfully almost unused: it is mostly used when reserializing
a dict, which can't be done to dicts read off disk since they're
read-only.
However, when opening an uncompressed foreign-endian dict we have to
copy it out of the mmaped region it is stored in so we can endian-
swap it, and we use ctf_size when doing that. When the cth_strlen is
corrupt, this can overrun.
Fix this by checking the ctf_size in all uncompressed cases, just as we
already do in the compressed case. Add a new test.
This came to light because various corrupted-CTF raw-asm tests had an
incorrect cth_strlen: fix all of them so they produce the expected
error again.
libctf/
PR libctf/28933
* ctf-open.c (ctf_bufopen_internal): Always check uncompressed
CTF dict sizes against the section size in case the cth_strlen is
corrupt.
ld/
PR libctf/28933
* testsuite/ld-ctf/diag-strlen-invalid.*: New test,
derived from diag-cttname-invalid.s.
* testsuite/ld-ctf/diag-cttname-invalid.s: Fix incorrect cth_strlen.
* testsuite/ld-ctf/diag-cttname-null.s: Likewise.
* testsuite/ld-ctf/diag-cuname.s: Likewise.
* testsuite/ld-ctf/diag-parlabel.s: Likewise.
* testsuite/ld-ctf/diag-parname.s: Likewise.
The CTF variable section is an optional (usually-not-present) section in
the CTF dict which contains name -> type mappings corresponding to data
symbols that are present in the linker input but not in the output
symbol table: the idea is that programs that use their own symbol-
resolution mechanisms can use this section to look up the types of
symbols they have found using their own mechanism.
Because these removed symbols (mostly static variables, functions, etc)
all have names that are unlikely to appear in the ELF symtab and because
very few programs have their own symbol-resolution mechanisms, a special
linker flag (--ctf-variables) is needed to emit this section.
Historically, we emitted only removed data symbols into the variable
section. This seemed to make sense at the time, but in hindsight it
really doesn't: functions are symbols too, and a C program can look them
up just like any other type. So extend the variable section so that it
contains all static function symbols too (if it is emitted at all), with
types of kind CTF_K_FUNCTION.
This is a little fiddly. We relied on compiler assistance for data
symbols: the compiler simply emits all data symbols twice, once into the
symtypetab as an indexed symbol and once into the variable section.
Rather than wait for a suitably adjusted compiler that does the same for
function symbols, we can pluck unreported function symbols out of the
symtab and add them to the variable section ourselves. While we're at
it, we do the same with data symbols: this is redundant right now
because the compiler does it, but it costs very little time and lets the
compiler drop this kludge and save a little space in .o files.
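For a consumer with its own symbol-resolution machinery, the upshot is
(hypothetical symbol name; sketch only):

  /* With a link done under --ctf-variables, the type of a static
     function that never made it into the output symtab can now be found
     via the variable section, just like a static data symbol.  */
  static int
  find_static_fn (ctf_dict_t *fp)
  {
    ctf_id_t fn = ctf_lookup_variable (fp, "my_static_helper");

    if (fn == CTF_ERR)
      return -1;
    return ctf_type_kind (fp, fn) == CTF_K_FUNCTION ? 0 : -1;
  }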
include/
* ctf.h: Mention the new things we can see in the variable
section.
ld/
* testsuite/ld-ctf/data-func-conflicted-vars.d: New test.
libctf/
* ctf-link.c (ctf_link_deduplicating_variables): Duplicate
symbols into the variable section too.
* ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename
to...
(symtypetab_delete_nonstatics): ... this. Check the funchash
when pruning redundant variables.
(ctf_symtypetab_sect_sizes): Adjust accordingly.
* NEWS: Describe this change.