binutils-gdb

mirror of https://sourceware.org/git/binutils-gdb.git synced 2025-03-01 13:26:47 +08:00

Author	SHA1	Message	Date
Nick Alcock	483546ce4f	libctf: make ctf_serialize() actually serialize ctf_serialize() evolved from the old ctf_update(), which mutated the in-memory CTF dict to make all the dynamic in-memory types into static, unchanging written-to-the-dict types (by deserializing and reserializing it): back in the days when you could only do type lookups on static types, this meant you could see all the types you added recently, at the small, small cost of making it impossible to change those older types ever again and inducing an amortized O(n^2) cost if you actually wanted to add references to types you added at arbitrary times to later types. It also reset things so that ctf_discard() would throw away only types you added after the most recent ctf_update() call. Some time ago this was all changed so that you could look up dynamic types just as easily as static types: ctf_update() changed so that only its visible side-effect of affecting ctf_discard() remained: the old ctf_update() was renamed to ctf_serialize(), made internal to libctf, and called from the various functions that wrote files out. ... but it was still working by serializing and deserializing the entire dict, swapping out its guts with the newly-serialized copy in an invasive and horrible fashion that coupled ctf_serialize() to almost every field in the ctf_dict_t. This is totally useless, and fixing it is easy: just rip all that code out and have ctf_serialize return a serialized representation, and let everything use that directly. This simplifies most of its callers significantly. (It also points up another bug: ctf_gzwrite() failed to call ctf_serialize() at all, so it would only ever work for a dict you just ctf_write_mem()ed yourself, just for its invisible side-effect of serializing the dict!) This lets us simplify away a bunch of internal-only open-side functionality for overriding the syn_ext_strtab and some just-added functionality for forcing in an existing atoms table, without loss of functionality, and lets us lift the restriction on reserializing a dict that was ctf_open()ed rather than being ctf_create()d: it's now perfectly OK to open a dict, modify it (except for adding members to existing structs, unions, or enums, which fails with -ECTF_RDONLY), and write it out again, just as one would expect. libctf/ * ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos. (ctf_type_sect_size): Add static type sizes too. (ctf_serialize): Return the new dict rather than updating the existing dict. No longer fail for dicts with static types; copy them onto the start of the new types table. (ctf_gzwrite): Actually serialize before gzwriting. (ctf_write_mem): Improve forced (test-mode) endian-flipping: flip dicts even if they are too small to be compressed. Improve confusing variable naming. * ctf-archive.c (arc_write_one_ctf): Don't bother to call ctf_serialize: both the functions we call do so. * ctf-string.c (ctf_str_create_atoms): Drop serializing case (atoms arg). * ctf-open.c (ctf_simple_open): Call ctf_bufopen directly. (ctf_simple_open_internal): Delete. (ctf_bufopen_internal): Delete/rename to ctf_bufopen: no longer bother with syn_ext_strtab or forced atoms table, serialization no longer needs them. * ctf-create.c (ctf_create): Call ctf_bufopen directly. * ctf-impl.h (ctf_str_create_atoms): Drop atoms arg. (ctf_simple_open_internal): Delete. (ctf_bufopen_internal): Likewise. (ctf_serialize): Adjust. * testsuite/libctf-lookup/add-to-opened.c: Adjust now that this is supposed to work.	2024-04-19 16:14:47 +01:00
Nick Alcock	cf9da3b0b6	libctf: rethink strtab writeout This commit finally adjusts strtab writeout so that repeated writeouts, or writeouts of a dict that was read in earlier, only sorts the portion of the strtab that was newly added. There are three intertwined changes here: - pull the contents of strtabs from newly ctf_bufopened dicts into the atoms table, so that future additions will reuse the existing offset etc rather than adding new identical strings - allow the internal ctf_bufopen done by serialization to contribute its existing atoms table, so that existing atoms can be used for the remainder of the open process (like name table construction): this atoms table currente gets thrown away in the mass reassignment done later in ctf_serialize in any case, but it needs to be there during the open. - rewrite ctf_str_write_strtab so that a) it uses iterators rather than ctf__iter, reducing pointless structures which serve no other purpose than to implement ordinary variable scope, but more clunkily, and b) retains the existing strtab on the front of the new one, with its sort retained, rather than resorting, so all existing already-written strtab offsets remain valid across the call. This latter change finally permits repeated serializations, and reserializations of ctf_open()ed dicts, to work, but for now we keep the code that prevents that because serialization is about to change again in a way that will make it more obvious that doing such things is safe, and we can take it out then. (There are also some smaller changes like moving the purge of the refs table into ctf_str_write_strtab(), since that's where the changes happen that invalidate it, rather than doing it in ctf_serialize(). We also prohibit something that has never worked, opening a dict and then reporting symbols to it via ctf_link_add_strtab() et al: you must do that to newly-created dicts which have had stuff ctf_link()ed into them. This is very unlikely ever to be a problem in practice: linkers just don't do that sort of thing.) libctf/ ctf-create.c (ctf_create): Add (temporary) atoms arg. * ctf-impl.h (struct ctf_dict.ctf_dynstrtab): New. (ctf_str_create_atoms): Adjust. (ctf_str_write_strtab): Likewise. (ctf_simple_open_internal): Likewise. * ctf-open.c (ctf_simple_open_internal): Add atoms arg. (ctf_bufopen): Likewise. (ctf_bufopen_internal): Initialize just enough of an atoms table: pre-init from the atoms arg if supplied. (ctf_simple_open): Adjust. * ctf-serialize.c (ctf_serialize): Constify the strtab. Move ref list purging into ctf_str_write_strtab. Initialize the new dict with the old dict's atoms table. Accept the new strtab from ctf_str_write_strtab. Adjust for addition of ctf_dynstrtab. * ctf-string.c (ctf_strraw_explicit): Improve comments. (ctf_str_create_atoms): Prepopulate from an existing atoms table, or alternatively pull in all strings from the strtab and turn them into atoms. (ctf_str_free_atoms): Free the dynstrtab and its strtab. (struct ctf_strtab_write_state): Remove. (ctf_str_count_strtab): Fold this... (ctf_str_populate_sorttab): ... and this... (ctf_str_write_strtab): ... into this. Prepend existing strings to the strtab rather than resorting them (and wrecking their offsets). Keep the dynstrtab updated. Update refs for all atoms with refs, whether or not they are strings newly added to the strtab.	2024-04-19 16:14:47 +01:00
Nick Alcock	149ce5c263	libctf: replace 'pending refs' abstraction A few years ago we introduced a 'pending refs' abstraction to fix one problem: serializing a dict, then changing it would tend to corrupt the dict because the strtab sort we do on strtab writeout (to improve compression efficiency) would modify the offset of any strings that sorted lexicographically earlier in the strtab: so we added a new restriction that all strings are added only at serialization time, and maintained a set of 'pending' refs that were added earlier, whose offsets we could update (like other refs) at writeout time. This was in hindsight seriously problematic for maintenance (because serialization has to traverse all strings in all datatypes in the entire dict), and has become impossible to sustain now that we can read in existing dicts, modify them, and reserialize them again. We really don't want to have to dig through the entire dict we jut read in just in order to dig out all its strtab offsets, then change it, just for the sake of a sort that adds a frankly trivial amount of compression efficiency. Sorting is still worthwhile -- but it sacrifices very little to only sort newly-added portions of the strtab, reusing older portions as necessary. As a first stage in this, discard the whole "pending refs" abstraction and replace it with "movable" refs, which are exactly like all other refs (addresses containing the strtab offset of some string, which are updated wiht the final strtab offset on serialization) except that we track them in a reverse dict so that we can move the refs around (which we do whenever we realloc() a buffer containing a bunch of structure members or something when we add members to the structure). libctf/ * ctf-create.c (ctf_add_enumerator): Call ctf_str_move_refs; add a movable ref. (ctf_add_member_offset): Likewise. * ctf-util.c (ctf_realloc): Delete. * ctf-serialize.c (ctf_serialize): No longer use it. Adjust to new fields. * ctf-string.c (ctf_str_purge_atom_refs): Purge movable refs. (ctf_str_free_atom): Free freeable atoms' strings. (ctf_str_create_atoms): Create the movable refs dynhash if needed. (ctf_str_free_atoms): Destroy it. (CTF_STR_MOVABLE): Switch (back) from ints to flags (see previous reversion). Add new flag. (aref_create): New, populate movable refs if need be. (ctf_str_add_ref_internal): Switch back to flags, update refs directly for nonprovisional strings (with already-known fixed offsets); create refs via aref_create. Allocate strings only if not within an mmapped strtab. (ctf_str_add_movable_ref): New. (ctf_str_add): Adjust to CTF_STR_* reintroduction. (ctf_str_add_external): LIkewise. (ctf_str_move_refs): New, move refs via ctf_str_movable_refs backpointer. (ctf_str_purge_refs): Drop ctf_str_num_refs. (ctf_str_update_refs): Fix indentation. * ctf-impl.h (struct ctf_str_atom_movable): New. (struct ctf_dict.ctf_str_num_refs): Drop. (struct ctf_dict.ctf_str_movable_refs): New. (ctf_str_add_movable_ref): Declare. (ctf_str_move_refs): Likewise. (ctf_realloc): Drop.	2024-04-19 16:14:46 +01:00
Nick Alcock	3301ddba1b	Revert "libctf: do not corrupt strings across ctf_serialize" This reverts commit `986e9e3aa0`. (We do not revert the testcase -- it remains valid -- but we are taking a different, less complex and more robust approach.) This also deletes the pending refs abstraction without (yet) replacing it, so some tests will fail for a commit or two.	2024-04-19 16:14:46 +01:00
Nick Alcock	4fa4e3d92a	libctf: delete LCTF_DIRTY This flag was meant as an optimization to avoid reserializing dicts unnecessarily. It was critically necessary back when serialization was done by ctf_update() and you had to call that every time you wanted any new modifications to the type table to be usable by other types, but that has been unnecessary for years now, and serialization is only done once when writing out, which one would naturally assume would always serialize the dict. Worse, it never really worked: it only tracked newly-added types, not things like added symbols which might equally well require reserialization, and it gets in the way of an upcoming change. Delete entirely. libctf/ * ctf-create.c (ctf_create): Drop LCTF_DIRTY. (ctf_discard): Likewise. (ctf_rollback): Likewise. (ctf_add_generic): Likewise. (ctf_set_array): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable_forced): Likewise. * ctf-link.c (ctf_link_intern_extern_string): Likewise. (ctf_link_add_strtab): Likewise. * ctf-serialize.c (ctf_serialize): Likewise. * ctf-impl.h (LCTF_DIRTY): Likewise. (LCTF_LINKING): Renumber.	2024-04-19 16:14:46 +01:00
Nick Alcock	8a60c93096	libctf: support addition of types to dicts read via ctf_open() libctf has long declared deserialized dictionaries (out of files or ELF sections or memory buffers or whatever) to be read-only: back in the furthest prehistory this was not the case, in that you could add a few sorts of type to such dicts, but attempting to do so often caused horrible memory corruption, so I banned the lot. But it turns out real consumers want it (notably DTrace, which synthesises pointers to types that don't have them and adds them to the ctf_open()ed dicts if it needs them). Let's bring it back again, but without the memory corruption and without the massive code duplication required in days of yore to distinguish between static and dynamic types: the representation of both types has been identical for a few years, with the only difference being that types as a whole are stored in a big buffer for types read in via ctf_open and per-type hashtables for newly-added types. So we discard the internally-visible concept of "readonly dictionaries" in favour of declaring the range of types that were already present when the dict was read in to be read-only: you can't modify them (say, by adding members to them if they're structs, or calling ctf_set_array on them), but you can add more types and point to them. (The API remains the same, with calls sometimes returning ECTF_RDONLY, but now they do so less often.) This is a fairly invasive change, mostly because code written since the ban was introduced didn't take the possibility of a static/dynamic split into account. Some of these irregularities were hard to define as anything but bugs. Notably: - The symbol handling was assuming that symbols only needed to be looked for in dynamic hashtabs or static linker-laid-out indexed/ nonindexed layouts, but now we want to check both in case people added more symbols to a dict they opened. - The code that handles type additions wasn't checking to see if types with the same name existed at all (so you could do ctf_add_typedef (fp, "foo", bar) repeatedly without error). This seems reasonable for types you just added, but we probably do want to ban addition of types with names that override names we already used in the ctf_open()ed portion, since that would probably corrupt existing type relationships. (Doing things this way also avoids causing new errors for any existing code that was doing this sort of thing.) - ctf_lookup_variable entirely failed to work for variables just added by ctf_add_variable: you had to write the dict out and read it back in again before they appeared. - The symbol handling remembered what symbols you looked up but didn't remember their types, so you could look up an object symbol and then find it popping up when you asked for function symbols, which seems less than ideal. Since we had to rejig things enough to be able to distinguish function and object symbols internally anyway (in order to give suitable errors if you try to add a symbol with a name that already existed in the ctf_open()ed dict), this bug suddenly became more visible and was easily fixed. We do not (yet) support writing out dicts that have been previously read in via ctf_open() or other deserializer (you can look things up in them, but not write them out a second time). This never worked, so there is no incompatibility; if it is needed at a later date, the serializer is a little bit closer to having it work now (the only table we don't deal with is the types table, and that's because the upcoming CTFv4 changes are likely to make major changes to the way that table is represented internally, so adding more code that depends on its current form seems like a bad idea). There is a new testcase that tests much of this, in particular that modification of existing types is still banned and that you can add new ones and chase them without error. libctf/ * ctf-impl.h (struct ctf_dict.ctf_symhash): Split into... (ctf_dict.ctf_symhash_func): ... this and... (ctf_dict.ctf_symhash_objt): ... this. (ctf_dict.ctf_stypes): New, counts static types. (LCTF_INDEX_TO_TYPEPTR): Use it instead of CTF_RDWR. (LCTF_RDWR): Deleted. (LCTF_DIRTY): Renumbered. (LCTF_LINKING): Likewise. (ctf_lookup_variable_here): New. (ctf_lookup_by_sym_or_name): Likewise. (ctf_symbol_next_static): Likewise. (ctf_add_variable_forced): Likewise. (ctf_add_funcobjt_sym_forced): Likewise. (ctf_simple_open_internal): Adjust. (ctf_bufopen_internal): Likewise. * ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with. (ctf_create): Migrate a bunch of initializations into bufopen. Force recreation of name tables. Do not forcibly override the model, let ctf_bufopen do it. (ctf_static_type): New. (ctf_update): Drop LCTF_RDWR check. (ctf_dynamic_type): Likewise. (ctf_add_function): Likewise. (ctf_add_type_internal): Likewise. (ctf_rollback): Check ctf_stypes, not LCTF_RDWR. (ctf_set_array): Likewise. (ctf_add_struct_sized): Likewise. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enumerator): Likewise (only on the target dict). (ctf_add_member_offset): Likewise. (ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types with colliding names. (ctf_add_forward): Note safety under the new rules. (ctf_add_variable): Split all but the existence check into... (ctf_add_variable_forced): ... this new function. (ctf_add_funcobjt_sym): Likewise... (ctf_add_funcobjt_sym_forced): ... for this new function. * ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts with any stypes. (ctf_link_add_strtab): Likewise. (ctf_link_shuffle_syms): Likewise. (ctf_link_intern_extern_string): Note pre-existing prohibition. * ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check. (ctf_lookup_variable): Split out looking in a dict but not its parent into... (ctf_lookup_variable_here): ... this new function. (ctf_lookup_symbol_idx): Track whether looking up a function or object: cache them separately. (ctf_symbol_next): Split out looking in non-dynamic symtypetab entries to... (ctf_symbol_next_static): ... this new function. Don't get confused by the simultaneous presence of static and dynamic symtypetab entries. (ctf_try_lookup_indexed): Don't waste time looking up symbols by index before there can be any idea how symbols are numbered. (ctf_lookup_by_sym_or_name): Distinguish between function and data object lookups. Drop LCTF_RDWR. (ctf_lookup_by_symbol): Adjust. (ctf_lookup_by_symbol_name): Likewise. * ctf-open.c (init_types): Rename to... (init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes. (ctf_simple_open): Drop writable arg. (ctf_simple_open_internal): Likewise. (ctf_bufopen): Likewise. (ctf_bufopen_internal): Populate fields only used for writable dicts. Drop LCTF_RDWR. (ctf_dict_close): Cater for symhash cache split. * ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR. * ctf-types.c (ctf_variable_next): Drop LCTF_RDWR. * testsuite/libctf-lookup/add-to-opened*: New test.	2024-04-19 16:14:46 +01:00
Nick Alcock	54a0219150	libctf: remove static/dynamic name lookup distinction libctf internally maintains a set of hash tables for type name lookups, one for each valid C type namespace (struct, union, enum, and everything else). Or, rather, it maintains two sets of hash tables: one, a ctf_hash , is meant for lookups in ctf_(buf)open()ed dicts with fixed content; the other, a ctf_dynhash , is meant for lookups in ctf_create()d dicts. This distinction was somewhat valuable in the far pre-binutils past when two different hashtable implementations were used (one expanding, the other fixed-size), but those days are long gone: the hash table implementations are almost identical, both wrappers around the libiberty hashtab. The ctf_dynhash has many more capabilities than the ctf_hash (iteration, deletion, etc etc) and has no downsides other than starting at a fixed, arbitrary small size. That limitation is easy to lift (via a new ctf_dynhash_create_sized()), following which we can throw away nearly all the ctf_hash implementation, and all the code to choose between readable and writable hashtabs; the few convenience functions that are still useful (for insertion of name -> type mappings) can also be generalized a bit so that the extra string verification they do is potentially available to other string lookups as well. (libctf still has two hashtable implementations, ctf_dynhash, above, and ctf_dynset, which is a key-only hashtab that can avoid a great many malloc()s, used for high-volume applications in the deduplicator.) libctf/ * ctf-create.c (ctf_create): Eliminate ctn_writable. (ctf_dtd_insert): Likewise. (ctf_dtd_delete): Likewise. (ctf_rollback): Likewise. (ctf_name_table): Eliminate ctf_names_t. * ctf-hash.c (ctf_dynhash_create): Comment update. Reimplement in terms of... (ctf_dynhash_create_sized): ... this new function. (ctf_hash_create): Remove. (ctf_hash_size): Remove. (ctf_hash_define_type): Remove. (ctf_hash_destroy): Remove. (ctf_hash_lookup_type): Rename to... (ctf_dynhash_lookup_type): ... this. (ctf_hash_insert_type): Rename to... (ctf_dynhash_insert_type): ... this, moving validation to... * ctf-string.c (ctf_strptr_validate): ... this new function. * ctf-impl.h (struct ctf_names): Extirpate. (struct ctf_lookup.ctl_hash): Now a ctf_dynhash_t. (struct ctf_dict): All ctf_names_t fields are now ctf_dynhash_t. (ctf_name_table): Now returns a ctf_dynhash_t. (ctf_lookup_by_rawhash): Remove. (ctf_hash_create): Likewise. (ctf_hash_insert_type): Likewise. (ctf_hash_define_type): Likewise. (ctf_hash_lookup_type): Likewise. (ctf_hash_size): Likewise. (ctf_hash_destroy): Likewise. (ctf_dynhash_create_sized): New. (ctf_dynhash_insert_type): New. (ctf_dynhash_lookup_type): New. (ctf_strptr_validate): New. * ctf-lookup.c (ctf_lookup_by_name_internal): Adapt. * ctf-open.c (init_types): Adapt. (ctf_set_ctl_hashes): Adapt. (ctf_dict_close): Adapt. * ctf-serialize.c (ctf_serialize): Adapt. * ctf-types.c (ctf_lookup_by_rawhash): Remove.	2024-04-19 16:14:46 +01:00
Alan Modra	59497587af	libctf warnings Seen with every compiler I have if using -fno-inline: home/alan/src/binutils-gdb/libctf/ctf-create.c: In function ‘ctf_add_encoded’: /home/alan/src/binutils-gdb/libctf/ctf-create.c:555:3: warning: ‘encoding’ may be used uninitialized [-Wmaybe-uninitialized] 555 \| memcpy (dtd->dtd_vlen, &encoding, sizeof (encoding)); Seen with gcc-4.9 and probably others at lower optimisation levels: home/alan/src/binutils-gdb/libctf/ctf-serialize.c: In function 'symtypetab_density': /home/alan/src/binutils-gdb/libctf/ctf-serialize.c:211:18: warning: 'sym' may be used uninitialized in this function [-Wmaybe-uninitialized] if (max < sym->st_symidx) Seen with gcc-4.5 and probably others at lower optimisation levels: /home/alan/src/binutils-gdb/libctf/ctf-types.c:1649:21: warning: 'tp' may be used uninitialized in this function /home/alan/src/binutils-gdb/libctf/ctf-link.c:765:16: warning: 'parent_i' may be used uninitialized in this function Also with gcc-4.5: In file included from /home/alan/src/binutils-gdb/libctf/ctf-endian.h:25:0, from /home/alan/src/binutils-gdb/libctf/ctf-archive.c:24: /home/alan/src/binutils-gdb/libctf/swap.h:70:0: warning: "_Static_assert" redefined /usr/include/sys/cdefs.h:568:0: note: this is the location of the previous definition swap.h (_Static_assert): Don't define if already defined. * ctf-serialize.c (symtypetab_density): Merge two CTF_SYMTYPETAB_FORCE_INDEXED blocks. * ctf-create.c (ctf_add_encoded): Avoid "encoding" may be used uninitialized warning. * ctf-link.c (ctf_link_deduplicating_open_inputs): Avoid "parent_i" may be used uninitialized warning. * ctf-types.c (ctf_type_rvisit): Avoid "tp" may be used uninitialized warning.	2024-04-17 09:24:36 +09:30
Alan Modra	fd67aa1129	Update year range in copyright notice of binutils files Adds two new external authors to etc/update-copyright.py to cover bfd/ax_tls.m4, and adds gprofng to dirs handled automatically, then updates copyright messages as follows: 1) Update cgen/utils.scm emitted copyrights. 2) Run "etc/update-copyright.py --this-year" with an extra external author I haven't committed, 'Kalray SA.', to cover gas testsuite files (which should have their copyright message removed). 3) Build with --enable-maintainer-mode --enable-cgen-maint=yes. 4) Check out /po/.pot which we don't update frequently.	2024-01-04 22:58:12 +10:30
Alan Modra	d87bef3a7b	Update year range in copyright notice of binutils files The newer update-copyright.py fixes file encoding too, removing cr/lf on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and embedded cr in binutils/testsuite/binutils-all/ar.exp string match.	2023-01-01 21:50:11 +10:30
Nick Alcock	3ec2b3c058	libctf: avoid mingw warning A missing paren led to an intended cast to avoid dependence on the size of size_t in one argument of ctf_err_warn applying to the wrong type by mistake. libctf/ChangeLog: * ctf-serialize.c (ctf_write_mem): Fix cast.	2022-06-21 19:27:15 +01:00
Nick Alcock	faf5e6ace8	libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option libctf has always handled endianness differences by detecting foreign-endian CTF dicts on the input and endian-flipping them: dicts are always written in native endianness. This makes endian-awareness very low overhead, but it means that the foreign-endian code paths almost never get routinely tested, since "make check" usually reads in dicts ld has just written out: only a few corrupted-CTF tests are actually in fixed endianness, and even they only test the foreign- endian code paths when you run make check on a big-endian machine. (And the fix is surely not to add more .s-based tests like that, because they are a nightmare to maintain compared to the C-code-based ones.) To improve on this, add a new environment variable, LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally endian-flip at ctf_write time, so the output is always in the wrong endianness. This then tests the foreign-endian read paths properly at open time. Make this easier by restructuring the writeout code in ctf-serialize.c, which duplicates the maybe-gzip-and-write-out code three times (once for ctf_write_mem, with thresholding, and once each for ctf_compress_write and ctf_write just so those can avoid thresholding and/or compression). Instead, have the latter two call the former with thresholds of 0 or (size_t) -1, respectively. The endian-flipping code itself gains a bit of complexity, because one single endian-flipper (flip_types) was assuming the input to be in foreign-endian form and assuming it could pull things out of the input once they had been flipped and make sense of them. At the cost of a few lines of duplicated initializations, teach it to read before flipping if we're flipping to foreign-endianness instead of away from it. libctf/ * ctf-impl.h (ctf_flip_header): No longer static. (ctf_flip): Likewise. * ctf-open.c (flip_header): Rename to... (ctf_flip_header): ... this, now it is not private to one file. (flip_ctf): Rename... (ctf_flip): ... this too. Add FOREIGN_ENDIAN arg. (flip_types): Likewise. Use it. (ctf_bufopen_internal): Adjust calls. * ctf-serialize.c (ctf_write_mem): Add flip_endian path via a newly-allocated bounce buffer. (ctf_compress_write): Move below ctf_write_mem and reimplement in terms of it. (ctf_write): Likewise. (ctf_gzwrite): Note that this obscure writeout function does not support endian-flipping.	2022-03-23 13:48:32 +00:00
Nick Alcock	203bfa2f6b	include, libctf, ld: extend variable section to contain functions too The CTF variable section is an optional (usually-not-present) section in the CTF dict which contains name -> type mappings corresponding to data symbols that are present in the linker input but not in the output symbol table: the idea is that programs that use their own symbol- resolution mechanisms can use this section to look up the types of symbols they have found using their own mechanism. Because these removed symbols (mostly static variables, functions, etc) all have names that are unlikely to appear in the ELF symtab and because very few programs have their own symbol-resolution mechanisms, a special linker flag (--ctf-variables) is needed to emit this section. Historically, we emitted only removed data symbols into the variable section. This seemed to make sense at the time, but in hindsight it really doesn't: functions are symbols too, and a C program can look them up just like any other type. So extend the variable section so that it contains all static function symbols too (if it is emitted at all), with types of kind CTF_K_FUNCTION. This is a little fiddly. We relied on compiler assistance for data symbols: the compiler simply emits all data symbols twice, once into the symtypetab as an indexed symbol and once into the variable section. Rather than wait for a suitably adjusted compiler that does the same for function symbols, we can pluck unreported function symbols out of the symtab and add them to the variable section ourselves. While we're at it, we do the same with data symbols: this is redundant right now because the compiler does it, but it costs very little time and lets the compiler drop this kludge and save a little space in .o files. include/ * ctf.h: Mention the new things we can see in the variable section. ld/ * testsuite/ld-ctf/data-func-conflicted-vars.d: New test. libctf/ * ctf-link.c (ctf_link_deduplicating_variables): Duplicate symbols into the variable section too. * ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename to... (symtypetab_delete_nonstatics): ... this. Check the funchash when pruning redundant variables. (ctf_symtypetab_sect_sizes): Adjust accordingly. * NEWS: Describe this change.	2022-03-23 13:48:32 +00:00
Alan Modra	a2c5833233	Update year range in copyright notice of binutils files The result of running etc/update-copyright.py --this-year, fixing all the files whose mode is changed by the script, plus a build with --enable-maintainer-mode --enable-cgen-maint=yes, then checking out /po/.pot which we don't update frequently. The copy of cgen was with commit d1dd5fcc38ead reverted as that commit breaks building of bfp opcodes files.	2022-01-02 12:04:28 +10:30
Nick Alcock	86f64bf43f	libctf, serialize: functions with no args have a NULL dtd_vlen Every place that accesses a function's dtd_vlen accesses it only if the number of args is nonzero, except the serializer, which always tries to memcpy it. The number of bytes it memcpys in this case is zero, but it is still undefined behaviour to copy zero bytes from a null pointer. So check for this case explicitly. libctf/ChangeLog 2021-03-25 Nick Alcock <nick.alcock@oracle.com> PR libctf/27628 * ctf-serialize.c (ctf_emit_type_sect): Allow for a NULL vlen in CTF_K_FUNCTION types.	2021-03-25 16:32:48 +00:00
Nick Alcock	08c428aff4	libctf: eliminate dtd_u, part 5: structs / unions Eliminate the dynamic member storage for structs and unions as we have for other dynamic types. This is much like the previous enum elimination, except that structs and unions are the only types for which a full-sized ctf_type_t might be needed. Up to now, this decision has been made in the individual ctf_add_{struct,union}_sized functions and duplicated in ctf_add_member_offset. The vlen machinery lets us simplify this, always allocating a ctf_lmember_t and setting the dtd_data's ctt_size to CTF_LSIZE_SENT: we figure out whether this is really justified and (almost always) repack things down into a ctf_stype_t at ctf_serialize time. This allows us to eliminate the dynamic member paths from the iterators and query functions in ctf-types.c in favour of always using the large-structure vlen stuff for dynamic types (the diff is ugly but that's just because of the volume of reindentation this calls for). This also means the large-structure vlen stuff gets more heavily tested, which is nice because it was an almost totally unused code path before now (it only kicked in for structures of size >4GiB, and how often do you see those?) The only extra complexity here is ctf_add_type. Back in the days of the nondeduplicating linker this was called a ridiculous number of times for countless identical copies of structures: eschewing the repeated lookups of the dtd in ctf_add_member_offset and adding the members directly saved an amazing amount of time. Now the nondeduplicating linker is gone, this is extreme overoptimization: we can rip out the direct addition and use ctf_member_next and ctf_add_member_offset, just like ctf_dedup_emit does. We augment a ctf_add_type test to try adding a self-referential struct, the only thing the ctf_add_type part of this change really perturbs. This completes the elimination of dtd_u. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtu_members>: Remove. <dtd_u>: Likewise. (ctf_dmdef_t): Remove. (struct ctf_next) <u.ctn_dmd>: Remove. * ctf-create.c (INITIAL_VLEN): New, more-or-less arbitrary initial vlen size. (ctf_add_enum): Use it. (ctf_dtd_delete): Do not free the (removed) dmd; remove string refs from the vlen on struct deletion. (ctf_add_struct_sized): Populate the vlen: do it by hand if promoting forwards. Always populate the full-size lsizehi/lsizelo members. (ctf_add_union_sized): Likewise. (ctf_add_member_offset): Set up the vlen rather than the dmd. Expand it as needed, repointing string refs via ctf_str_move_pending. Add the member names as pending strings. Always populate the full-size lsizehi/lsizelo members. (membadd): Remove, folding back into... (ctf_add_type_internal): ... here, adding via an ordinary ctf_add_struct_sized and _next iteration rather than doing everything by hand. * ctf-serialize.c (ctf_copy_smembers): Remove this... (ctf_copy_lmembers): ... and this... (ctf_emit_type_sect): ... folding into here. Figure out if a ctf_stype_t is needed here, not in ctf_add__sized. (ctf_type_sect_size): Figure out the ctf_stype_t stuff the same way here. ctf-types.c (ctf_member_next): Remove the dmd path and always use the vlen. Force large-structure usage for dynamic types. (ctf_type_align): Likewise. (ctf_member_info): Likewise. (ctf_type_rvisit): Likewise. * testsuite/libctf-regression/type-add-unnamed-struct-ctf.c: Add a self-referential type to this test. * testsuite/libctf-regression/type-add-unnamed-struct.c: Adjusted accordingly. * testsuite/libctf-regression/type-add-unnamed-struct.lk: Likewise.	2021-03-18 12:40:40 +00:00
Nick Alcock	77d724a7ec	libctf: eliminate dtd_u, part 4: enums This is the first tricky one, the first complex multi-entry vlen containing strings. To handle this in vlen form, we have to handle pending refs moving around on realloc. We grow vlen regions using a new ctf_grow_vlen function, and iterate through the existing enums every time a grow happens, telling the string machinery the distance between the old and new vlen region and letting it adjust the pending refs accordingly. (This avoids traversing all outstanding refs to find the refs that need adjusting, at the cost of having to traverse one enum: an obvious major performance win.) Addition of enums themselves (and also structs/unions later) is a bit trickier than earlier forms, because the type might be being promoted from a forward, and forwards have no vlen: so we have to spot that and create it if needed. Serialization of enums simplifies down to just telling the string machinery about the string refs; all the enum type-lookup code loses all its dynamic member lookup complexity entirely. A new test is added that iterates over (and gets values of) an enum with enough members to force a round of vlen growth. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_vlen_alloc>: New. (ctf_str_move_pending): Declare. * ctf-string.c (ctf_str_add_ref_internal): Fix error return. (ctf_str_move_pending): New. * ctf-create.c (ctf_grow_vlen): New. (ctf_dtd_delete): Zero out the vlen_alloc after free. Free the vlen later: iterate over it and free enum name refs first. (ctf_add_generic): Populate dtd_vlen_alloc from vlen. (ctf_add_enum): populate the vlen; do it by hand if promoting forwards. (ctf_add_enumerator): Set up the vlen rather than the dmd. Expand it as needed, repointing string refs via ctf_str_move_pending. Add the enumerand names as pending strings. * ctf-serialize.c (ctf_copy_emembers): Remove. (ctf_emit_type_sect): Copy the vlen into place and ref the strings. * ctf-types.c (ctf_enum_next): The dynamic portion now uses the same code as the non-dynamic. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. * testsuite/libctf-lookup/enum-many-ctf.c: New test. * testsuite/libctf-lookup/enum-many.lk: New test.	2021-03-18 12:40:40 +00:00
Nick Alcock	986e9e3aa0	libctf: do not corrupt strings across ctf_serialize The preceding change revealed a new bug: the string table is sorted for better compression, so repeated serialization with type (or member) additions in the middle can move strings around. But every serialization flushes the set of refs (the memory locations that are automatically updated with a final string offset when the strtab is updated), so if we are not to have string offsets go stale, we must do all ref additions within the serialization code (which walks the complete set of types and symbols anyway). Unfortunately, we were adding one ref in another place: the type name in the dynamic type definitions, which has a ref added to it by ctf_add_generic. So adding a type, serializing (via, say, one of the ctf_write functions), adding another type with a name that sorts earlier, and serializing again will corrupt the name of the first type because it no longer had a ref pointing to its dtd entry's name when its string offset was shifted later in the strtab to mae way for the other type. To ensure that we don't miss strings, we also maintain a set of pending refs that will be added later (during serialization), and remove entries from that set when the ref is finally added. We always use ctf_str_add_pending outside ctf-serialize.c, ensure that ctf_serialize adds all strtab offsets as refs (even those in the dtds) on every serialization, and mandate that no refs are live on entry to ctf_serialize and that all pending refs are gone before strtab finalization. (Of necessity ctf_serialize has to traverse all strtab offsets in the dtds in order to serialize them, so adding them as refs at the same time is easy.) (Note that we still can't erase unused atoms when we roll back, though we can erase unused refs: members and enums are still not removed by rollbacks and might reference strings added after the snapshot.) libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-hash.c (ctf_dynset_elements): New. * ctf-impl.h (ctf_dynset_elements): Declare it. (ctf_str_add_pending): Likewise. (ctf_dict_t) <ctf_str_pending_ref>: New, set of refs that must be added during serialization. * ctf-string.c (ctf_str_create_atoms): Initialize it. (CTF_STR_ADD_REF): New flag. (CTF_STR_MAKE_PROVISIONAL): Likewise. (CTF_STR_PENDING_REF): Likewise. (ctf_str_add_ref_internal): Take a flags word rather than int params. Populate, and clear out, ctf_str_pending_ref. (ctf_str_add): Adjust accordingly. (ctf_str_add_external): Likewise. (ctf_str_add_pending): New. (ctf_str_remove_ref): Also remove the potential ref if it is a pending ref. * ctf-serialize.c (ctf_serialize): Prohibit addition of strings with ctf_str_add_ref before serialization. Ensure that the ctf_str_pending_ref set is empty before strtab finalization. (ctf_emit_type_sect): Add a ref to the ctt_name. * ctf-create.c (ctf_add_generic): Add the ctt_name as a pending ref. * testsuite/libctf-writable/reserialize-strtab-corruption.*: New test.	2021-03-18 12:40:40 +00:00
Nick Alcock	2a05d50e90	libctf: don't lose track of all valid types upon serialization One pattern which is rarely done in libctf but which is meant to work is this: ctf_create(); ctf_add_(); // add stuff ctf_type_() // look stuff up ctf_write_(); ctf_add_(); // should still work ctf_type_() // so should this ctf_write_(); // and this i.e., writing out a dict should not break it and you should be able to do everything you could do with it before, including writing it out again. Unfortunately this has been broken for a while because the field which indicates the maximum valid type ID was not preserved across serialization: so type additions after serialization would overwrite types (obviously disastrous) and type lookups would just fail. Fix trivial. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-serialize.c (ctf_serialize): Preserve ctf_typemax across serialization.	2021-03-18 12:40:40 +00:00
Nick Alcock	81982d20fa	libctf: eliminate dtd_u, part 3: functions One more member vanishes from the dtd_u, leaving only the member for struct/union/enum members. There's not much to do here, since as of commit `afd78bd6f0` we use the same representation (type sizes, etc) in the dtu_argv as we will use in the final vlen, with one exception: the vlen has alignment padding, and the dtu_argv did not. Simplify things by adding suitable padding in both cases. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_argv>: Remove. * ctf-create.c (ctf_dtd_delete): No longer free it. (ctf_add_function): Use the dtd_vlen, not dtu_argv. Properly align. * ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen. * ctf-types.c (ctf_func_type_info): Just use the vlen. (ctf_func_type_args): Likewise.	2021-03-18 12:40:40 +00:00
Nick Alcock	534444b1ee	libctf: eliminate dtd_u, part 2: arrays This is even simpler than ints, floats and slices, with the only extra complication being the need to manually transfer the array parameter in the rarely-used function ctf_set_array. (Arrays are unique in libctf in that they can be modified post facto, not just created and appended to. I'm not sure why they got this exemption, but it's easy to maintain.) libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_arr>: Remove. * ctf-create.c (ctf_add_array): Use the dtd_vlen, not dtu_arr. (ctf_set_array): Likewise. * ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen. * ctf-types.c (ctf_array_info): Just use the vlen.	2021-03-18 12:40:40 +00:00
Nick Alcock	7879dd88ef	libctf: eliminate dtd_u, part 1: int/float/slice This series eliminates a lot of special-case code to handle dynamic types (types added to writable dicts and not yet serialized). Historically, when such types have variable-length data in their final CTF representations, libctf has always worked by adding such types to a special union (ctf_dtdef_t.dtd_u) in the dynamic type definition structure, then picking the members out of this structure at serialization time and packing them into their final form. This has the advantage that the ctf_add_* code doesn't need to know anything about the final CTF representation, but the significant disadvantage that all code that looks up types in any way needs two code paths, one for dynamic types, one for all others. Historically libctf "handled" this by not supporting most type lookups on dynamic types at all until ctf_update was called to do a complete reserialization of the entire dict (it didn't emit an error, it just emitted wrong results). Since commit `676c3ecbad`, which eliminated ctf_update in favour of the internal-only ctf_serialize function, all the type-lookup paths grew an extra branch to handle dynamic types. We can eliminate this branch again by dropping the dtd_u stuff and simply writing out the vlen in (close to) its final form at ctf_add_* time: type lookup for types using this approach is then identical for types in writable dicts and types that are in read-only ones, and serialization is also simplified (we just need to write out the vlen we already created). The only complexity lies in type kinds for which multiple vlen representations are valid depending on properties of the type, e.g. structures. But we can start simple, adjusting ints, floats, and slices to work this way, and leaving everything else as is. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_enc>: Remove. <dtd_u.dtu_slice>: Likewise. <dtd_vlen>: New. * ctf-create.c (ctf_add_generic): Perhaps allocate it. All callers adjusted. (ctf_dtd_delete): Free it. (ctf_add_slice): Use the dtd_vlen, not dtu_enc. (ctf_add_encoded): Likewise. Assert that this must be an int or float. * ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen. * ctf-dedup.c (ctf_dedup_rhash_type): Use the dtd_vlen, not dtu_slice. * ctf-types.c (ctf_type_reference): Likewise. (ctf_type_encoding): Remove most dynamic-type-specific code: just get the vlen from the right place. Report failure to look up the underlying type's encoding.	2021-03-18 12:40:36 +00:00
Nick Alcock	b9a964318a	libctf: split up ctf_serialize ctf_serialize and its various pieces may be split out into a separate file now, but ctf_serialize is still far too long and disordered, mixing header initialization, sizing of multiple CTF sections, sorting and emission of multiple CTF sections, strtab construction and ctf_dict_t copying into a single ugly organically-grown mess. Fix the worst of this by migrating all section sizing and emission into separate functions, two per section (or class of section in the case of the symtypetabs). Only the variable section is now sized and emitted directly in ctf_serialize (because it only takes about three lines to do so). The section sizes themselves are still maintained by ctf_serialize so that it can work out the header offsets, but ctf_symtypetab_sect_sizes and ctf_emit_symtypetab_sects share a lot of extra state: migrate that into a shared structure, emit_symtypetab_state_t. (Test results unchanged.) libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> * ctf-serialize.c: General reshuffling, and... (emit_symtypetab_state_t): New, migrated from local variables in ctf_serialize. (ctf_serialize): Split out most section sizing and emission. (ctf_symtypetab_sect_sizes): New (split out). (ctf_emit_symtypetab_sects): Likewise. (ctf_type_sect_size): Likewise. (ctf_emit_type_sect): Likewise.	2021-03-18 12:37:55 +00:00
Nick Alcock	bf4c3185a5	libctf: split serialization and file writeout into its own file The code to serialize CTF dicts just gets bigger and bigger as the dictionary's complexity grows: adding symtypetabs almost doubled it on its own. It's long past time to split this out into its own source file, accompanied by the functions that do the actual writeout. This leaves ctf-create.c populated exclusively by functions related to actual writable dict creation (ctf_add_, ctf_create etc), and leaves both files a much more reasonable size. libctf/ChangeLog 2021-03-18 Nick Alcock <nick.alcock@oracle.com> ctf-create.c (symtypetab_delete_nonstatic_vars): Move into ctf-serialize.c. (ctf_symtab_skippable): Likewise. (CTF_SYMTYPETAB_EMIT_FUNCTION): Likewise. (CTF_SYMTYPETAB_EMIT_PAD): Likewise. (CTF_SYMTYPETAB_FORCE_INDEXED): Likewise. (symtypetab_density): Likewise. (emit_symtypetab): Likewise. (emit_symtypetab_index): Likewise. (ctf_copy_smembers): Likewise. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_sort_var): Likewise. (ctf_serialize): Likewise. (ctf_gzwrite): Likewise. (ctf_compress_write): Likewise. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-serialize.c: New file. * Makefile.am (libctf_nobfd_la_SOURCES): Add it. * Makefile.in: Regenerate.	2021-03-18 12:37:53 +00:00

24 Commits