2019-04-24 01:55:27 +08:00
|
|
|
/* Implementation header.
|
2020-01-01 15:57:01 +08:00
|
|
|
Copyright (C) 2019-2020 Free Software Foundation, Inc.
|
2019-04-24 01:55:27 +08:00
|
|
|
|
|
|
|
This file is part of libctf.
|
|
|
|
|
|
|
|
libctf is free software; you can redistribute it and/or modify it under
|
|
|
|
the terms of the GNU General Public License as published by the Free
|
|
|
|
Software Foundation; either version 3, or (at your option) any later
|
|
|
|
version.
|
|
|
|
|
|
|
|
This program is distributed in the hope that it will be useful, but
|
|
|
|
WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
|
|
|
|
See the GNU General Public License for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
|
|
along with this program; see the file COPYING. If not see
|
|
|
|
<http://www.gnu.org/licenses/>. */
|
|
|
|
|
|
|
|
#ifndef _CTF_IMPL_H
|
|
|
|
#define _CTF_IMPL_H
|
|
|
|
|
|
|
|
#include "config.h"
|
libctf: fix a number of build problems found on Solaris and NetBSD
- Use of nonportable <endian.h>
- Use of qsort_r
- Use of zlib without appropriate magic to pull in the binutils zlib
- Use of off64_t without checking (fixed by dropping the unused fields
that need off64_t entirely)
- signedness problems due to long being too short a type on 32-bit
platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be
used only for functions that return ctf_id_t
- One lingering use of bzero() and of <sys/errno.h>
All fixed, using code from gnulib where possible.
Relatedly, set cts_size in a couple of places it was missed
(string table and symbol table loading upon ctf_bfdopen()).
binutils/
* objdump.c (make_ctfsect): Drop cts_type, cts_flags, and
cts_offset.
* readelf.c (shdr_to_ctf_sect): Likewise.
include/
* ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset.
(ctf_id_t): This is now an unsigned type.
(CTF_ERR): Cast it to ctf_id_t. Note that it should only be used
for ctf_id_t-returning functions.
libctf/
* Makefile.am (ZLIB): New.
(ZLIBINC): Likewise.
(AM_CFLAGS): Use them.
(libctf_a_LIBADD): New, for LIBOBJS.
* configure.ac: Check for zlib, endian.h, and qsort_r.
* ctf-endian.h: New, providing htole64 and le64toh.
* swap.h: Code style fixes.
(bswap_identity_64): New.
* qsort_r.c: New, from gnulib (with one added #include).
* ctf-decls.h: New, providing a conditional qsort_r declaration,
and unconditional definitions of MIN and MAX.
* ctf-impl.h: Use it. Do not use <sys/errno.h>.
(ctf_set_errno): Now returns unsigned long.
* ctf-util.c (ctf_set_errno): Adjust here too.
* ctf-archive.c: Use ctf-endian.h.
(ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type,
cts_flags and cts_offset.
(ctf_arc_write): Drop debugging dependent on the size of off_t.
* ctf-create.c: Provide a definition of roundup if not defined.
(ctf_create): Drop cts_type, cts_flags and cts_offset.
(ctf_add_reftype): Do not check if type IDs are below zero.
(ctf_add_slice): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_member_offset): Cast error-returning ssize_t's to size_t
when known error-free. Drop CTF_ERR usage for functions returning
int.
(ctf_add_member_encoded): Drop CTF_ERR usage for functions returning
int.
(ctf_add_variable): Likewise.
(enumcmp): Likewise.
(enumadd): Likewise.
(membcmp): Likewise.
(ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t
when known error-free.
* ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions
returning int: use CTF_ERR for functions returning ctf_type_id.
(ctf_dump_label): Likewise.
(ctf_dump_objts): Likewise.
* ctf-labels.c (ctf_label_topmost): Likewise.
(ctf_label_iter): Likewise.
(ctf_label_info): Likewise.
* ctf-lookup.c (ctf_func_args): Likewise.
* ctf-open.c (upgrade_types): Cast to size_t where appropriate.
(ctf_bufopen): Likewise. Use zlib types as needed.
* ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions
returning int.
(ctf_enum_iter): Likewise.
(ctf_type_size): Likewise.
(ctf_type_align): Likewise. Cast to size_t where appropriate.
(ctf_type_kind_unsliced): Likewise.
(ctf_type_kind): Likewise.
(ctf_type_encoding): Likewise.
(ctf_member_info): Likewise.
(ctf_array_info): Likewise.
(ctf_enum_value): Likewise.
(ctf_type_rvisit): Likewise.
* ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and
cts_offset.
(ctf_simple_open): Likewise.
(ctf_bfdopen_ctfsect): Likewise. Set cts_size properly.
* Makefile.in: Regenerate.
* aclocal.m4: Likewise.
* config.h: Likewise.
* configure: Likewise.
2019-05-31 17:10:51 +08:00
|
|
|
#include <errno.h>
|
2019-07-14 04:45:55 +08:00
|
|
|
#include <sys/param.h>
|
libctf: fix a number of build problems found on Solaris and NetBSD
- Use of nonportable <endian.h>
- Use of qsort_r
- Use of zlib without appropriate magic to pull in the binutils zlib
- Use of off64_t without checking (fixed by dropping the unused fields
that need off64_t entirely)
- signedness problems due to long being too short a type on 32-bit
platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be
used only for functions that return ctf_id_t
- One lingering use of bzero() and of <sys/errno.h>
All fixed, using code from gnulib where possible.
Relatedly, set cts_size in a couple of places it was missed
(string table and symbol table loading upon ctf_bfdopen()).
binutils/
* objdump.c (make_ctfsect): Drop cts_type, cts_flags, and
cts_offset.
* readelf.c (shdr_to_ctf_sect): Likewise.
include/
* ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset.
(ctf_id_t): This is now an unsigned type.
(CTF_ERR): Cast it to ctf_id_t. Note that it should only be used
for ctf_id_t-returning functions.
libctf/
* Makefile.am (ZLIB): New.
(ZLIBINC): Likewise.
(AM_CFLAGS): Use them.
(libctf_a_LIBADD): New, for LIBOBJS.
* configure.ac: Check for zlib, endian.h, and qsort_r.
* ctf-endian.h: New, providing htole64 and le64toh.
* swap.h: Code style fixes.
(bswap_identity_64): New.
* qsort_r.c: New, from gnulib (with one added #include).
* ctf-decls.h: New, providing a conditional qsort_r declaration,
and unconditional definitions of MIN and MAX.
* ctf-impl.h: Use it. Do not use <sys/errno.h>.
(ctf_set_errno): Now returns unsigned long.
* ctf-util.c (ctf_set_errno): Adjust here too.
* ctf-archive.c: Use ctf-endian.h.
(ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type,
cts_flags and cts_offset.
(ctf_arc_write): Drop debugging dependent on the size of off_t.
* ctf-create.c: Provide a definition of roundup if not defined.
(ctf_create): Drop cts_type, cts_flags and cts_offset.
(ctf_add_reftype): Do not check if type IDs are below zero.
(ctf_add_slice): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_member_offset): Cast error-returning ssize_t's to size_t
when known error-free. Drop CTF_ERR usage for functions returning
int.
(ctf_add_member_encoded): Drop CTF_ERR usage for functions returning
int.
(ctf_add_variable): Likewise.
(enumcmp): Likewise.
(enumadd): Likewise.
(membcmp): Likewise.
(ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t
when known error-free.
* ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions
returning int: use CTF_ERR for functions returning ctf_type_id.
(ctf_dump_label): Likewise.
(ctf_dump_objts): Likewise.
* ctf-labels.c (ctf_label_topmost): Likewise.
(ctf_label_iter): Likewise.
(ctf_label_info): Likewise.
* ctf-lookup.c (ctf_func_args): Likewise.
* ctf-open.c (upgrade_types): Cast to size_t where appropriate.
(ctf_bufopen): Likewise. Use zlib types as needed.
* ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions
returning int.
(ctf_enum_iter): Likewise.
(ctf_type_size): Likewise.
(ctf_type_align): Likewise. Cast to size_t where appropriate.
(ctf_type_kind_unsliced): Likewise.
(ctf_type_kind): Likewise.
(ctf_type_encoding): Likewise.
(ctf_member_info): Likewise.
(ctf_array_info): Likewise.
(ctf_enum_value): Likewise.
(ctf_type_rvisit): Likewise.
* ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and
cts_offset.
(ctf_simple_open): Likewise.
(ctf_bfdopen_ctfsect): Likewise. Set cts_size properly.
* Makefile.in: Regenerate.
* aclocal.m4: Likewise.
* config.h: Likewise.
* configure: Likewise.
2019-05-31 17:10:51 +08:00
|
|
|
#include "ctf-decls.h"
|
2019-04-24 01:55:27 +08:00
|
|
|
#include <ctf-api.h>
|
|
|
|
#include <sys/types.h>
|
2019-04-24 04:45:30 +08:00
|
|
|
#include <stdlib.h>
|
|
|
|
#include <stdarg.h>
|
|
|
|
#include <stdio.h>
|
|
|
|
#include <stdint.h>
|
|
|
|
#include <limits.h>
|
|
|
|
#include <ctype.h>
|
|
|
|
#include <elf.h>
|
libctf: ELF file opening via BFD
These functions let you open an ELF file with a customarily-named CTF
section in it, automatically opening the CTF file or archive and
associating the symbol and string tables in the ELF file with the CTF
container, so that you can look up the types of symbols in the ELF file
via ctf_lookup_by_symbol(), and so that strings can be shared between
the ELF file and CTF container, to save space.
It uses BFD machinery to do so. This has now been lightly tested and
seems to work. In particular, if you already have a bfd you can pass
it in to ctf_bfdopen(), and if you want a bfd made for you you can
call ctf_open() or ctf_fdopen(), optionally specifying a target (or
try once without a target and then again with one if you get
ECTF_BFD_AMBIGUOUS back).
We use a forward declaration for the struct bfd in ctf-api.h, so that
ctf-api.h users are not required to pull in <bfd.h>. (This is mostly
for the sake of readelf.)
libctf/
* ctf-open-bfd.c: New file.
* ctf-open.c (ctf_close): New.
* ctf-impl.h: Include bfd.h.
(ctf_file): New members ctf_data_mmapped, ctf_data_mmapped_len.
(ctf_archive_internal): New members ctfi_abfd, ctfi_data,
ctfi_bfd_close.
(ctf_bfdopen_ctfsect): New declaration.
(_CTF_SECTION): likewise.
include/
* ctf-api.h (struct bfd): New forward.
(ctf_fdopen): New.
(ctf_bfdopen): Likewise.
(ctf_open): Likewise.
(ctf_arc_open): Likewise.
2019-04-24 17:46:39 +08:00
|
|
|
#include <bfd.h>
|
2019-04-24 01:55:27 +08:00
|
|
|
|
|
|
|
#ifdef __cplusplus
|
|
|
|
extern "C"
|
|
|
|
{
|
|
|
|
#endif
|
|
|
|
|
|
|
|
/* Compiler attributes. */
|
|
|
|
|
|
|
|
#if defined (__GNUC__)
|
|
|
|
|
|
|
|
/* GCC. We assume that all compilers claiming to be GCC support sufficiently
|
|
|
|
many GCC attributes that the code below works. If some non-GCC compilers
|
|
|
|
masquerading as GCC in fact do not implement these attributes, version checks
|
|
|
|
may be required. */
|
|
|
|
|
|
|
|
/* We use the _libctf_*_ pattern to avoid clashes with any future attribute
|
|
|
|
macros glibc may introduce, which have names of the pattern
|
|
|
|
__attribute_blah__. */
|
|
|
|
|
|
|
|
#define _libctf_printflike_(string_index,first_to_check) \
|
|
|
|
__attribute__ ((__format__ (__printf__, (string_index), (first_to_check))))
|
|
|
|
#define _libctf_unlikely_(x) __builtin_expect ((x), 0)
|
|
|
|
#define _libctf_unused_ __attribute__ ((__unused__))
|
|
|
|
#define _libctf_malloc_ __attribute__((__malloc__))
|
|
|
|
|
|
|
|
#endif
|
|
|
|
|
2019-04-24 05:12:16 +08:00
|
|
|
/* libctf in-memory state. */
|
|
|
|
|
|
|
|
typedef struct ctf_fixed_hash ctf_hash_t; /* Private to ctf-hash.c. */
|
|
|
|
typedef struct ctf_dynhash ctf_dynhash_t; /* Private to ctf-hash.c. */
|
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
typedef struct ctf_strs
|
|
|
|
{
|
|
|
|
const char *cts_strs; /* Base address of string table. */
|
|
|
|
size_t cts_len; /* Size of string table in bytes. */
|
|
|
|
} ctf_strs_t;
|
|
|
|
|
libctf: deduplicate and sort the string table
ctf.h states:
> [...] the CTF string table does not contain any duplicated strings.
Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer. There is no global
view of the strings and no deduplication.
Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.
Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).
Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go. The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'. Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.
No change in externally-visible API or file format (other than the
reduction in pointless redundancy).
libctf/
* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
struct ctf_strs.
(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
(struct ctf_str_atom): New, disambiguated single string.
(struct ctf_str_atom_ref): New, points to some other location that
references this string's offset.
(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
Remove member ctf_dtvstrlen: we no longer track the total strlen
as we add strings.
(ctf_str_create_atoms): Declare new function in ctf-string.c.
(ctf_str_free_atoms): Likewise.
(ctf_str_add): Likewise.
(ctf_str_add_ref): Likewise.
(ctf_str_purge_refs): Likewise.
(ctf_str_write_strtab): Likewise.
(ctf_realloc): Declare new function in ctf-util.c.
* ctf-open.c (ctf_bufopen): Create the atoms table.
(ctf_file_close): Destroy it.
* ctf-create.c (ctf_update): Copy-and-free it on update. No longer
special-case the position of the parname string. Construct the
strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
rest of each buffer element is constructed, not via open-coding:
realloc the CTF buffer and append the strtab to it. No longer
maintain ctf_dtvstrlen. Sort the variable entry table later, after
strtab construction.
(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
after buffer element construction instead.
(ctf_copy_lmembers): Likewise.
(ctf_copy_emembers): Likewise.
(ctf_create): No longer maintain the ctf_dtvstrlen.
(ctf_dtd_delete): Likewise.
(ctf_dvd_delete): Likewise.
(ctf_add_generic): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable): Likewise.
(membadd): Likewise.
* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
if there are active ctf_str_num_refs.
(ctf_strraw): Move to ctf-string.c.
(ctf_strptr): Likewise.
* ctf-string.c: New file, strtab manipulation.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
|
|
|
typedef struct ctf_strs_writable
|
|
|
|
{
|
|
|
|
char *cts_strs; /* Base address of string table. */
|
|
|
|
size_t cts_len; /* Size of string table in bytes. */
|
|
|
|
} ctf_strs_writable_t;
|
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
typedef struct ctf_dmodel
|
|
|
|
{
|
|
|
|
const char *ctd_name; /* Data model name. */
|
|
|
|
int ctd_code; /* Data model code. */
|
|
|
|
size_t ctd_pointer; /* Size of void * in bytes. */
|
|
|
|
size_t ctd_char; /* Size of char in bytes. */
|
|
|
|
size_t ctd_short; /* Size of short in bytes. */
|
|
|
|
size_t ctd_int; /* Size of int in bytes. */
|
|
|
|
size_t ctd_long; /* Size of long in bytes. */
|
|
|
|
} ctf_dmodel_t;
|
|
|
|
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
typedef struct ctf_names
|
|
|
|
{
|
|
|
|
ctf_hash_t *ctn_readonly; /* Hash table when readonly. */
|
|
|
|
ctf_dynhash_t *ctn_writable; /* Hash table when writable. */
|
|
|
|
} ctf_names_t;
|
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
typedef struct ctf_lookup
|
|
|
|
{
|
|
|
|
const char *ctl_prefix; /* String prefix for this lookup. */
|
|
|
|
size_t ctl_len; /* Length of prefix string in bytes. */
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
ctf_names_t *ctl_hash; /* Pointer to hash table for lookup. */
|
2019-04-24 05:24:13 +08:00
|
|
|
} ctf_lookup_t;
|
|
|
|
|
|
|
|
typedef struct ctf_fileops
|
|
|
|
{
|
|
|
|
uint32_t (*ctfo_get_kind) (uint32_t);
|
|
|
|
uint32_t (*ctfo_get_root) (uint32_t);
|
|
|
|
uint32_t (*ctfo_get_vlen) (uint32_t);
|
|
|
|
ssize_t (*ctfo_get_ctt_size) (const ctf_file_t *, const ctf_type_t *,
|
|
|
|
ssize_t *, ssize_t *);
|
|
|
|
ssize_t (*ctfo_get_vbytes) (unsigned short, ssize_t, size_t);
|
|
|
|
} ctf_fileops_t;
|
|
|
|
|
2019-04-24 04:45:30 +08:00
|
|
|
typedef struct ctf_list
|
|
|
|
{
|
|
|
|
struct ctf_list *l_prev; /* Previous pointer or tail pointer. */
|
|
|
|
struct ctf_list *l_next; /* Next pointer or head pointer. */
|
|
|
|
} ctf_list_t;
|
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
typedef enum
|
|
|
|
{
|
|
|
|
CTF_PREC_BASE,
|
|
|
|
CTF_PREC_POINTER,
|
|
|
|
CTF_PREC_ARRAY,
|
|
|
|
CTF_PREC_FUNCTION,
|
|
|
|
CTF_PREC_MAX
|
|
|
|
} ctf_decl_prec_t;
|
|
|
|
|
|
|
|
typedef struct ctf_decl_node
|
|
|
|
{
|
|
|
|
ctf_list_t cd_list; /* Linked list pointers. */
|
|
|
|
ctf_id_t cd_type; /* Type identifier. */
|
|
|
|
uint32_t cd_kind; /* Type kind. */
|
|
|
|
uint32_t cd_n; /* Type dimension if array. */
|
|
|
|
} ctf_decl_node_t;
|
|
|
|
|
|
|
|
typedef struct ctf_decl
|
|
|
|
{
|
|
|
|
ctf_list_t cd_nodes[CTF_PREC_MAX]; /* Declaration node stacks. */
|
|
|
|
int cd_order[CTF_PREC_MAX]; /* Storage order of decls. */
|
|
|
|
ctf_decl_prec_t cd_qualp; /* Qualifier precision. */
|
|
|
|
ctf_decl_prec_t cd_ordp; /* Ordered precision. */
|
|
|
|
char *cd_buf; /* Buffer for output. */
|
|
|
|
int cd_err; /* Saved error value. */
|
|
|
|
int cd_enomem; /* Nonzero if OOM during printing. */
|
|
|
|
} ctf_decl_t;
|
|
|
|
|
|
|
|
typedef struct ctf_dmdef
|
|
|
|
{
|
|
|
|
ctf_list_t dmd_list; /* List forward/back pointers. */
|
|
|
|
char *dmd_name; /* Name of this member. */
|
|
|
|
ctf_id_t dmd_type; /* Type of this member (for sou). */
|
|
|
|
unsigned long dmd_offset; /* Offset of this member in bits (for sou). */
|
|
|
|
int dmd_value; /* Value of this member (for enum). */
|
|
|
|
} ctf_dmdef_t;
|
|
|
|
|
|
|
|
typedef struct ctf_dtdef
|
|
|
|
{
|
|
|
|
ctf_list_t dtd_list; /* List forward/back pointers. */
|
|
|
|
ctf_id_t dtd_type; /* Type identifier for this definition. */
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
ctf_type_t dtd_data; /* Type node, including name. */
|
2019-04-24 05:24:13 +08:00
|
|
|
union
|
|
|
|
{
|
|
|
|
ctf_list_t dtu_members; /* struct, union, or enum */
|
|
|
|
ctf_arinfo_t dtu_arr; /* array */
|
|
|
|
ctf_encoding_t dtu_enc; /* integer or float */
|
|
|
|
ctf_id_t *dtu_argv; /* function */
|
|
|
|
ctf_slice_t dtu_slice; /* slice */
|
|
|
|
} dtd_u;
|
|
|
|
} ctf_dtdef_t;
|
|
|
|
|
|
|
|
typedef struct ctf_dvdef
|
|
|
|
{
|
|
|
|
ctf_list_t dvd_list; /* List forward/back pointers. */
|
|
|
|
char *dvd_name; /* Name associated with variable. */
|
|
|
|
ctf_id_t dvd_type; /* Type of variable. */
|
|
|
|
unsigned long dvd_snapshots; /* Snapshot count when inserted. */
|
|
|
|
} ctf_dvdef_t;
|
|
|
|
|
|
|
|
typedef struct ctf_bundle
|
|
|
|
{
|
|
|
|
ctf_file_t *ctb_file; /* CTF container handle. */
|
|
|
|
ctf_id_t ctb_type; /* CTF type identifier. */
|
|
|
|
ctf_dtdef_t *ctb_dtd; /* CTF dynamic type definition (if any). */
|
|
|
|
} ctf_bundle_t;
|
|
|
|
|
libctf: deduplicate and sort the string table
ctf.h states:
> [...] the CTF string table does not contain any duplicated strings.
Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer. There is no global
view of the strings and no deduplication.
Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.
Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).
Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go. The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'. Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.
No change in externally-visible API or file format (other than the
reduction in pointless redundancy).
libctf/
* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
struct ctf_strs.
(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
(struct ctf_str_atom): New, disambiguated single string.
(struct ctf_str_atom_ref): New, points to some other location that
references this string's offset.
(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
Remove member ctf_dtvstrlen: we no longer track the total strlen
as we add strings.
(ctf_str_create_atoms): Declare new function in ctf-string.c.
(ctf_str_free_atoms): Likewise.
(ctf_str_add): Likewise.
(ctf_str_add_ref): Likewise.
(ctf_str_purge_refs): Likewise.
(ctf_str_write_strtab): Likewise.
(ctf_realloc): Declare new function in ctf-util.c.
* ctf-open.c (ctf_bufopen): Create the atoms table.
(ctf_file_close): Destroy it.
* ctf-create.c (ctf_update): Copy-and-free it on update. No longer
special-case the position of the parname string. Construct the
strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
rest of each buffer element is constructed, not via open-coding:
realloc the CTF buffer and append the strtab to it. No longer
maintain ctf_dtvstrlen. Sort the variable entry table later, after
strtab construction.
(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
after buffer element construction instead.
(ctf_copy_lmembers): Likewise.
(ctf_copy_emembers): Likewise.
(ctf_create): No longer maintain the ctf_dtvstrlen.
(ctf_dtd_delete): Likewise.
(ctf_dvd_delete): Likewise.
(ctf_add_generic): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable): Likewise.
(membadd): Likewise.
* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
if there are active ctf_str_num_refs.
(ctf_strraw): Move to ctf-string.c.
(ctf_strptr): Likewise.
* ctf-string.c: New file, strtab manipulation.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
|
|
|
/* Atoms associate strings with a list of the CTF items that reference that
|
|
|
|
string, so that ctf_update() can instantiate all the strings using the
|
|
|
|
ctf_str_atoms and then reassociate them with the real string later.
|
|
|
|
|
|
|
|
Strings can be interned into ctf_str_atom without having refs associated
|
|
|
|
with them, for values that are returned to callers, etc. Items are only
|
|
|
|
removed from this table on ctf_close(), but on every ctf_update(), all the
|
|
|
|
csa_refs in all entries are purged. */
|
|
|
|
|
|
|
|
typedef struct ctf_str_atom
|
|
|
|
{
|
|
|
|
const char *csa_str; /* Backpointer to string (hash key). */
|
|
|
|
ctf_list_t csa_refs; /* This string's refs. */
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
uint32_t csa_offset; /* Strtab offset, if any. */
|
|
|
|
uint32_t csa_external_offset; /* External strtab offset, if any. */
|
libctf: deduplicate and sort the string table
ctf.h states:
> [...] the CTF string table does not contain any duplicated strings.
Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer. There is no global
view of the strings and no deduplication.
Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.
Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).
Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go. The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'. Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.
No change in externally-visible API or file format (other than the
reduction in pointless redundancy).
libctf/
* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
struct ctf_strs.
(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
(struct ctf_str_atom): New, disambiguated single string.
(struct ctf_str_atom_ref): New, points to some other location that
references this string's offset.
(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
Remove member ctf_dtvstrlen: we no longer track the total strlen
as we add strings.
(ctf_str_create_atoms): Declare new function in ctf-string.c.
(ctf_str_free_atoms): Likewise.
(ctf_str_add): Likewise.
(ctf_str_add_ref): Likewise.
(ctf_str_purge_refs): Likewise.
(ctf_str_write_strtab): Likewise.
(ctf_realloc): Declare new function in ctf-util.c.
* ctf-open.c (ctf_bufopen): Create the atoms table.
(ctf_file_close): Destroy it.
* ctf-create.c (ctf_update): Copy-and-free it on update. No longer
special-case the position of the parname string. Construct the
strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
rest of each buffer element is constructed, not via open-coding:
realloc the CTF buffer and append the strtab to it. No longer
maintain ctf_dtvstrlen. Sort the variable entry table later, after
strtab construction.
(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
after buffer element construction instead.
(ctf_copy_lmembers): Likewise.
(ctf_copy_emembers): Likewise.
(ctf_create): No longer maintain the ctf_dtvstrlen.
(ctf_dtd_delete): Likewise.
(ctf_dvd_delete): Likewise.
(ctf_add_generic): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable): Likewise.
(membadd): Likewise.
* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
if there are active ctf_str_num_refs.
(ctf_strraw): Move to ctf-string.c.
(ctf_strptr): Likewise.
* ctf-string.c: New file, strtab manipulation.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
|
|
|
unsigned long csa_snapshot_id; /* Snapshot ID at time of creation. */
|
|
|
|
} ctf_str_atom_t;
|
|
|
|
|
|
|
|
/* The refs of a single string in the atoms table. */
|
|
|
|
|
|
|
|
typedef struct ctf_str_atom_ref
|
|
|
|
{
|
|
|
|
ctf_list_t caf_list; /* List forward/back pointers. */
|
|
|
|
uint32_t *caf_ref; /* A single ref to this string. */
|
|
|
|
} ctf_str_atom_ref_t;
|
|
|
|
|
2019-07-14 04:31:26 +08:00
|
|
|
/* The structure used as the key in a ctf_link_type_mapping, which lets the
|
|
|
|
linker machinery determine which type IDs on the input side of a link map to
|
|
|
|
which types on the output side. (The value is a ctf_id_t: another
|
|
|
|
index, not a type.) */
|
|
|
|
|
|
|
|
typedef struct ctf_link_type_mapping_key
|
|
|
|
{
|
|
|
|
ctf_file_t *cltm_fp;
|
|
|
|
ctf_id_t cltm_idx;
|
|
|
|
} ctf_link_type_mapping_key_t;
|
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
/* The ctf_file is the structure used to represent a CTF container to library
|
|
|
|
clients, who see it only as an opaque pointer. Modifications can therefore
|
|
|
|
be made freely to this structure without regard to client versioning. The
|
|
|
|
ctf_file_t typedef appears in <ctf-api.h> and declares a forward tag.
|
|
|
|
|
|
|
|
NOTE: ctf_update() requires that everything inside of ctf_file either be an
|
|
|
|
immediate value, a pointer to dynamically allocated data *outside* of the
|
|
|
|
ctf_file itself, or a pointer to statically allocated data. If you add a
|
|
|
|
pointer to ctf_file that points to something within the ctf_file itself,
|
|
|
|
you must make corresponding changes to ctf_update(). */
|
|
|
|
|
|
|
|
struct ctf_file
|
|
|
|
{
|
|
|
|
const ctf_fileops_t *ctf_fileops; /* Version-specific file operations. */
|
libctf: allow the header to change between versions
libctf supports dynamic upgrading of the type table as file format
versions change, but before now has not supported changes to the CTF
header. Doing this is complicated by the baroque storage method used:
the CTF header is kept prepended to the rest of the CTF data, just as
when read from the file, and written out from there, and is
endian-flipped in place.
This makes accessing it needlessly hard and makes it almost impossible
to make the header larger if we add fields. The general storage
machinery around the malloced ctf pointer (the 'ctf_base') is also
overcomplicated: the pointer is sometimes malloced locally and sometimes
assigned from a parameter, so freeing it requires checking to see if
that parameter was used, needlessly coupling ctf_bufopen and
ctf_file_close together.
So split the header out into a new ctf_file_t.ctf_header, which is
written out explicitly: squeeze it out of the CTF buffer whenever we
reallocate it, and use ctf_file_t.ctf_buf to skip past the header when
we do not need to reallocate (when no upgrading or endian-flipping is
required). We now track whether the CTF base can be freed explicitly
via a new ctf_dynbase pointer which is non-NULL only when freeing is
possible.
With all this done, we can upgrade the header on the fly and add new
fields as desired, via a new upgrade_header function in ctf-open.
As with other forms of upgrading, libctf upgrades older headers
automatically to the latest supported version at open time.
For a first use of this field, we add a new string field cth_cuname, and
a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this
is used by debuggers to determine whether a CTF section's types relate
to a single compilation unit, or to all compilation units in the
program. (Types with ambiguous definitions in different CUs have only
one of these types placed in the top-level shared .ctf container: the
rest are placed in much smaller per-CU containers, which have the shared
container as their parent. Since CTF must be useful in the absence of
DWARF, we store the names of the relevant CUs ourselves, so the debugger
can look them up.)
v5: fix tabdamage.
include/
* ctf-api.h (ctf_cuname): New function.
(ctf_cuname_set): Likewise.
* ctf.h: Improve comment around upgrading, no longer
implying that v2 is the target of upgrades (it is v3 now).
(ctf_header_v2_t): New, old-format header for backward
compatibility.
(ctf_header_t): Add cth_cuname: this is the first of several
header changes in format v3.
libctf/
* ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase,
ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const.
* ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and
ctf_base: do not assume that it is always sizeof (ctf_header_t).
Print out ctf_cuname: only print out ctf_parname if set.
(ctf_free_base): Removed, ctf_base is no longer freed: free
ctf_dynbase instead.
(ctf_set_version): Fix spacing.
(upgrade_header): New, in-place header upgrading.
(upgrade_types): Rename to...
(upgrade_types_v1): ... this. Free ctf_dynbase, not ctf_base. No
longer track old and new headers separately. No longer allow for
header sizes explicitly: squeeze the headers out on upgrade (they
are preserved in fp->ctf_header). Set ctf_dynbase, ctf_base and
ctf_buf explicitly. Use ctf_free, not ctf_free_base.
(upgrade_types): New, also handle ctf_parmax updating.
(flip_header): Flip ctf_cuname.
(flip_types): Flip BUF explicitly rather than deriving BUF from
BASE.
(ctf_bufopen): Store the header in fp->ctf_header. Correct minimum
required alignment of objtoff and funcoff. No longer store it in
the ctf_buf unless that buf is derived unmodified from the input.
Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals
that duplicate fields in ctf_file: move allocation of ctf_file
further up instead. Call upgrade_header as needed. Move
version-specific ctf_parmax initialization into upgrade_types. More
concise error handling.
(ctf_file_close): No longer test for null pointers before freeing.
Free ctf_dyncuname, ctf_dynbase, and ctf_header. Do not call
ctf_free_base.
(ctf_cuname): New.
(ctf_cuname_set): New.
* ctf-create.c (ctf_update): Populate ctf_cuname.
(ctf_gzwrite): Write out the header explicitly. Remove obsolescent
comment.
(ctf_write): Likewise.
(ctf_compress_write): Get the header from ctf_header, not ctf_base.
Fix the compression length: fp->ctf_size never counted the CTF
header. Simplify the compress call accordingly.
2019-07-07 00:36:21 +08:00
|
|
|
struct ctf_header *ctf_header; /* The header from this CTF file. */
|
2019-07-08 20:59:15 +08:00
|
|
|
unsigned char ctf_openflags; /* Flags the file had when opened. */
|
2019-04-24 05:24:13 +08:00
|
|
|
ctf_sect_t ctf_data; /* CTF data from object file. */
|
|
|
|
ctf_sect_t ctf_symtab; /* Symbol table from object file. */
|
|
|
|
ctf_sect_t ctf_strtab; /* String table from object file. */
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
ctf_dynhash_t *ctf_prov_strtab; /* Maps provisional-strtab offsets
|
|
|
|
to names. */
|
libctf: support getting strings from the ELF strtab
The CTF file format has always supported "external strtabs", which
internally are strtab offsets with their MSB on: such refs
get their strings from the strtab passed in at CTF file open time:
this is usually intended to be the ELF strtab, and that's what this
implementation is meant to support, though in theory the external
strtab could come from anywhere.
This commit adds support for these external strings in the ctf-string.c
strtab tracking layer. It's quite easy: we just add a field csa_offset
to the atoms table that tracks all strings: this field tracks the offset
of the string in the ELF strtab (with its MSB already on, courtesy of a
new macro CTF_SET_STID), and adds a new function that sets the
csa_offset to the specified offset (plus MSB). Then we just need to
avoid writing out strings to the internal strtab if they have csa_offset
set, and note that the internal strtab is shorter than it might
otherwise be.
(We could in theory save a little more time here by eschewing sorting
such strings, since we never actually write the strings out anywhere,
but that would mean storing them separately and it's just not worth the
complexity cost until profiling shows it's worth doing.)
We also have to go through a bit of extra effort at variable-sorting
time. This was previously using direct references to the internal
strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab
is not yet ready to put in its usual field (in a ctf_file_t that hasn't
even been allocated yet at this stage): but now we're using the external
strtab, this will no longer do because it'll be looking things up in the
wrong strtab, with disastrous results. Instead, pass the new internal
strtab in to a new ctf_strraw_explicit function which is just like
ctf_strraw except you can specify a ne winternal strtab to use.
But even now that it is using a new internal strtab, this is not quite
enough: it can't look up strings in the external strtab because ld
hasn't written it out yet, and when it does will write it straight to
disk. Instead, when we write the internal strtab, note all the offset
-> string mappings that we have noted belong in the *external* strtab to
a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look
in there at ctf_strraw time if it is set. This uses minimal extra
memory (because only strings in the external strtab that we actually use
are stored, and even those come straight out of the atoms table), but
let both variable sorting and name interning when ctf_bufopen is next
called work fine. (This also means that we don't need to filter out
spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to
the caller, once we wrap ctf_bufopen so that we have a new internal
variant of ctf_bufopen etc that we can pass the synthetic external
strtab to. That error has been filtered out since the days of Solaris
libctf, which didn't try to handle the problem of getting external
strtabs right at construction time at all.)
v3: add the synthetic strtab and all associated machinery.
v5: fix tabdamage.
include/
* ctf.h (CTF_SET_STID): New.
libctf/
* ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field.
(ctf_file_t) <ctf_syn_ext_strtab>: Likewise.
(ctf_str_add_ref): Name the last arg.
(ctf_str_add_external) New.
(ctf_str_add_strraw_explicit): Likewise.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
* ctf-string.c (ctf_strraw_explicit): Split from...
(ctf_strraw): ... here, with new support for ctf_syn_ext_strtab.
(ctf_str_add_ref_internal): Return the atom, not the
string.
(ctf_str_add): Adjust accordingly.
(ctf_str_add_ref): Likewise. Move up in the file.
(ctf_str_add_external): New: update the csa_offset.
(ctf_str_count_strtab): Only account for strings with no csa_offset
in the internal strtab length.
(ctf_str_write_strtab): If the csa_offset is set, update the
string's refs without writing the string out, and update the
ctf_syn_ext_strtab. Make OOM handling less ugly.
* ctf-create.c (struct ctf_sort_var_arg_cb): New.
(ctf_update): Handle failure to populate the strtab. Pass in the
new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition.
Call ctf_simple_open_internal, not ctf_simple_open.
(ctf_sort_var): Call ctf_strraw_explicit rather than looking up
strings by hand.
* ctf-hash.c (ctf_hash_insert_type): Likewise (but using
ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless.
* ctf-open.c (init_types): No longer filter out ECTF_STRTAB.
(ctf_file_close): Destroy the ctf_syn_ext_strtab.
(ctf_simple_open): Rename to, and reimplement as a wrapper around...
(ctf_simple_open_internal): ... this new function, which calls
ctf_bufopen_internal.
(ctf_bufopen): Rename to, and reimplement as a wrapper around...
(ctf_bufopen_internal): ... this new function, which sets
ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
|
|
|
ctf_dynhash_t *ctf_syn_ext_strtab; /* Maps ext-strtab offsets to names. */
|
libctf: ELF file opening via BFD
These functions let you open an ELF file with a customarily-named CTF
section in it, automatically opening the CTF file or archive and
associating the symbol and string tables in the ELF file with the CTF
container, so that you can look up the types of symbols in the ELF file
via ctf_lookup_by_symbol(), and so that strings can be shared between
the ELF file and CTF container, to save space.
It uses BFD machinery to do so. This has now been lightly tested and
seems to work. In particular, if you already have a bfd you can pass
it in to ctf_bfdopen(), and if you want a bfd made for you you can
call ctf_open() or ctf_fdopen(), optionally specifying a target (or
try once without a target and then again with one if you get
ECTF_BFD_AMBIGUOUS back).
We use a forward declaration for the struct bfd in ctf-api.h, so that
ctf-api.h users are not required to pull in <bfd.h>. (This is mostly
for the sake of readelf.)
libctf/
* ctf-open-bfd.c: New file.
* ctf-open.c (ctf_close): New.
* ctf-impl.h: Include bfd.h.
(ctf_file): New members ctf_data_mmapped, ctf_data_mmapped_len.
(ctf_archive_internal): New members ctfi_abfd, ctfi_data,
ctfi_bfd_close.
(ctf_bfdopen_ctfsect): New declaration.
(_CTF_SECTION): likewise.
include/
* ctf-api.h (struct bfd): New forward.
(ctf_fdopen): New.
(ctf_bfdopen): Likewise.
(ctf_open): Likewise.
(ctf_arc_open): Likewise.
2019-04-24 17:46:39 +08:00
|
|
|
void *ctf_data_mmapped; /* CTF data we mmapped, to free later. */
|
|
|
|
size_t ctf_data_mmapped_len; /* Length of CTF data we mmapped. */
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
ctf_names_t ctf_structs; /* Hash table of struct types. */
|
|
|
|
ctf_names_t ctf_unions; /* Hash table of union types. */
|
|
|
|
ctf_names_t ctf_enums; /* Hash table of enum types. */
|
|
|
|
ctf_names_t ctf_names; /* Hash table of remaining type names. */
|
|
|
|
ctf_lookup_t ctf_lookups[5]; /* Pointers to nametabs for name lookup. */
|
2019-04-24 05:24:13 +08:00
|
|
|
ctf_strs_t ctf_str[2]; /* Array of string table base and bounds. */
|
libctf: deduplicate and sort the string table
ctf.h states:
> [...] the CTF string table does not contain any duplicated strings.
Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer. There is no global
view of the strings and no deduplication.
Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.
Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).
Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go. The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'. Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.
No change in externally-visible API or file format (other than the
reduction in pointless redundancy).
libctf/
* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
struct ctf_strs.
(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
(struct ctf_str_atom): New, disambiguated single string.
(struct ctf_str_atom_ref): New, points to some other location that
references this string's offset.
(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
Remove member ctf_dtvstrlen: we no longer track the total strlen
as we add strings.
(ctf_str_create_atoms): Declare new function in ctf-string.c.
(ctf_str_free_atoms): Likewise.
(ctf_str_add): Likewise.
(ctf_str_add_ref): Likewise.
(ctf_str_purge_refs): Likewise.
(ctf_str_write_strtab): Likewise.
(ctf_realloc): Declare new function in ctf-util.c.
* ctf-open.c (ctf_bufopen): Create the atoms table.
(ctf_file_close): Destroy it.
* ctf-create.c (ctf_update): Copy-and-free it on update. No longer
special-case the position of the parname string. Construct the
strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
rest of each buffer element is constructed, not via open-coding:
realloc the CTF buffer and append the strtab to it. No longer
maintain ctf_dtvstrlen. Sort the variable entry table later, after
strtab construction.
(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
after buffer element construction instead.
(ctf_copy_lmembers): Likewise.
(ctf_copy_emembers): Likewise.
(ctf_create): No longer maintain the ctf_dtvstrlen.
(ctf_dtd_delete): Likewise.
(ctf_dvd_delete): Likewise.
(ctf_add_generic): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable): Likewise.
(membadd): Likewise.
* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
if there are active ctf_str_num_refs.
(ctf_strraw): Move to ctf-string.c.
(ctf_strptr): Likewise.
* ctf-string.c: New file, strtab manipulation.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
|
|
|
ctf_dynhash_t *ctf_str_atoms; /* Hash table of ctf_str_atoms_t. */
|
|
|
|
uint64_t ctf_str_num_refs; /* Number of refs to cts_str_atoms. */
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
uint32_t ctf_str_prov_offset; /* Latest provisional offset assigned so far. */
|
libctf: allow the header to change between versions
libctf supports dynamic upgrading of the type table as file format
versions change, but before now has not supported changes to the CTF
header. Doing this is complicated by the baroque storage method used:
the CTF header is kept prepended to the rest of the CTF data, just as
when read from the file, and written out from there, and is
endian-flipped in place.
This makes accessing it needlessly hard and makes it almost impossible
to make the header larger if we add fields. The general storage
machinery around the malloced ctf pointer (the 'ctf_base') is also
overcomplicated: the pointer is sometimes malloced locally and sometimes
assigned from a parameter, so freeing it requires checking to see if
that parameter was used, needlessly coupling ctf_bufopen and
ctf_file_close together.
So split the header out into a new ctf_file_t.ctf_header, which is
written out explicitly: squeeze it out of the CTF buffer whenever we
reallocate it, and use ctf_file_t.ctf_buf to skip past the header when
we do not need to reallocate (when no upgrading or endian-flipping is
required). We now track whether the CTF base can be freed explicitly
via a new ctf_dynbase pointer which is non-NULL only when freeing is
possible.
With all this done, we can upgrade the header on the fly and add new
fields as desired, via a new upgrade_header function in ctf-open.
As with other forms of upgrading, libctf upgrades older headers
automatically to the latest supported version at open time.
For a first use of this field, we add a new string field cth_cuname, and
a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this
is used by debuggers to determine whether a CTF section's types relate
to a single compilation unit, or to all compilation units in the
program. (Types with ambiguous definitions in different CUs have only
one of these types placed in the top-level shared .ctf container: the
rest are placed in much smaller per-CU containers, which have the shared
container as their parent. Since CTF must be useful in the absence of
DWARF, we store the names of the relevant CUs ourselves, so the debugger
can look them up.)
v5: fix tabdamage.
include/
* ctf-api.h (ctf_cuname): New function.
(ctf_cuname_set): Likewise.
* ctf.h: Improve comment around upgrading, no longer
implying that v2 is the target of upgrades (it is v3 now).
(ctf_header_v2_t): New, old-format header for backward
compatibility.
(ctf_header_t): Add cth_cuname: this is the first of several
header changes in format v3.
libctf/
* ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase,
ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const.
* ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and
ctf_base: do not assume that it is always sizeof (ctf_header_t).
Print out ctf_cuname: only print out ctf_parname if set.
(ctf_free_base): Removed, ctf_base is no longer freed: free
ctf_dynbase instead.
(ctf_set_version): Fix spacing.
(upgrade_header): New, in-place header upgrading.
(upgrade_types): Rename to...
(upgrade_types_v1): ... this. Free ctf_dynbase, not ctf_base. No
longer track old and new headers separately. No longer allow for
header sizes explicitly: squeeze the headers out on upgrade (they
are preserved in fp->ctf_header). Set ctf_dynbase, ctf_base and
ctf_buf explicitly. Use ctf_free, not ctf_free_base.
(upgrade_types): New, also handle ctf_parmax updating.
(flip_header): Flip ctf_cuname.
(flip_types): Flip BUF explicitly rather than deriving BUF from
BASE.
(ctf_bufopen): Store the header in fp->ctf_header. Correct minimum
required alignment of objtoff and funcoff. No longer store it in
the ctf_buf unless that buf is derived unmodified from the input.
Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals
that duplicate fields in ctf_file: move allocation of ctf_file
further up instead. Call upgrade_header as needed. Move
version-specific ctf_parmax initialization into upgrade_types. More
concise error handling.
(ctf_file_close): No longer test for null pointers before freeing.
Free ctf_dyncuname, ctf_dynbase, and ctf_header. Do not call
ctf_free_base.
(ctf_cuname): New.
(ctf_cuname_set): New.
* ctf-create.c (ctf_update): Populate ctf_cuname.
(ctf_gzwrite): Write out the header explicitly. Remove obsolescent
comment.
(ctf_write): Likewise.
(ctf_compress_write): Get the header from ctf_header, not ctf_base.
Fix the compression length: fp->ctf_size never counted the CTF
header. Simplify the compress call accordingly.
2019-07-07 00:36:21 +08:00
|
|
|
unsigned char *ctf_base; /* CTF file pointer. */
|
|
|
|
unsigned char *ctf_dynbase; /* Freeable CTF file pointer. */
|
|
|
|
unsigned char *ctf_buf; /* Uncompressed CTF data buffer. */
|
2019-04-24 05:24:13 +08:00
|
|
|
size_t ctf_size; /* Size of CTF header + uncompressed data. */
|
|
|
|
uint32_t *ctf_sxlate; /* Translation table for symtab entries. */
|
|
|
|
unsigned long ctf_nsyms; /* Number of entries in symtab xlate table. */
|
|
|
|
uint32_t *ctf_txlate; /* Translation table for type IDs. */
|
|
|
|
uint32_t *ctf_ptrtab; /* Translation table for pointer-to lookups. */
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
size_t ctf_ptrtab_len; /* Num types storable in ptrtab currently. */
|
2019-04-24 05:24:13 +08:00
|
|
|
struct ctf_varent *ctf_vars; /* Sorted variable->type mapping. */
|
|
|
|
unsigned long ctf_nvars; /* Number of variables in ctf_vars. */
|
|
|
|
unsigned long ctf_typemax; /* Maximum valid type ID number. */
|
|
|
|
const ctf_dmodel_t *ctf_dmodel; /* Data model pointer (see above). */
|
libctf: allow the header to change between versions
libctf supports dynamic upgrading of the type table as file format
versions change, but before now has not supported changes to the CTF
header. Doing this is complicated by the baroque storage method used:
the CTF header is kept prepended to the rest of the CTF data, just as
when read from the file, and written out from there, and is
endian-flipped in place.
This makes accessing it needlessly hard and makes it almost impossible
to make the header larger if we add fields. The general storage
machinery around the malloced ctf pointer (the 'ctf_base') is also
overcomplicated: the pointer is sometimes malloced locally and sometimes
assigned from a parameter, so freeing it requires checking to see if
that parameter was used, needlessly coupling ctf_bufopen and
ctf_file_close together.
So split the header out into a new ctf_file_t.ctf_header, which is
written out explicitly: squeeze it out of the CTF buffer whenever we
reallocate it, and use ctf_file_t.ctf_buf to skip past the header when
we do not need to reallocate (when no upgrading or endian-flipping is
required). We now track whether the CTF base can be freed explicitly
via a new ctf_dynbase pointer which is non-NULL only when freeing is
possible.
With all this done, we can upgrade the header on the fly and add new
fields as desired, via a new upgrade_header function in ctf-open.
As with other forms of upgrading, libctf upgrades older headers
automatically to the latest supported version at open time.
For a first use of this field, we add a new string field cth_cuname, and
a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this
is used by debuggers to determine whether a CTF section's types relate
to a single compilation unit, or to all compilation units in the
program. (Types with ambiguous definitions in different CUs have only
one of these types placed in the top-level shared .ctf container: the
rest are placed in much smaller per-CU containers, which have the shared
container as their parent. Since CTF must be useful in the absence of
DWARF, we store the names of the relevant CUs ourselves, so the debugger
can look them up.)
v5: fix tabdamage.
include/
* ctf-api.h (ctf_cuname): New function.
(ctf_cuname_set): Likewise.
* ctf.h: Improve comment around upgrading, no longer
implying that v2 is the target of upgrades (it is v3 now).
(ctf_header_v2_t): New, old-format header for backward
compatibility.
(ctf_header_t): Add cth_cuname: this is the first of several
header changes in format v3.
libctf/
* ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase,
ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const.
* ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and
ctf_base: do not assume that it is always sizeof (ctf_header_t).
Print out ctf_cuname: only print out ctf_parname if set.
(ctf_free_base): Removed, ctf_base is no longer freed: free
ctf_dynbase instead.
(ctf_set_version): Fix spacing.
(upgrade_header): New, in-place header upgrading.
(upgrade_types): Rename to...
(upgrade_types_v1): ... this. Free ctf_dynbase, not ctf_base. No
longer track old and new headers separately. No longer allow for
header sizes explicitly: squeeze the headers out on upgrade (they
are preserved in fp->ctf_header). Set ctf_dynbase, ctf_base and
ctf_buf explicitly. Use ctf_free, not ctf_free_base.
(upgrade_types): New, also handle ctf_parmax updating.
(flip_header): Flip ctf_cuname.
(flip_types): Flip BUF explicitly rather than deriving BUF from
BASE.
(ctf_bufopen): Store the header in fp->ctf_header. Correct minimum
required alignment of objtoff and funcoff. No longer store it in
the ctf_buf unless that buf is derived unmodified from the input.
Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals
that duplicate fields in ctf_file: move allocation of ctf_file
further up instead. Call upgrade_header as needed. Move
version-specific ctf_parmax initialization into upgrade_types. More
concise error handling.
(ctf_file_close): No longer test for null pointers before freeing.
Free ctf_dyncuname, ctf_dynbase, and ctf_header. Do not call
ctf_free_base.
(ctf_cuname): New.
(ctf_cuname_set): New.
* ctf-create.c (ctf_update): Populate ctf_cuname.
(ctf_gzwrite): Write out the header explicitly. Remove obsolescent
comment.
(ctf_write): Likewise.
(ctf_compress_write): Get the header from ctf_header, not ctf_base.
Fix the compression length: fp->ctf_size never counted the CTF
header. Simplify the compress call accordingly.
2019-07-07 00:36:21 +08:00
|
|
|
const char *ctf_cuname; /* Compilation unit name (if any). */
|
|
|
|
char *ctf_dyncuname; /* Dynamically allocated name of CU. */
|
2019-04-24 05:24:13 +08:00
|
|
|
struct ctf_file *ctf_parent; /* Parent CTF container (if any). */
|
|
|
|
const char *ctf_parlabel; /* Label in parent container (if any). */
|
|
|
|
const char *ctf_parname; /* Basename of parent (if any). */
|
|
|
|
char *ctf_dynparname; /* Dynamically allocated name of parent. */
|
|
|
|
uint32_t ctf_parmax; /* Highest type ID of a parent type. */
|
|
|
|
uint32_t ctf_refcnt; /* Reference count (for parent links). */
|
|
|
|
uint32_t ctf_flags; /* Libctf flags (see below). */
|
|
|
|
int ctf_errno; /* Error code for most recent error. */
|
|
|
|
int ctf_version; /* CTF data version. */
|
|
|
|
ctf_dynhash_t *ctf_dthash; /* Hash of dynamic type definitions. */
|
|
|
|
ctf_list_t ctf_dtdefs; /* List of dynamic type definitions. */
|
|
|
|
ctf_dynhash_t *ctf_dvhash; /* Hash of dynamic variable mappings. */
|
|
|
|
ctf_list_t ctf_dvdefs; /* List of dynamic variable definitions. */
|
|
|
|
unsigned long ctf_dtoldid; /* Oldest id that has been committed. */
|
|
|
|
unsigned long ctf_snapshots; /* ctf_snapshot() plus ctf_update() count. */
|
|
|
|
unsigned long ctf_snapshot_lu; /* ctf_snapshot() call count at last update. */
|
|
|
|
ctf_archive_t *ctf_archive; /* Archive this ctf_file_t came from. */
|
libctf: add the ctf_link machinery
This is the start of work on the core of the linking mechanism for CTF
sections. This commit handles the type and string sections.
The linker calls these functions in sequence:
ctf_link_add_ctf: to add each CTF section in the input in turn to a
newly-created ctf_file_t (which will appear in the output, and which
itself will become the shared parent that contains types that all
TUs have in common (in all link modes) and all types that do not
have conflicting definitions between types (by default). Input files
that are themselves products of ld -r are supported, though this is
not heavily tested yet.
ctf_link: called once all input files are added to merge the types in
all the input containers into the output container, eliminating
duplicates.
ctf_link_add_strtab: called once the ELF string table is finalized and
all its offsets are known, this calls a callback provided by the
linker which returns the string content and offset of every string in
the ELF strtab in turn: all these strings which appear in the input
CTF strtab are eliminated from it in favour of the ELF strtab:
equally, any strings that only appear in the input strtab will
reappear in the internal CTF strtab of the output.
ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab
is finalized, this calls a callback provided by the linker which
returns information on every symbol in turn as a ctf_link_sym_t. This
is then used to shuffle the function info and data object sections in
the CTF section into symbol table order, eliminating the index
sections which map those sections to symbol names before that point.
Currently just returns ECTF_NOTYET.
ctf_link_write: Returns a buffer containing either a serialized
ctf_file_t (if there are no types with conflicting definitions in the
object files in the link) or a ctf_archive_t containing a large
ctf_file_t (the common types) and a bunch of small ones named after
individual CUs in which conflicting types are found (containing the
conflicting types, and all types that reference them). A threshold
size above which compression takes place is passed as one parameter.
(Currently, only gzip compression is supported, but I hope to add lzma
as well.)
Lifetime rules for this are simple: don't close the input CTF files
until you've called ctf_link for the last time. We do not assume
that symbols or strings passed in by the callback outlast the
call to ctf_link_add_strtab or ctf_link_shuffle_syms.
Right now, the duplicate elimination mechanism is the one already
present as part of the ctf_add_type function, and is not particularly
good: it misses numerous actual duplicates, and the conflicting-types
detection hardly ever reports that types conflict, even when they do
(one of them just tends to get silently dropped): it is also very slow.
This will all be fixed in the next few weeks, but the fix hardly touches
any of this code, and the linker does work without it, just not as
well as it otherwise might. (And when no CTF section is present,
there is no effect on performance, of course. So only people using
a trunk GCC with not-yet-committed patches will even notice. By the
time it gets upstream, things should be better.)
v3: Fix error handling.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the
libctf linking machinery.
(CTF_LINK_SHARE_UNCONFLICTED): New.
(CTF_LINK_SHARE_DUPLICATED): New.
(ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED.
(ECTF_NOTYET): New, a 'not yet implemented' message.
(ctf_link_add_ctf): New, add an input file's CTF to the link.
(ctf_link): New, merge the type and string sections.
(ctf_link_strtab_string_f): New, callback for feeding strtab info.
(ctf_link_iter_symbol_f): New, callback for feeding symtab info.
(ctf_link_add_strtab): New, tell the CTF linker about the ELF
strtab's strings.
(ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its
symbols into symtab order.
(ctf_link_write): New, ask the CTF linker to write the CTF out.
libctf/
* ctf-link.c: New file, linking of the string and type sections.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
* ctf-impl.h (ctf_file_t): New fields ctf_link_inputs,
ctf_link_outputs.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
|
|
|
ctf_dynhash_t *ctf_link_inputs; /* Inputs to this link. */
|
|
|
|
ctf_dynhash_t *ctf_link_outputs; /* Additional outputs from this link. */
|
2019-07-14 04:31:26 +08:00
|
|
|
ctf_dynhash_t *ctf_link_type_mapping; /* Map input types to output types. */
|
libctf: add CU-mapping machinery
Once the deduplicator is capable of actually detecting conflicting types
with the same name (i.e., not yet) we will place such conflicting types,
and types that depend on them, into CTF dictionaries that are the child
of the main dictionary we usually emit: currently, this will lead to the
.ctf section becoming a CTF archive rather than a single dictionary,
with the default-named archive member (_CTF_SECTION, or NULL) being the
main shared dictionary with most of the types in it.
By default, the sections are named after the compilation unit they come
from (complete path and all), with the cuname field in the CTF header
providing further evidence of the name without requiring the caller to
engage in tiresome parsing. But some callers may not wish the mapping
from input CU to output sub-dictionary to be purely CU-based.
The machinery here allows this to be freely changed, in two ways:
- callers can call ctf_link_add_cu_mapping to specify that a single
input compilation unit should have its types placed in some other CU
if they conflict: the CU will always be created, even if empty, so
the consuming program can depend on its existence. You can map
multiple input CUs to one output CU to force all their types to be
merged together: if some of *those* types conflict, the behaviour is
currently unspecified (the new deduplicator will specify it).
- callers can call ctf_link_set_memb_name_changer to provide a function
which is passed every CTF sub-dictionary name in turn (including
_CTF_SECTION) and can return a new name, or NULL if no change is
desired. The mapping from input to output names should not map two
input names to the same output name: if this happens, the two are not
merged but will result in an archive with two members with the same
name (technically valid, but it's hard to access the second
same-named member: you have to do an iteration over archive members).
This is used by the kernel's ctfarchive machinery (not yet upstream) to
encode CTF under member names like {module name}.ctf rather than
.ctf.CU, but it is anticipated that other large projects may wish to
have their own storage for CTF outside of .ctf sections and may wish to
have new naming schemes that suit their special-purpose consumers.
New in v3.
v4: check for strdup failure.
v5: fix tabdamage.
include/
* ctf-api.h (ctf_link_add_cu_mapping): New.
(ctf_link_memb_name_changer_f): New.
(ctf_link_set_memb_name_changer): New.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_link_cu_mappping>: New.
<ctf_link_memb_name_changer>: Likewise.
<ctf_link_memb_name_changer_arg>: Likewise.
* ctf-create.c (ctf_update): Update accordingly.
* ctf-open.c (ctf_file_close): Likewise.
* ctf-link.c (ctf_create_per_cu): Apply the cu mapping.
(ctf_link_add_cu_mapping): New.
(ctf_link_set_memb_name_changer): Likewise.
(ctf_change_parent_name): New.
(ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names
allocated by the caller's ctf_link_memb_name_changer.
<ndynames>: Likewise.
(ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer.
(ctf_link_write): Likewise (for _CTF_SECTION only): also call
ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
|
|
|
ctf_dynhash_t *ctf_link_cu_mapping; /* Map CU names to CTF dict names. */
|
|
|
|
/* Allow the caller to Change the name of link archive members. */
|
|
|
|
ctf_link_memb_name_changer_f *ctf_link_memb_name_changer;
|
|
|
|
void *ctf_link_memb_name_changer_arg; /* Argument for it. */
|
libctf: properly handle ctf_add_type of forwards and self-reffing structs
The code to handle structures (and unions) that refer to themselves in
ctf_add_type is extremely dodgy. It works by looking through the list
of not-yet-committed types for a structure with the same name as the
structure in question and assuming, if it finds it, that this must be a
reference to the same type. This is a linear search that gets ever
slower as the dictionary grows, requiring you to call ctf_update at
intervals to keep performance tolerable: but if you do that, you run
into the problem that if a forward declared before the ctf_update is
changed to a structure afterwards, ctf_update explodes.
The last commit fixed most of this: this commit can use it, adding a new
ctf_add_processing hash that tracks source type IDs that are currently
being processed and uses it to avoid infinite recursion rather than the
dynamic type list: we split ctf_add_type into a ctf_add_type_internal,
so that ctf_add_type itself can become a wrapper that empties out this
being-processed hash once the entire recursive type addition is over.
Structure additions themselves avoid adding their dependent types
quite so much by checking the type mapping and avoiding re-adding types
we already know we have added.
We also add support for adding forwards to dictionaries that already
contain the thing they are a forward to: we just silently return the
original type.
v4: return existing struct/union/enum types properly, rather than using
an uninitialized variable: shrinks sizes of CTF sections back down
to roughly where they were in v1/v2 of this patch series.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_file_t) <ctf_add_processing>: New.
* ctf-open.c (ctf_file_close): Free it.
* ctf-create.c (ctf_serialize): Adjust.
(membcmp): When reporting a conflict due to an error, report the
error.
(ctf_add_type): Turn into a ctf_add_processing wrapper. Rename to...
(ctf_add_type_internal): ... this. Hand back types we are already
in the middle of adding immediately. Hand back structs/unions with
the same number of members immediately. Do not walk the dynamic
list. Call ctf_add_type_internal, not ctf_add_type. Handle
forwards promoted to other types and the inverse case identically.
Add structs to the mapping as soon as we intern them, before they
gain any members.
2019-08-08 01:01:08 +08:00
|
|
|
ctf_dynhash_t *ctf_add_processing; /* Types ctf_add_type is working on now. */
|
2019-04-24 05:24:13 +08:00
|
|
|
char *ctf_tmp_typeslice; /* Storage for slicing up type names. */
|
|
|
|
size_t ctf_tmp_typeslicelen; /* Size of the typeslice. */
|
|
|
|
void *ctf_specific; /* Data for ctf_get/setspecific(). */
|
|
|
|
};
|
|
|
|
|
libctf: mmappable archives
If you need to store a large number of CTF containers somewhere, this
provides a dedicated facility for doing so: an mmappable archive format
like a very simple tar or ar without all the system-dependent format
horrors or need for heavy file copying, with built-in compression of
files above a particular size threshold.
libctf automatically mmap()s uncompressed elements of these archives, or
uncompresses them, as needed. (If the platform does not support mmap(),
copying into dynamically-allocated buffers is used.)
Archive iteration operations are partitioned into raw and non-raw
forms. Raw operations pass thhe raw archive contents to the callback:
non-raw forms open each member with ctf_bufopen() and pass the resulting
ctf_file_t to the iterator instead. This lets you manipulate the raw
data in the archive, or the contents interpreted as a CTF file, as
needed.
It is not yet known whether we will store CTF archives in a linked ELF
object in one of these (akin to debugdata) or whether they'll get one
section per TU plus one parent container for types shared between them.
(In the case of ELF objects with very large numbers of TUs, an archive
of all of them would seem preferable, so we might just use an archive,
and add lzma support so you can assume that .gnu_debugdata and .ctf are
compressed using the same algorithm if both are present.)
To make usage easier, the ctf_archive_t is not the on-disk
representation but an abstraction over both ctf_file_t's and archives of
many ctf_file_t's: users see both CTF archives and raw CTF files as
ctf_archive_t's upon opening, the only difference being that a raw CTF
file has only a single "archive member", named ".ctf" (the default if a
null pointer is passed in as the name). The next commit will make use
of this facility, in addition to providing the public interface to
actually open archives. (In the future, it should be possible to have
all CTF sections in an ELF file appear as an "archive" in the same
fashion.)
This machinery is also used to allow library-internal creators of
ctf_archive_t's (such as the next commit) to stash away an ELF string
and symbol table, so that all opens of members in a given archive will
use them. This lets CTF archives exploit the ELF string and symbol
table just like raw CTF files can.
(All this leads to somewhat confusing type naming. The ctf_archive_t is
a typedef for the opaque internal type, struct ctf_archive_internal: the
non-internal "struct ctf_archive" is the on-disk structure meant for
other libraries manipulating CTF files. It is probably clearest to use
the struct name for struct ctf_archive_internal inside the program, and
the typedef names outside.)
libctf/
* ctf-archive.c: New.
* ctf-impl.h (ctf_archive_internal): New type.
(ctf_arc_open_internal): New declaration.
(ctf_arc_bufopen): Likewise.
(ctf_arc_close_internal): Likewise.
include/
* ctf.h (CTFA_MAGIC): New.
(struct ctf_archive): New.
(struct ctf_archive_modent): Likewise.
* ctf-api.h (ctf_archive_member_f): New.
(ctf_archive_raw_member_f): Likewise.
(ctf_arc_write): Likewise.
(ctf_arc_close): Likewise.
(ctf_arc_open_by_name): Likewise.
(ctf_archive_iter): Likewise.
(ctf_archive_raw_iter): Likewise.
(ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
|
|
|
/* An abstraction over both a ctf_file_t and a ctf_archive_t. */
|
|
|
|
|
|
|
|
struct ctf_archive_internal
|
|
|
|
{
|
|
|
|
int ctfi_is_archive;
|
|
|
|
ctf_file_t *ctfi_file;
|
|
|
|
struct ctf_archive *ctfi_archive;
|
|
|
|
ctf_sect_t ctfi_symsect;
|
|
|
|
ctf_sect_t ctfi_strsect;
|
libctf, binutils: support CTF archives like objdump
objdump and readelf have one major CTF-related behavioural difference:
objdump can read .ctf sections that contain CTF archives and extract and
dump their members, while readelf cannot. Since the linker often emits
CTF archives, this means that readelf intermittently and (from the
user's perspective) randomly fails to read CTF in files that ld emits,
with a confusing error message wrongly claiming that the CTF content is
corrupt. This is purely because the archive-opening code in libctf was
needlessly tangled up with the BFD code, so readelf couldn't use it.
Here, we disentangle it, moving ctf_new_archive_internal from
ctf-open-bfd.c into ctf-archive.c and merging it with the helper
function in ctf-archive.c it was already using. We add a new public API
function ctf_arc_bufopen, that looks very like ctf_bufopen but returns
an archive given suitable section data rather than a ctf_file_t: the
archive is a ctf_archive_t, so it can be called on raw CTF dictionaries
(with no archive present) and will return a single-member synthetic
"archive".
There is a tiny lifetime tweak here: before now, the archive code could
assume that the symbol section in the ctf_archive_internal wrapper
structure was always owned by BFD if it was present and should always be
freed: now, the caller can pass one in via ctf_arc_bufopen, wihch has
the usual lifetime rules for such sections (caller frees): so we add an
extra field to track whether this is an internal call from ctf-open-bfd,
in which case we still free the symbol section.
include/
* ctf-api.h (ctf_arc_bufopen): New.
libctf/
* ctf-impl.h (ctf_new_archive_internal): Declare.
(ctf_arc_bufopen): Remove.
(ctf_archive_internal) <ctfi_free_symsect>: New.
* ctf-archive.c (ctf_arc_close): Use it.
(ctf_arc_bufopen): Fuse into...
(ctf_new_archive_internal): ... this, moved across from...
* ctf-open-bfd.c: ... here.
(ctf_bfdopen_ctfsect): Use ctf_arc_bufopen.
* libctf.ver: Add it.
binutils/
* readelf.c (dump_section_as_ctf): Support .ctf archives using
ctf_arc_bufopen. Automatically load the .ctf member of such
archives as the parent of all other members, unless specifically
overridden via --ctf-parent. Split out dumping code into...
(dump_ctf_archive_member): ... here, as in objdump, and call
it once per archive member.
(dump_ctf_indent_lines): Code style fix.
2019-12-13 20:01:12 +08:00
|
|
|
int ctfi_free_symsect;
|
libctf: mmappable archives
If you need to store a large number of CTF containers somewhere, this
provides a dedicated facility for doing so: an mmappable archive format
like a very simple tar or ar without all the system-dependent format
horrors or need for heavy file copying, with built-in compression of
files above a particular size threshold.
libctf automatically mmap()s uncompressed elements of these archives, or
uncompresses them, as needed. (If the platform does not support mmap(),
copying into dynamically-allocated buffers is used.)
Archive iteration operations are partitioned into raw and non-raw
forms. Raw operations pass thhe raw archive contents to the callback:
non-raw forms open each member with ctf_bufopen() and pass the resulting
ctf_file_t to the iterator instead. This lets you manipulate the raw
data in the archive, or the contents interpreted as a CTF file, as
needed.
It is not yet known whether we will store CTF archives in a linked ELF
object in one of these (akin to debugdata) or whether they'll get one
section per TU plus one parent container for types shared between them.
(In the case of ELF objects with very large numbers of TUs, an archive
of all of them would seem preferable, so we might just use an archive,
and add lzma support so you can assume that .gnu_debugdata and .ctf are
compressed using the same algorithm if both are present.)
To make usage easier, the ctf_archive_t is not the on-disk
representation but an abstraction over both ctf_file_t's and archives of
many ctf_file_t's: users see both CTF archives and raw CTF files as
ctf_archive_t's upon opening, the only difference being that a raw CTF
file has only a single "archive member", named ".ctf" (the default if a
null pointer is passed in as the name). The next commit will make use
of this facility, in addition to providing the public interface to
actually open archives. (In the future, it should be possible to have
all CTF sections in an ELF file appear as an "archive" in the same
fashion.)
This machinery is also used to allow library-internal creators of
ctf_archive_t's (such as the next commit) to stash away an ELF string
and symbol table, so that all opens of members in a given archive will
use them. This lets CTF archives exploit the ELF string and symbol
table just like raw CTF files can.
(All this leads to somewhat confusing type naming. The ctf_archive_t is
a typedef for the opaque internal type, struct ctf_archive_internal: the
non-internal "struct ctf_archive" is the on-disk structure meant for
other libraries manipulating CTF files. It is probably clearest to use
the struct name for struct ctf_archive_internal inside the program, and
the typedef names outside.)
libctf/
* ctf-archive.c: New.
* ctf-impl.h (ctf_archive_internal): New type.
(ctf_arc_open_internal): New declaration.
(ctf_arc_bufopen): Likewise.
(ctf_arc_close_internal): Likewise.
include/
* ctf.h (CTFA_MAGIC): New.
(struct ctf_archive): New.
(struct ctf_archive_modent): Likewise.
* ctf-api.h (ctf_archive_member_f): New.
(ctf_archive_raw_member_f): Likewise.
(ctf_arc_write): Likewise.
(ctf_arc_close): Likewise.
(ctf_arc_open_by_name): Likewise.
(ctf_archive_iter): Likewise.
(ctf_archive_raw_iter): Likewise.
(ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
|
|
|
void *ctfi_data;
|
libctf: ELF file opening via BFD
These functions let you open an ELF file with a customarily-named CTF
section in it, automatically opening the CTF file or archive and
associating the symbol and string tables in the ELF file with the CTF
container, so that you can look up the types of symbols in the ELF file
via ctf_lookup_by_symbol(), and so that strings can be shared between
the ELF file and CTF container, to save space.
It uses BFD machinery to do so. This has now been lightly tested and
seems to work. In particular, if you already have a bfd you can pass
it in to ctf_bfdopen(), and if you want a bfd made for you you can
call ctf_open() or ctf_fdopen(), optionally specifying a target (or
try once without a target and then again with one if you get
ECTF_BFD_AMBIGUOUS back).
We use a forward declaration for the struct bfd in ctf-api.h, so that
ctf-api.h users are not required to pull in <bfd.h>. (This is mostly
for the sake of readelf.)
libctf/
* ctf-open-bfd.c: New file.
* ctf-open.c (ctf_close): New.
* ctf-impl.h: Include bfd.h.
(ctf_file): New members ctf_data_mmapped, ctf_data_mmapped_len.
(ctf_archive_internal): New members ctfi_abfd, ctfi_data,
ctfi_bfd_close.
(ctf_bfdopen_ctfsect): New declaration.
(_CTF_SECTION): likewise.
include/
* ctf-api.h (struct bfd): New forward.
(ctf_fdopen): New.
(ctf_bfdopen): Likewise.
(ctf_open): Likewise.
(ctf_arc_open): Likewise.
2019-04-24 17:46:39 +08:00
|
|
|
bfd *ctfi_abfd; /* Optional source of section data. */
|
|
|
|
void (*ctfi_bfd_close) (struct ctf_archive_internal *);
|
libctf: mmappable archives
If you need to store a large number of CTF containers somewhere, this
provides a dedicated facility for doing so: an mmappable archive format
like a very simple tar or ar without all the system-dependent format
horrors or need for heavy file copying, with built-in compression of
files above a particular size threshold.
libctf automatically mmap()s uncompressed elements of these archives, or
uncompresses them, as needed. (If the platform does not support mmap(),
copying into dynamically-allocated buffers is used.)
Archive iteration operations are partitioned into raw and non-raw
forms. Raw operations pass thhe raw archive contents to the callback:
non-raw forms open each member with ctf_bufopen() and pass the resulting
ctf_file_t to the iterator instead. This lets you manipulate the raw
data in the archive, or the contents interpreted as a CTF file, as
needed.
It is not yet known whether we will store CTF archives in a linked ELF
object in one of these (akin to debugdata) or whether they'll get one
section per TU plus one parent container for types shared between them.
(In the case of ELF objects with very large numbers of TUs, an archive
of all of them would seem preferable, so we might just use an archive,
and add lzma support so you can assume that .gnu_debugdata and .ctf are
compressed using the same algorithm if both are present.)
To make usage easier, the ctf_archive_t is not the on-disk
representation but an abstraction over both ctf_file_t's and archives of
many ctf_file_t's: users see both CTF archives and raw CTF files as
ctf_archive_t's upon opening, the only difference being that a raw CTF
file has only a single "archive member", named ".ctf" (the default if a
null pointer is passed in as the name). The next commit will make use
of this facility, in addition to providing the public interface to
actually open archives. (In the future, it should be possible to have
all CTF sections in an ELF file appear as an "archive" in the same
fashion.)
This machinery is also used to allow library-internal creators of
ctf_archive_t's (such as the next commit) to stash away an ELF string
and symbol table, so that all opens of members in a given archive will
use them. This lets CTF archives exploit the ELF string and symbol
table just like raw CTF files can.
(All this leads to somewhat confusing type naming. The ctf_archive_t is
a typedef for the opaque internal type, struct ctf_archive_internal: the
non-internal "struct ctf_archive" is the on-disk structure meant for
other libraries manipulating CTF files. It is probably clearest to use
the struct name for struct ctf_archive_internal inside the program, and
the typedef names outside.)
libctf/
* ctf-archive.c: New.
* ctf-impl.h (ctf_archive_internal): New type.
(ctf_arc_open_internal): New declaration.
(ctf_arc_bufopen): Likewise.
(ctf_arc_close_internal): Likewise.
include/
* ctf.h (CTFA_MAGIC): New.
(struct ctf_archive): New.
(struct ctf_archive_modent): Likewise.
* ctf-api.h (ctf_archive_member_f): New.
(ctf_archive_raw_member_f): Likewise.
(ctf_arc_write): Likewise.
(ctf_arc_close): Likewise.
(ctf_arc_open_by_name): Likewise.
(ctf_archive_iter): Likewise.
(ctf_archive_raw_iter): Likewise.
(ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
|
|
|
};
|
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
/* Return x rounded up to an alignment boundary.
|
|
|
|
eg, P2ROUNDUP(0x1234, 0x100) == 0x1300 (0x13*align)
|
|
|
|
eg, P2ROUNDUP(0x5600, 0x100) == 0x5600 (0x56*align) */
|
|
|
|
#define P2ROUNDUP(x, align) (-(-(x) & -(align)))
|
|
|
|
|
|
|
|
/* * If an offs is not aligned already then round it up and align it. */
|
|
|
|
#define LCTF_ALIGN_OFFS(offs, align) ((offs + (align - 1)) & ~(align - 1))
|
|
|
|
|
|
|
|
#define LCTF_TYPE_ISPARENT(fp, id) ((id) <= fp->ctf_parmax)
|
|
|
|
#define LCTF_TYPE_ISCHILD(fp, id) ((id) > fp->ctf_parmax)
|
|
|
|
#define LCTF_TYPE_TO_INDEX(fp, id) ((id) & (fp->ctf_parmax))
|
|
|
|
#define LCTF_INDEX_TO_TYPE(fp, id, child) (child ? ((id) | (fp->ctf_parmax+1)) : \
|
|
|
|
(id))
|
|
|
|
|
|
|
|
#define LCTF_INDEX_TO_TYPEPTR(fp, i) \
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
((fp->ctf_flags & LCTF_RDWR) ? \
|
|
|
|
&(ctf_dtd_lookup (fp, LCTF_INDEX_TO_TYPE \
|
|
|
|
(fp, i, fp->ctf_flags & LCTF_CHILD))->dtd_data) : \
|
|
|
|
(ctf_type_t *)((uintptr_t)(fp)->ctf_buf + (fp)->ctf_txlate[(i)]))
|
2019-04-24 05:24:13 +08:00
|
|
|
|
|
|
|
#define LCTF_INFO_KIND(fp, info) ((fp)->ctf_fileops->ctfo_get_kind(info))
|
|
|
|
#define LCTF_INFO_ISROOT(fp, info) ((fp)->ctf_fileops->ctfo_get_root(info))
|
|
|
|
#define LCTF_INFO_VLEN(fp, info) ((fp)->ctf_fileops->ctfo_get_vlen(info))
|
|
|
|
#define LCTF_VBYTES(fp, kind, size, vlen) \
|
|
|
|
((fp)->ctf_fileops->ctfo_get_vbytes(kind, size, vlen))
|
|
|
|
|
|
|
|
static inline ssize_t ctf_get_ctt_size (const ctf_file_t *fp,
|
|
|
|
const ctf_type_t *tp,
|
|
|
|
ssize_t *sizep,
|
|
|
|
ssize_t *incrementp)
|
|
|
|
{
|
|
|
|
return (fp->ctf_fileops->ctfo_get_ctt_size (fp, tp, sizep, incrementp));
|
|
|
|
}
|
|
|
|
|
|
|
|
#define LCTF_CHILD 0x0001 /* CTF container is a child */
|
|
|
|
#define LCTF_RDWR 0x0002 /* CTF container is writable */
|
|
|
|
#define LCTF_DIRTY 0x0004 /* CTF container has been modified */
|
|
|
|
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
extern ctf_names_t *ctf_name_table (ctf_file_t *, int);
|
2019-04-24 05:24:13 +08:00
|
|
|
extern const ctf_type_t *ctf_lookup_by_id (ctf_file_t **, ctf_id_t);
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
extern ctf_id_t ctf_lookup_by_rawname (ctf_file_t *, int, const char *);
|
|
|
|
extern ctf_id_t ctf_lookup_by_rawhash (ctf_file_t *, ctf_names_t *, const char *);
|
|
|
|
extern void ctf_set_ctl_hashes (ctf_file_t *);
|
2019-04-24 05:24:13 +08:00
|
|
|
|
2019-04-24 05:12:16 +08:00
|
|
|
typedef unsigned int (*ctf_hash_fun) (const void *ptr);
|
|
|
|
extern unsigned int ctf_hash_integer (const void *ptr);
|
|
|
|
extern unsigned int ctf_hash_string (const void *ptr);
|
2019-07-14 04:31:26 +08:00
|
|
|
extern unsigned int ctf_hash_type_mapping_key (const void *ptr);
|
2019-04-24 05:12:16 +08:00
|
|
|
|
|
|
|
typedef int (*ctf_hash_eq_fun) (const void *, const void *);
|
|
|
|
extern int ctf_hash_eq_integer (const void *, const void *);
|
|
|
|
extern int ctf_hash_eq_string (const void *, const void *);
|
2019-07-14 04:31:26 +08:00
|
|
|
extern int ctf_hash_eq_type_mapping_key (const void *, const void *);
|
2019-04-24 05:12:16 +08:00
|
|
|
|
|
|
|
typedef void (*ctf_hash_free_fun) (void *);
|
|
|
|
|
2019-06-27 20:30:22 +08:00
|
|
|
typedef void (*ctf_hash_iter_f) (void *key, void *value, void *arg);
|
|
|
|
typedef int (*ctf_hash_iter_remove_f) (void *key, void *value, void *arg);
|
|
|
|
|
2019-04-24 05:12:16 +08:00
|
|
|
extern ctf_hash_t *ctf_hash_create (unsigned long, ctf_hash_fun, ctf_hash_eq_fun);
|
|
|
|
extern int ctf_hash_insert_type (ctf_hash_t *, ctf_file_t *, uint32_t, uint32_t);
|
|
|
|
extern int ctf_hash_define_type (ctf_hash_t *, ctf_file_t *, uint32_t, uint32_t);
|
|
|
|
extern ctf_id_t ctf_hash_lookup_type (ctf_hash_t *, ctf_file_t *, const char *);
|
|
|
|
extern uint32_t ctf_hash_size (const ctf_hash_t *);
|
|
|
|
extern void ctf_hash_destroy (ctf_hash_t *);
|
|
|
|
|
|
|
|
extern ctf_dynhash_t *ctf_dynhash_create (ctf_hash_fun, ctf_hash_eq_fun,
|
|
|
|
ctf_hash_free_fun, ctf_hash_free_fun);
|
|
|
|
extern int ctf_dynhash_insert (ctf_dynhash_t *, void *, void *);
|
|
|
|
extern void ctf_dynhash_remove (ctf_dynhash_t *, const void *);
|
2019-07-14 04:31:26 +08:00
|
|
|
extern void ctf_dynhash_empty (ctf_dynhash_t *);
|
2019-04-24 05:12:16 +08:00
|
|
|
extern void *ctf_dynhash_lookup (ctf_dynhash_t *, const void *);
|
|
|
|
extern void ctf_dynhash_destroy (ctf_dynhash_t *);
|
2019-06-27 20:30:22 +08:00
|
|
|
extern void ctf_dynhash_iter (ctf_dynhash_t *, ctf_hash_iter_f, void *);
|
|
|
|
extern void ctf_dynhash_iter_remove (ctf_dynhash_t *, ctf_hash_iter_remove_f,
|
|
|
|
void *);
|
2019-04-24 05:12:16 +08:00
|
|
|
|
2019-04-24 04:45:30 +08:00
|
|
|
#define ctf_list_prev(elem) ((void *)(((ctf_list_t *)(elem))->l_prev))
|
|
|
|
#define ctf_list_next(elem) ((void *)(((ctf_list_t *)(elem))->l_next))
|
|
|
|
|
|
|
|
extern void ctf_list_append (ctf_list_t *, void *);
|
|
|
|
extern void ctf_list_prepend (ctf_list_t *, void *);
|
|
|
|
extern void ctf_list_delete (ctf_list_t *, void *);
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
extern int ctf_list_empty_p (ctf_list_t *lp);
|
2019-04-24 04:45:30 +08:00
|
|
|
|
2019-10-21 18:27:43 +08:00
|
|
|
extern int ctf_dtd_insert (ctf_file_t *, ctf_dtdef_t *, int flag, int kind);
|
2019-04-24 05:24:13 +08:00
|
|
|
extern void ctf_dtd_delete (ctf_file_t *, ctf_dtdef_t *);
|
|
|
|
extern ctf_dtdef_t *ctf_dtd_lookup (const ctf_file_t *, ctf_id_t);
|
|
|
|
extern ctf_dtdef_t *ctf_dynamic_type (const ctf_file_t *, ctf_id_t);
|
|
|
|
|
2019-06-19 19:14:16 +08:00
|
|
|
extern int ctf_dvd_insert (ctf_file_t *, ctf_dvdef_t *);
|
2019-04-24 05:24:13 +08:00
|
|
|
extern void ctf_dvd_delete (ctf_file_t *, ctf_dvdef_t *);
|
|
|
|
extern ctf_dvdef_t *ctf_dvd_lookup (const ctf_file_t *, const char *);
|
|
|
|
|
2019-07-14 04:31:26 +08:00
|
|
|
extern void ctf_add_type_mapping (ctf_file_t *src_fp, ctf_id_t src_type,
|
|
|
|
ctf_file_t *dst_fp, ctf_id_t dst_type);
|
|
|
|
extern ctf_id_t ctf_type_mapping (ctf_file_t *src_fp, ctf_id_t src_type,
|
|
|
|
ctf_file_t **dst_fp);
|
|
|
|
|
2019-04-24 18:03:37 +08:00
|
|
|
extern void ctf_decl_init (ctf_decl_t *);
|
|
|
|
extern void ctf_decl_fini (ctf_decl_t *);
|
|
|
|
extern void ctf_decl_push (ctf_decl_t *, ctf_file_t *, ctf_id_t);
|
|
|
|
|
|
|
|
_libctf_printflike_ (2, 3)
|
|
|
|
extern void ctf_decl_sprintf (ctf_decl_t *, const char *, ...);
|
|
|
|
extern char *ctf_decl_buf (ctf_decl_t *cd);
|
|
|
|
|
2019-04-24 04:45:30 +08:00
|
|
|
extern const char *ctf_strptr (ctf_file_t *, uint32_t);
|
libctf: support getting strings from the ELF strtab
The CTF file format has always supported "external strtabs", which
internally are strtab offsets with their MSB on: such refs
get their strings from the strtab passed in at CTF file open time:
this is usually intended to be the ELF strtab, and that's what this
implementation is meant to support, though in theory the external
strtab could come from anywhere.
This commit adds support for these external strings in the ctf-string.c
strtab tracking layer. It's quite easy: we just add a field csa_offset
to the atoms table that tracks all strings: this field tracks the offset
of the string in the ELF strtab (with its MSB already on, courtesy of a
new macro CTF_SET_STID), and adds a new function that sets the
csa_offset to the specified offset (plus MSB). Then we just need to
avoid writing out strings to the internal strtab if they have csa_offset
set, and note that the internal strtab is shorter than it might
otherwise be.
(We could in theory save a little more time here by eschewing sorting
such strings, since we never actually write the strings out anywhere,
but that would mean storing them separately and it's just not worth the
complexity cost until profiling shows it's worth doing.)
We also have to go through a bit of extra effort at variable-sorting
time. This was previously using direct references to the internal
strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab
is not yet ready to put in its usual field (in a ctf_file_t that hasn't
even been allocated yet at this stage): but now we're using the external
strtab, this will no longer do because it'll be looking things up in the
wrong strtab, with disastrous results. Instead, pass the new internal
strtab in to a new ctf_strraw_explicit function which is just like
ctf_strraw except you can specify a ne winternal strtab to use.
But even now that it is using a new internal strtab, this is not quite
enough: it can't look up strings in the external strtab because ld
hasn't written it out yet, and when it does will write it straight to
disk. Instead, when we write the internal strtab, note all the offset
-> string mappings that we have noted belong in the *external* strtab to
a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look
in there at ctf_strraw time if it is set. This uses minimal extra
memory (because only strings in the external strtab that we actually use
are stored, and even those come straight out of the atoms table), but
let both variable sorting and name interning when ctf_bufopen is next
called work fine. (This also means that we don't need to filter out
spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to
the caller, once we wrap ctf_bufopen so that we have a new internal
variant of ctf_bufopen etc that we can pass the synthetic external
strtab to. That error has been filtered out since the days of Solaris
libctf, which didn't try to handle the problem of getting external
strtabs right at construction time at all.)
v3: add the synthetic strtab and all associated machinery.
v5: fix tabdamage.
include/
* ctf.h (CTF_SET_STID): New.
libctf/
* ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field.
(ctf_file_t) <ctf_syn_ext_strtab>: Likewise.
(ctf_str_add_ref): Name the last arg.
(ctf_str_add_external) New.
(ctf_str_add_strraw_explicit): Likewise.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
* ctf-string.c (ctf_strraw_explicit): Split from...
(ctf_strraw): ... here, with new support for ctf_syn_ext_strtab.
(ctf_str_add_ref_internal): Return the atom, not the
string.
(ctf_str_add): Adjust accordingly.
(ctf_str_add_ref): Likewise. Move up in the file.
(ctf_str_add_external): New: update the csa_offset.
(ctf_str_count_strtab): Only account for strings with no csa_offset
in the internal strtab length.
(ctf_str_write_strtab): If the csa_offset is set, update the
string's refs without writing the string out, and update the
ctf_syn_ext_strtab. Make OOM handling less ugly.
* ctf-create.c (struct ctf_sort_var_arg_cb): New.
(ctf_update): Handle failure to populate the strtab. Pass in the
new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition.
Call ctf_simple_open_internal, not ctf_simple_open.
(ctf_sort_var): Call ctf_strraw_explicit rather than looking up
strings by hand.
* ctf-hash.c (ctf_hash_insert_type): Likewise (but using
ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless.
* ctf-open.c (init_types): No longer filter out ECTF_STRTAB.
(ctf_file_close): Destroy the ctf_syn_ext_strtab.
(ctf_simple_open): Rename to, and reimplement as a wrapper around...
(ctf_simple_open_internal): ... this new function, which calls
ctf_bufopen_internal.
(ctf_bufopen): Rename to, and reimplement as a wrapper around...
(ctf_bufopen_internal): ... this new function, which sets
ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
|
|
|
extern const char *ctf_strraw (ctf_file_t *, uint32_t);
|
|
|
|
extern const char *ctf_strraw_explicit (ctf_file_t *, uint32_t,
|
|
|
|
ctf_strs_t *);
|
libctf: deduplicate and sort the string table
ctf.h states:
> [...] the CTF string table does not contain any duplicated strings.
Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer. There is no global
view of the strings and no deduplication.
Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.
Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).
Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go. The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'. Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.
No change in externally-visible API or file format (other than the
reduction in pointless redundancy).
libctf/
* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
struct ctf_strs.
(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
(struct ctf_str_atom): New, disambiguated single string.
(struct ctf_str_atom_ref): New, points to some other location that
references this string's offset.
(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
Remove member ctf_dtvstrlen: we no longer track the total strlen
as we add strings.
(ctf_str_create_atoms): Declare new function in ctf-string.c.
(ctf_str_free_atoms): Likewise.
(ctf_str_add): Likewise.
(ctf_str_add_ref): Likewise.
(ctf_str_purge_refs): Likewise.
(ctf_str_write_strtab): Likewise.
(ctf_realloc): Declare new function in ctf-util.c.
* ctf-open.c (ctf_bufopen): Create the atoms table.
(ctf_file_close): Destroy it.
* ctf-create.c (ctf_update): Copy-and-free it on update. No longer
special-case the position of the parname string. Construct the
strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
rest of each buffer element is constructed, not via open-coding:
realloc the CTF buffer and append the strtab to it. No longer
maintain ctf_dtvstrlen. Sort the variable entry table later, after
strtab construction.
(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
after buffer element construction instead.
(ctf_copy_lmembers): Likewise.
(ctf_copy_emembers): Likewise.
(ctf_create): No longer maintain the ctf_dtvstrlen.
(ctf_dtd_delete): Likewise.
(ctf_dvd_delete): Likewise.
(ctf_add_generic): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable): Likewise.
(membadd): Likewise.
* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
if there are active ctf_str_num_refs.
(ctf_strraw): Move to ctf-string.c.
(ctf_strptr): Likewise.
* ctf-string.c: New file, strtab manipulation.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
|
|
|
extern int ctf_str_create_atoms (ctf_file_t *);
|
|
|
|
extern void ctf_str_free_atoms (ctf_file_t *);
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
extern uint32_t ctf_str_add (ctf_file_t *, const char *);
|
|
|
|
extern uint32_t ctf_str_add_ref (ctf_file_t *, const char *, uint32_t *ref);
|
|
|
|
extern int ctf_str_add_external (ctf_file_t *, const char *, uint32_t offset);
|
|
|
|
extern void ctf_str_remove_ref (ctf_file_t *, const char *, uint32_t *ref);
|
libctf: deduplicate and sort the string table
ctf.h states:
> [...] the CTF string table does not contain any duplicated strings.
Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer. There is no global
view of the strings and no deduplication.
Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.
Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).
Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go. The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'. Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.
No change in externally-visible API or file format (other than the
reduction in pointless redundancy).
libctf/
* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
struct ctf_strs.
(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
(struct ctf_str_atom): New, disambiguated single string.
(struct ctf_str_atom_ref): New, points to some other location that
references this string's offset.
(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
Remove member ctf_dtvstrlen: we no longer track the total strlen
as we add strings.
(ctf_str_create_atoms): Declare new function in ctf-string.c.
(ctf_str_free_atoms): Likewise.
(ctf_str_add): Likewise.
(ctf_str_add_ref): Likewise.
(ctf_str_purge_refs): Likewise.
(ctf_str_write_strtab): Likewise.
(ctf_realloc): Declare new function in ctf-util.c.
* ctf-open.c (ctf_bufopen): Create the atoms table.
(ctf_file_close): Destroy it.
* ctf-create.c (ctf_update): Copy-and-free it on update. No longer
special-case the position of the parname string. Construct the
strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
rest of each buffer element is constructed, not via open-coding:
realloc the CTF buffer and append the strtab to it. No longer
maintain ctf_dtvstrlen. Sort the variable entry table later, after
strtab construction.
(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
after buffer element construction instead.
(ctf_copy_lmembers): Likewise.
(ctf_copy_emembers): Likewise.
(ctf_create): No longer maintain the ctf_dtvstrlen.
(ctf_dtd_delete): Likewise.
(ctf_dvd_delete): Likewise.
(ctf_add_generic): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable): Likewise.
(membadd): Likewise.
* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
if there are active ctf_str_num_refs.
(ctf_strraw): Move to ctf-string.c.
(ctf_strptr): Likewise.
* ctf-string.c: New file, strtab manipulation.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
|
|
|
extern void ctf_str_rollback (ctf_file_t *, ctf_snapshot_id_t);
|
|
|
|
extern void ctf_str_purge_refs (ctf_file_t *);
|
|
|
|
extern ctf_strs_writable_t ctf_str_write_strtab (ctf_file_t *);
|
2019-04-24 04:45:30 +08:00
|
|
|
|
libctf, binutils: support CTF archives like objdump
objdump and readelf have one major CTF-related behavioural difference:
objdump can read .ctf sections that contain CTF archives and extract and
dump their members, while readelf cannot. Since the linker often emits
CTF archives, this means that readelf intermittently and (from the
user's perspective) randomly fails to read CTF in files that ld emits,
with a confusing error message wrongly claiming that the CTF content is
corrupt. This is purely because the archive-opening code in libctf was
needlessly tangled up with the BFD code, so readelf couldn't use it.
Here, we disentangle it, moving ctf_new_archive_internal from
ctf-open-bfd.c into ctf-archive.c and merging it with the helper
function in ctf-archive.c it was already using. We add a new public API
function ctf_arc_bufopen, that looks very like ctf_bufopen but returns
an archive given suitable section data rather than a ctf_file_t: the
archive is a ctf_archive_t, so it can be called on raw CTF dictionaries
(with no archive present) and will return a single-member synthetic
"archive".
There is a tiny lifetime tweak here: before now, the archive code could
assume that the symbol section in the ctf_archive_internal wrapper
structure was always owned by BFD if it was present and should always be
freed: now, the caller can pass one in via ctf_arc_bufopen, wihch has
the usual lifetime rules for such sections (caller frees): so we add an
extra field to track whether this is an internal call from ctf-open-bfd,
in which case we still free the symbol section.
include/
* ctf-api.h (ctf_arc_bufopen): New.
libctf/
* ctf-impl.h (ctf_new_archive_internal): Declare.
(ctf_arc_bufopen): Remove.
(ctf_archive_internal) <ctfi_free_symsect>: New.
* ctf-archive.c (ctf_arc_close): Use it.
(ctf_arc_bufopen): Fuse into...
(ctf_new_archive_internal): ... this, moved across from...
* ctf-open-bfd.c: ... here.
(ctf_bfdopen_ctfsect): Use ctf_arc_bufopen.
* libctf.ver: Add it.
binutils/
* readelf.c (dump_section_as_ctf): Support .ctf archives using
ctf_arc_bufopen. Automatically load the .ctf member of such
archives as the parent of all other members, unless specifically
overridden via --ctf-parent. Split out dumping code into...
(dump_ctf_archive_member): ... here, as in objdump, and call
it once per archive member.
(dump_ctf_indent_lines): Code style fix.
2019-12-13 20:01:12 +08:00
|
|
|
extern struct ctf_archive_internal *ctf_new_archive_internal
|
|
|
|
(int is_archive, struct ctf_archive *arc,
|
|
|
|
ctf_file_t *fp, const ctf_sect_t *symsect,
|
|
|
|
const ctf_sect_t *strsect, int *errp);
|
libctf: mmappable archives
If you need to store a large number of CTF containers somewhere, this
provides a dedicated facility for doing so: an mmappable archive format
like a very simple tar or ar without all the system-dependent format
horrors or need for heavy file copying, with built-in compression of
files above a particular size threshold.
libctf automatically mmap()s uncompressed elements of these archives, or
uncompresses them, as needed. (If the platform does not support mmap(),
copying into dynamically-allocated buffers is used.)
Archive iteration operations are partitioned into raw and non-raw
forms. Raw operations pass thhe raw archive contents to the callback:
non-raw forms open each member with ctf_bufopen() and pass the resulting
ctf_file_t to the iterator instead. This lets you manipulate the raw
data in the archive, or the contents interpreted as a CTF file, as
needed.
It is not yet known whether we will store CTF archives in a linked ELF
object in one of these (akin to debugdata) or whether they'll get one
section per TU plus one parent container for types shared between them.
(In the case of ELF objects with very large numbers of TUs, an archive
of all of them would seem preferable, so we might just use an archive,
and add lzma support so you can assume that .gnu_debugdata and .ctf are
compressed using the same algorithm if both are present.)
To make usage easier, the ctf_archive_t is not the on-disk
representation but an abstraction over both ctf_file_t's and archives of
many ctf_file_t's: users see both CTF archives and raw CTF files as
ctf_archive_t's upon opening, the only difference being that a raw CTF
file has only a single "archive member", named ".ctf" (the default if a
null pointer is passed in as the name). The next commit will make use
of this facility, in addition to providing the public interface to
actually open archives. (In the future, it should be possible to have
all CTF sections in an ELF file appear as an "archive" in the same
fashion.)
This machinery is also used to allow library-internal creators of
ctf_archive_t's (such as the next commit) to stash away an ELF string
and symbol table, so that all opens of members in a given archive will
use them. This lets CTF archives exploit the ELF string and symbol
table just like raw CTF files can.
(All this leads to somewhat confusing type naming. The ctf_archive_t is
a typedef for the opaque internal type, struct ctf_archive_internal: the
non-internal "struct ctf_archive" is the on-disk structure meant for
other libraries manipulating CTF files. It is probably clearest to use
the struct name for struct ctf_archive_internal inside the program, and
the typedef names outside.)
libctf/
* ctf-archive.c: New.
* ctf-impl.h (ctf_archive_internal): New type.
(ctf_arc_open_internal): New declaration.
(ctf_arc_bufopen): Likewise.
(ctf_arc_close_internal): Likewise.
include/
* ctf.h (CTFA_MAGIC): New.
(struct ctf_archive): New.
(struct ctf_archive_modent): Likewise.
* ctf-api.h (ctf_archive_member_f): New.
(ctf_archive_raw_member_f): Likewise.
(ctf_arc_write): Likewise.
(ctf_arc_close): Likewise.
(ctf_arc_open_by_name): Likewise.
(ctf_archive_iter): Likewise.
(ctf_archive_raw_iter): Likewise.
(ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
|
|
|
extern struct ctf_archive *ctf_arc_open_internal (const char *, int *);
|
|
|
|
extern void ctf_arc_close_internal (struct ctf_archive *);
|
2019-04-24 04:45:30 +08:00
|
|
|
extern void *ctf_set_open_errno (int *, int);
|
libctf: fix a number of build problems found on Solaris and NetBSD
- Use of nonportable <endian.h>
- Use of qsort_r
- Use of zlib without appropriate magic to pull in the binutils zlib
- Use of off64_t without checking (fixed by dropping the unused fields
that need off64_t entirely)
- signedness problems due to long being too short a type on 32-bit
platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be
used only for functions that return ctf_id_t
- One lingering use of bzero() and of <sys/errno.h>
All fixed, using code from gnulib where possible.
Relatedly, set cts_size in a couple of places it was missed
(string table and symbol table loading upon ctf_bfdopen()).
binutils/
* objdump.c (make_ctfsect): Drop cts_type, cts_flags, and
cts_offset.
* readelf.c (shdr_to_ctf_sect): Likewise.
include/
* ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset.
(ctf_id_t): This is now an unsigned type.
(CTF_ERR): Cast it to ctf_id_t. Note that it should only be used
for ctf_id_t-returning functions.
libctf/
* Makefile.am (ZLIB): New.
(ZLIBINC): Likewise.
(AM_CFLAGS): Use them.
(libctf_a_LIBADD): New, for LIBOBJS.
* configure.ac: Check for zlib, endian.h, and qsort_r.
* ctf-endian.h: New, providing htole64 and le64toh.
* swap.h: Code style fixes.
(bswap_identity_64): New.
* qsort_r.c: New, from gnulib (with one added #include).
* ctf-decls.h: New, providing a conditional qsort_r declaration,
and unconditional definitions of MIN and MAX.
* ctf-impl.h: Use it. Do not use <sys/errno.h>.
(ctf_set_errno): Now returns unsigned long.
* ctf-util.c (ctf_set_errno): Adjust here too.
* ctf-archive.c: Use ctf-endian.h.
(ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type,
cts_flags and cts_offset.
(ctf_arc_write): Drop debugging dependent on the size of off_t.
* ctf-create.c: Provide a definition of roundup if not defined.
(ctf_create): Drop cts_type, cts_flags and cts_offset.
(ctf_add_reftype): Do not check if type IDs are below zero.
(ctf_add_slice): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_member_offset): Cast error-returning ssize_t's to size_t
when known error-free. Drop CTF_ERR usage for functions returning
int.
(ctf_add_member_encoded): Drop CTF_ERR usage for functions returning
int.
(ctf_add_variable): Likewise.
(enumcmp): Likewise.
(enumadd): Likewise.
(membcmp): Likewise.
(ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t
when known error-free.
* ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions
returning int: use CTF_ERR for functions returning ctf_type_id.
(ctf_dump_label): Likewise.
(ctf_dump_objts): Likewise.
* ctf-labels.c (ctf_label_topmost): Likewise.
(ctf_label_iter): Likewise.
(ctf_label_info): Likewise.
* ctf-lookup.c (ctf_func_args): Likewise.
* ctf-open.c (upgrade_types): Cast to size_t where appropriate.
(ctf_bufopen): Likewise. Use zlib types as needed.
* ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions
returning int.
(ctf_enum_iter): Likewise.
(ctf_type_size): Likewise.
(ctf_type_align): Likewise. Cast to size_t where appropriate.
(ctf_type_kind_unsliced): Likewise.
(ctf_type_kind): Likewise.
(ctf_type_encoding): Likewise.
(ctf_member_info): Likewise.
(ctf_array_info): Likewise.
(ctf_enum_value): Likewise.
(ctf_type_rvisit): Likewise.
* ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and
cts_offset.
(ctf_simple_open): Likewise.
(ctf_bfdopen_ctfsect): Likewise. Set cts_size properly.
* Makefile.in: Regenerate.
* aclocal.m4: Likewise.
* config.h: Likewise.
* configure: Likewise.
2019-05-31 17:10:51 +08:00
|
|
|
extern unsigned long ctf_set_errno (ctf_file_t *, int);
|
2019-04-24 04:45:30 +08:00
|
|
|
|
libctf: support getting strings from the ELF strtab
The CTF file format has always supported "external strtabs", which
internally are strtab offsets with their MSB on: such refs
get their strings from the strtab passed in at CTF file open time:
this is usually intended to be the ELF strtab, and that's what this
implementation is meant to support, though in theory the external
strtab could come from anywhere.
This commit adds support for these external strings in the ctf-string.c
strtab tracking layer. It's quite easy: we just add a field csa_offset
to the atoms table that tracks all strings: this field tracks the offset
of the string in the ELF strtab (with its MSB already on, courtesy of a
new macro CTF_SET_STID), and adds a new function that sets the
csa_offset to the specified offset (plus MSB). Then we just need to
avoid writing out strings to the internal strtab if they have csa_offset
set, and note that the internal strtab is shorter than it might
otherwise be.
(We could in theory save a little more time here by eschewing sorting
such strings, since we never actually write the strings out anywhere,
but that would mean storing them separately and it's just not worth the
complexity cost until profiling shows it's worth doing.)
We also have to go through a bit of extra effort at variable-sorting
time. This was previously using direct references to the internal
strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab
is not yet ready to put in its usual field (in a ctf_file_t that hasn't
even been allocated yet at this stage): but now we're using the external
strtab, this will no longer do because it'll be looking things up in the
wrong strtab, with disastrous results. Instead, pass the new internal
strtab in to a new ctf_strraw_explicit function which is just like
ctf_strraw except you can specify a ne winternal strtab to use.
But even now that it is using a new internal strtab, this is not quite
enough: it can't look up strings in the external strtab because ld
hasn't written it out yet, and when it does will write it straight to
disk. Instead, when we write the internal strtab, note all the offset
-> string mappings that we have noted belong in the *external* strtab to
a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look
in there at ctf_strraw time if it is set. This uses minimal extra
memory (because only strings in the external strtab that we actually use
are stored, and even those come straight out of the atoms table), but
let both variable sorting and name interning when ctf_bufopen is next
called work fine. (This also means that we don't need to filter out
spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to
the caller, once we wrap ctf_bufopen so that we have a new internal
variant of ctf_bufopen etc that we can pass the synthetic external
strtab to. That error has been filtered out since the days of Solaris
libctf, which didn't try to handle the problem of getting external
strtabs right at construction time at all.)
v3: add the synthetic strtab and all associated machinery.
v5: fix tabdamage.
include/
* ctf.h (CTF_SET_STID): New.
libctf/
* ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field.
(ctf_file_t) <ctf_syn_ext_strtab>: Likewise.
(ctf_str_add_ref): Name the last arg.
(ctf_str_add_external) New.
(ctf_str_add_strraw_explicit): Likewise.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
* ctf-string.c (ctf_strraw_explicit): Split from...
(ctf_strraw): ... here, with new support for ctf_syn_ext_strtab.
(ctf_str_add_ref_internal): Return the atom, not the
string.
(ctf_str_add): Adjust accordingly.
(ctf_str_add_ref): Likewise. Move up in the file.
(ctf_str_add_external): New: update the csa_offset.
(ctf_str_count_strtab): Only account for strings with no csa_offset
in the internal strtab length.
(ctf_str_write_strtab): If the csa_offset is set, update the
string's refs without writing the string out, and update the
ctf_syn_ext_strtab. Make OOM handling less ugly.
* ctf-create.c (struct ctf_sort_var_arg_cb): New.
(ctf_update): Handle failure to populate the strtab. Pass in the
new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition.
Call ctf_simple_open_internal, not ctf_simple_open.
(ctf_sort_var): Call ctf_strraw_explicit rather than looking up
strings by hand.
* ctf-hash.c (ctf_hash_insert_type): Likewise (but using
ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless.
* ctf-open.c (init_types): No longer filter out ECTF_STRTAB.
(ctf_file_close): Destroy the ctf_syn_ext_strtab.
(ctf_simple_open): Rename to, and reimplement as a wrapper around...
(ctf_simple_open_internal): ... this new function, which calls
ctf_bufopen_internal.
(ctf_bufopen): Rename to, and reimplement as a wrapper around...
(ctf_bufopen_internal): ... this new function, which sets
ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
|
|
|
extern ctf_file_t *ctf_simple_open_internal (const char *, size_t, const char *,
|
|
|
|
size_t, size_t,
|
|
|
|
const char *, size_t,
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
ctf_dynhash_t *, int, int *);
|
libctf: support getting strings from the ELF strtab
The CTF file format has always supported "external strtabs", which
internally are strtab offsets with their MSB on: such refs
get their strings from the strtab passed in at CTF file open time:
this is usually intended to be the ELF strtab, and that's what this
implementation is meant to support, though in theory the external
strtab could come from anywhere.
This commit adds support for these external strings in the ctf-string.c
strtab tracking layer. It's quite easy: we just add a field csa_offset
to the atoms table that tracks all strings: this field tracks the offset
of the string in the ELF strtab (with its MSB already on, courtesy of a
new macro CTF_SET_STID), and adds a new function that sets the
csa_offset to the specified offset (plus MSB). Then we just need to
avoid writing out strings to the internal strtab if they have csa_offset
set, and note that the internal strtab is shorter than it might
otherwise be.
(We could in theory save a little more time here by eschewing sorting
such strings, since we never actually write the strings out anywhere,
but that would mean storing them separately and it's just not worth the
complexity cost until profiling shows it's worth doing.)
We also have to go through a bit of extra effort at variable-sorting
time. This was previously using direct references to the internal
strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab
is not yet ready to put in its usual field (in a ctf_file_t that hasn't
even been allocated yet at this stage): but now we're using the external
strtab, this will no longer do because it'll be looking things up in the
wrong strtab, with disastrous results. Instead, pass the new internal
strtab in to a new ctf_strraw_explicit function which is just like
ctf_strraw except you can specify a ne winternal strtab to use.
But even now that it is using a new internal strtab, this is not quite
enough: it can't look up strings in the external strtab because ld
hasn't written it out yet, and when it does will write it straight to
disk. Instead, when we write the internal strtab, note all the offset
-> string mappings that we have noted belong in the *external* strtab to
a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look
in there at ctf_strraw time if it is set. This uses minimal extra
memory (because only strings in the external strtab that we actually use
are stored, and even those come straight out of the atoms table), but
let both variable sorting and name interning when ctf_bufopen is next
called work fine. (This also means that we don't need to filter out
spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to
the caller, once we wrap ctf_bufopen so that we have a new internal
variant of ctf_bufopen etc that we can pass the synthetic external
strtab to. That error has been filtered out since the days of Solaris
libctf, which didn't try to handle the problem of getting external
strtabs right at construction time at all.)
v3: add the synthetic strtab and all associated machinery.
v5: fix tabdamage.
include/
* ctf.h (CTF_SET_STID): New.
libctf/
* ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field.
(ctf_file_t) <ctf_syn_ext_strtab>: Likewise.
(ctf_str_add_ref): Name the last arg.
(ctf_str_add_external) New.
(ctf_str_add_strraw_explicit): Likewise.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
* ctf-string.c (ctf_strraw_explicit): Split from...
(ctf_strraw): ... here, with new support for ctf_syn_ext_strtab.
(ctf_str_add_ref_internal): Return the atom, not the
string.
(ctf_str_add): Adjust accordingly.
(ctf_str_add_ref): Likewise. Move up in the file.
(ctf_str_add_external): New: update the csa_offset.
(ctf_str_count_strtab): Only account for strings with no csa_offset
in the internal strtab length.
(ctf_str_write_strtab): If the csa_offset is set, update the
string's refs without writing the string out, and update the
ctf_syn_ext_strtab. Make OOM handling less ugly.
* ctf-create.c (struct ctf_sort_var_arg_cb): New.
(ctf_update): Handle failure to populate the strtab. Pass in the
new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition.
Call ctf_simple_open_internal, not ctf_simple_open.
(ctf_sort_var): Call ctf_strraw_explicit rather than looking up
strings by hand.
* ctf-hash.c (ctf_hash_insert_type): Likewise (but using
ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless.
* ctf-open.c (init_types): No longer filter out ECTF_STRTAB.
(ctf_file_close): Destroy the ctf_syn_ext_strtab.
(ctf_simple_open): Rename to, and reimplement as a wrapper around...
(ctf_simple_open_internal): ... this new function, which calls
ctf_bufopen_internal.
(ctf_bufopen): Rename to, and reimplement as a wrapper around...
(ctf_bufopen_internal): ... this new function, which sets
ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
|
|
|
extern ctf_file_t *ctf_bufopen_internal (const ctf_sect_t *, const ctf_sect_t *,
|
|
|
|
const ctf_sect_t *, ctf_dynhash_t *,
|
libctf: avoid the need to ever use ctf_update
The method of operation of libctf when the dictionary is writable has
before now been that types that are added land in the dynamic type
section, which is a linked list and hash of IDs -> dynamic type
definitions (and, recently a hash of names): the DTDs are a bit of CTF
representing the ctf_type_t and ad hoc C structures representing the
vlen. Historically, libctf was unable to do anything with these types,
not even look them up by ID, let alone by name: if you wanted to do that
say if you were adding a type that depended on one you just added) you
called ctf_update, which serializes all the DTDs into a CTF file and
reopens it, copying its guts over the fp it's called with. The
ctf_updated types are then frozen in amber and unchangeable: all lookups
will return the types in the static portion in preference to the dynamic
portion, and we will refuse to re-add things that already exist in the
static portion (and, of late, in the dynamic portion too). The libctf
machinery remembers the boundary between static and dynamic types and
looks in the right portion for each type. Lots of things still don't
quite work with dynamic types (e.g. getting their size), but enough
works to do a bunch of additions and then a ctf_update, most of the
time.
Except it doesn't, because ctf_add_type finds it necessary to walk the
full dynamic type definition list looking for types with matching names,
so it gets slower and slower with every type you add: fixing this
requires calling ctf_update periodically for no other reason than to
avoid massively slowing things down.
This is all clunky and very slow but kind of works, until you consider
that it is in fact possible and indeed necessary to modify one sort of
type after it has been added: forwards. These are necessarily promoted
to structs, unions or enums, and when they do so *their type ID does not
change*. So all of a sudden we are changing types that already exist in
the static portion. ctf_update gets massively confused by this and
allocates space enough for the forward (with no members), but then emits
the new dynamic type (with all the members) into it. You get an
assertion failure after that, if you're lucky, or a coredump.
So this commit rejigs things a bit and arranges to exclusively use the
dynamic type definitions in writable dictionaries, and the static type
definitions in readable dictionaries: we don't at any time have a mixture
of static and dynamic types, and you don't need to call ctf_update to
make things "appear". The ctf_dtbyname hash I introduced a few months
ago, which maps things like "struct foo" to DTDs, is removed, replaced
instead by a change of type of the four dictionaries which track names.
Rather than just being (unresizable) ctf_hash_t's populated only at
ctf_bufopen time, they are now a ctf_names_t structure, which is a pair
of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used
in readonly dictionaries, and the ctf_dynhash_t being used in writable
ones. The decision as to which to use is centralized in the new
functions ctf_lookup_by_rawname (which takes a type kind) and
ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.)
This change lets us switch from using static to dynamic name hashes on
the fly across the entirety of libctf without complexifying anything: in
fact, because we now centralize the knowledge about how to map from type
kind to name hash, it actually simplifies things and lets us throw out
quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced
by the dynamic half of the name tables), through to ctf_dtnextid (now
that a dictionary's static portion is never referenced if the dictionary
is writable, we can just use ctf_typemax to indicate the maximum type:
dynamic or non-dynamic does not matter, and we no longer need to track
the boundary between the types). You can now ctf_rollback() as far as
you like, even past a ctf_update or for that matter a full writeout; all
the iteration functions work just as well on writable as on read-only
dictionaries; ctf_add_type no longer needs expensive duplicated code to
run over the dynamic types hunting for ones it might be interested in;
and the linker no longer needs a hack to call ctf_update so that calling
ctf_add_type is not impossibly expensive.
There is still a bit more complexity: some new code paths in ctf-types.c
need to know how to extract information from dynamic types. This
complexity will go away again in a few months when libctf acquires a
proper intermediate representation.
You can still call ctf_update if you like (it's public API, after all),
but its only effect now is to set the point to which ctf_discard rolls
back.
Obviously *something* still needs to serialize the CTF file before
writeout, and this job is done by ctf_serialize, which does everything
ctf_update used to except set the counter used by ctf_discard. It is
automatically called by the various functions that do CTF writeout:
nobody else ever needs to call it.
With this in place, forwards that are promoted to non-forwards no longer
crash the link, even if it happens tens of thousands of types later.
v5: fix tabdamage.
libctf/
* ctf-impl.h (ctf_names_t): New.
(ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t.
(ctf_file_t) <ctf_structs>: Likewise.
<ctf_unions>: Likewise.
<ctf_enums>: Likewise.
<ctf_names>: Likewise.
<ctf_lookups>: Improve comment.
<ctf_ptrtab_len>: New.
<ctf_prov_strtab>: New.
<ctf_str_prov_offset>: New.
<ctf_dtbyname>: Remove, redundant to the names hashes.
<ctf_dtnextid>: Remove, redundant to ctf_typemax.
(ctf_dtdef_t) <dtd_name>: Remove.
<dtd_data>: Note that the ctt_name is now populated.
(ctf_str_atom_t) <csa_offset>: This is now the strtab
offset for internal strings too.
<csa_external_offset>: New, the external strtab offset.
(CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case.
(ctf_name_table): New declaration.
(ctf_lookup_by_rawname): Likewise.
(ctf_lookup_by_rawhash): Likewise.
(ctf_set_ctl_hashes): Likewise.
(ctf_serialize): Likewise.
(ctf_dtd_insert): Adjust.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
(ctf_list_empty_p): Likewise.
(ctf_str_remove_ref): Likewise.
(ctf_str_add): Returns uint32_t now.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Now returns a boolean (int).
* ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab
for strings in the appropriate range.
(ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM
when adding the null string to the new strtab.
(ctf_str_free_atoms): Destroy the ctf_prov_strtab.
(ctf_str_add_ref_internal): Add make_provisional argument. If
make_provisional, populate the offset and fill in the
ctf_prov_strtab accordingly.
(ctf_str_add): Return the offset, not the string.
(ctf_str_add_ref): Likewise.
(ctf_str_add_external): Return a success integer.
(ctf_str_remove_ref): New, remove a single ref.
(ctf_str_count_strtab): Do not count the initial null string's
length or the existence or length of any unreferenced internal
atoms.
(ctf_str_populate_sorttab): Skip atoms with no refs.
(ctf_str_write_strtab): Populate the nullstr earlier. Add one
to the cts_len for the null string, since it is no longer done
in ctf_str_count_strtab. Adjust for csa_external_offset rename.
Populate the csa_offset for both internal and external cases.
Flush the ctf_prov_strtab afterwards, and reset the
ctf_str_prov_offset.
* ctf-create.c (ctf_grow_ptrtab): New.
(ctf_create): Call it. Initialize new fields rather than old
ones. Tell ctf_bufopen_internal that this is a writable dictionary.
Set the ctl hashes and data model.
(ctf_update): Rename to...
(ctf_serialize): ... this. Leave a compatibility function behind.
Tell ctf_simple_open_internal that this is a writable dictionary.
Pass the new fields along from the old dictionary. Drop
ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name.
Do not zero out the DTD's ctt_name.
(ctf_prefixed_name): Rename to...
(ctf_name_table): ... this. No longer return a prefixed name: return
the applicable name table instead.
(ctf_dtd_insert): Use it, and use the right name table. Pass in the
kind we're adding. Migrate away from dtd_name.
(ctf_dtd_delete): Adjust similarly. Remove the ref to the
deleted ctt_name.
(ctf_dtd_lookup_type_by_name): Remove.
(ctf_dynamic_type): Always return NULL on read-only dictionaries.
No longer check ctf_dtnextid: check ctf_typemax instead.
(ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead.
(ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use
ctf_name_table and the right name table, and migrate away from
dtd_name as in ctf_dtd_delete.
(ctf_add_generic): Pass in the kind explicitly and pass it to
ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away
from dtd_name to using ctf_str_add_ref to populate the ctt_name.
Grow the ptrtab if needed.
(ctf_add_encoded): Pass in the kind.
(ctf_add_slice): Likewise.
(ctf_add_array): Likewise.
(ctf_add_function): Likewise.
(ctf_add_typedef): Likewise.
(ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking
ctt_name rather than dtd_name.
(ctf_add_struct_sized): Pass in the kind. Use
ctf_lookup_by_rawname, not ctf_hash_lookup_type /
ctf_dtd_lookup_type_by_name.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enum_encoded): Likewise.
(ctf_add_forward): Likewise.
(ctf_add_type): Likewise.
(ctf_compress_write): Call ctf_serialize: adjust for ctf_size not
being initialized until after the call.
(ctf_write_mem): Likewise.
(ctf_write): Likewise.
* ctf-archive.c (arc_write_one_ctf): Likewise.
* ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not
ctf_hash_lookup_type.
(ctf_lookup_by_id): No longer check the readonly types if the
dictionary is writable.
* ctf-open.c (init_types): Assert that this dictionary is not
writable. Adjust to use the new name hashes, ctf_name_table,
and ctf_ptrtab_len. GNU style fix for the final ptrtab scan.
(ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR
if set. Drop out early when dictionary is writable. Split the
ctf_lookups initialization into...
(ctf_set_cth_hashes): ... this new function.
(ctf_simple_open_internal): Adjust. New 'writable' parameter.
(ctf_simple_open): Adjust accordingly.
(ctf_bufopen): Likewise.
(ctf_file_close): Destroy the appropriate name hashes. No longer
destroy ctf_dtbyname, which is gone.
(ctf_getdatasect): Remove spurious "extern".
* ctf-types.c (ctf_lookup_by_rawname): New, look up types in the
specified name table, given a kind.
(ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *.
(ctf_member_iter): Add support for iterating over the
dynamic type list.
(ctf_enum_iter): Likewise.
(ctf_variable_iter): Likewise.
(ctf_type_rvisit): Likewise.
(ctf_member_info): Add support for types in the dynamic type list.
(ctf_enum_name): Likewise.
(ctf_enum_value): Likewise.
(ctf_func_type_info): Likewise.
(ctf_func_type_args): Likewise.
* ctf-link.c (ctf_accumulate_archive_names): No longer call
ctf_update.
(ctf_link_write): Likewise.
(ctf_link_intern_extern_string): Adjust for new
ctf_str_add_external return value.
(ctf_link_add_strtab): Likewise.
* ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
|
|
|
int, int *);
|
|
|
|
extern int ctf_serialize (ctf_file_t *);
|
libctf: support getting strings from the ELF strtab
The CTF file format has always supported "external strtabs", which
internally are strtab offsets with their MSB on: such refs
get their strings from the strtab passed in at CTF file open time:
this is usually intended to be the ELF strtab, and that's what this
implementation is meant to support, though in theory the external
strtab could come from anywhere.
This commit adds support for these external strings in the ctf-string.c
strtab tracking layer. It's quite easy: we just add a field csa_offset
to the atoms table that tracks all strings: this field tracks the offset
of the string in the ELF strtab (with its MSB already on, courtesy of a
new macro CTF_SET_STID), and adds a new function that sets the
csa_offset to the specified offset (plus MSB). Then we just need to
avoid writing out strings to the internal strtab if they have csa_offset
set, and note that the internal strtab is shorter than it might
otherwise be.
(We could in theory save a little more time here by eschewing sorting
such strings, since we never actually write the strings out anywhere,
but that would mean storing them separately and it's just not worth the
complexity cost until profiling shows it's worth doing.)
We also have to go through a bit of extra effort at variable-sorting
time. This was previously using direct references to the internal
strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab
is not yet ready to put in its usual field (in a ctf_file_t that hasn't
even been allocated yet at this stage): but now we're using the external
strtab, this will no longer do because it'll be looking things up in the
wrong strtab, with disastrous results. Instead, pass the new internal
strtab in to a new ctf_strraw_explicit function which is just like
ctf_strraw except you can specify a ne winternal strtab to use.
But even now that it is using a new internal strtab, this is not quite
enough: it can't look up strings in the external strtab because ld
hasn't written it out yet, and when it does will write it straight to
disk. Instead, when we write the internal strtab, note all the offset
-> string mappings that we have noted belong in the *external* strtab to
a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look
in there at ctf_strraw time if it is set. This uses minimal extra
memory (because only strings in the external strtab that we actually use
are stored, and even those come straight out of the atoms table), but
let both variable sorting and name interning when ctf_bufopen is next
called work fine. (This also means that we don't need to filter out
spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to
the caller, once we wrap ctf_bufopen so that we have a new internal
variant of ctf_bufopen etc that we can pass the synthetic external
strtab to. That error has been filtered out since the days of Solaris
libctf, which didn't try to handle the problem of getting external
strtabs right at construction time at all.)
v3: add the synthetic strtab and all associated machinery.
v5: fix tabdamage.
include/
* ctf.h (CTF_SET_STID): New.
libctf/
* ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field.
(ctf_file_t) <ctf_syn_ext_strtab>: Likewise.
(ctf_str_add_ref): Name the last arg.
(ctf_str_add_external) New.
(ctf_str_add_strraw_explicit): Likewise.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen_internal): Likewise.
* ctf-string.c (ctf_strraw_explicit): Split from...
(ctf_strraw): ... here, with new support for ctf_syn_ext_strtab.
(ctf_str_add_ref_internal): Return the atom, not the
string.
(ctf_str_add): Adjust accordingly.
(ctf_str_add_ref): Likewise. Move up in the file.
(ctf_str_add_external): New: update the csa_offset.
(ctf_str_count_strtab): Only account for strings with no csa_offset
in the internal strtab length.
(ctf_str_write_strtab): If the csa_offset is set, update the
string's refs without writing the string out, and update the
ctf_syn_ext_strtab. Make OOM handling less ugly.
* ctf-create.c (struct ctf_sort_var_arg_cb): New.
(ctf_update): Handle failure to populate the strtab. Pass in the
new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition.
Call ctf_simple_open_internal, not ctf_simple_open.
(ctf_sort_var): Call ctf_strraw_explicit rather than looking up
strings by hand.
* ctf-hash.c (ctf_hash_insert_type): Likewise (but using
ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless.
* ctf-open.c (init_types): No longer filter out ECTF_STRTAB.
(ctf_file_close): Destroy the ctf_syn_ext_strtab.
(ctf_simple_open): Rename to, and reimplement as a wrapper around...
(ctf_simple_open_internal): ... this new function, which calls
ctf_bufopen_internal.
(ctf_bufopen): Rename to, and reimplement as a wrapper around...
(ctf_bufopen_internal): ... this new function, which sets
ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
|
|
|
|
2019-04-24 01:55:27 +08:00
|
|
|
_libctf_malloc_
|
|
|
|
extern void *ctf_mmap (size_t length, size_t offset, int fd);
|
|
|
|
extern void ctf_munmap (void *, size_t);
|
|
|
|
extern ssize_t ctf_pread (int fd, void *buf, ssize_t count, off_t offset);
|
|
|
|
|
libctf: deduplicate and sort the string table
ctf.h states:
> [...] the CTF string table does not contain any duplicated strings.
Unfortunately this is entirely untrue: libctf has before now made no
attempt whatsoever to deduplicate the string table. It computes the
string table's length on the fly as it adds new strings to the dynamic
CTF file, and ctf_update() just writes each string to the table and
notes the current write position as it traverses the dynamic CTF file's
data structures and builds the final CTF buffer. There is no global
view of the strings and no deduplication.
Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding
a new dynhash table ctf_str_atoms that maps unique strings to a list
of references to those strings: a reference is a simple uint32_t * to
some value somewhere in the under-construction CTF buffer that needs
updating to note the string offset when the strtab is laid out.
Adding a string is now a simple matter of calling ctf_str_add_ref(),
which adds a new atom to the atoms table, if one doesn't already exist,
and adding the location of the reference to this atom to the refs list
attached to the atom: this works reliably as long as one takes care to
only call ctf_str_add_ref() once the final location of the offset is
known (so you can't call it on a temporary structure and then memcpy()
that structure into place in the CTF buffer, because the ref will still
point to the old location: ctf_update() changes accordingly).
Generating the CTF string table is a matter of calling
ctf_str_write_strtab(), which counts the length and number of elements
in the atoms table using the ctf_dynhash_iter() function we just added,
populating an array of pointers into the atoms table and sorting it into
order (to help compressors), then traversing this table and emitting it,
updating the refs to each atom as we go. The only complexity here is
arranging to keep the null string at offset zero, since a lot of code in
libctf depends on being able to leave strtab references at 0 to indicate
'no name'. Once the table is constructed and the refs updated, we know
how long it is, so we can realloc() the partial CTF buffer we allocated
earlier and can copy the table on to the end of it (and purge the refs
because they're not needed any more and have been invalidated by the
realloc() call in any case).
The net effect of all this is a reduction in uncompressed strtab sizes
of about 30% (perhaps a quarter to a half of all strings across the
Linux kernel are eliminated as duplicates). Of course, duplicated
strings are highly redundant, so the space saving after compression is
only about 20%: when the other non-strtab sections are factored in, CTF
sizes shrink by about 10%.
No change in externally-visible API or file format (other than the
reduction in pointless redundancy).
libctf/
* ctf-impl.h: (struct ctf_strs_writable): New, non-const version of
struct ctf_strs.
(struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated.
(struct ctf_str_atom): New, disambiguated single string.
(struct ctf_str_atom_ref): New, points to some other location that
references this string's offset.
(struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs.
Remove member ctf_dtvstrlen: we no longer track the total strlen
as we add strings.
(ctf_str_create_atoms): Declare new function in ctf-string.c.
(ctf_str_free_atoms): Likewise.
(ctf_str_add): Likewise.
(ctf_str_add_ref): Likewise.
(ctf_str_purge_refs): Likewise.
(ctf_str_write_strtab): Likewise.
(ctf_realloc): Declare new function in ctf-util.c.
* ctf-open.c (ctf_bufopen): Create the atoms table.
(ctf_file_close): Destroy it.
* ctf-create.c (ctf_update): Copy-and-free it on update. No longer
special-case the position of the parname string. Construct the
strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the
rest of each buffer element is constructed, not via open-coding:
realloc the CTF buffer and append the strtab to it. No longer
maintain ctf_dtvstrlen. Sort the variable entry table later, after
strtab construction.
(ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members.
(ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref
after buffer element construction instead.
(ctf_copy_lmembers): Likewise.
(ctf_copy_emembers): Likewise.
(ctf_create): No longer maintain the ctf_dtvstrlen.
(ctf_dtd_delete): Likewise.
(ctf_dvd_delete): Likewise.
(ctf_add_generic): Likewise.
(ctf_add_enumerator): Likewise.
(ctf_add_member_offset): Likewise.
(ctf_add_variable): Likewise.
(membadd): Likewise.
* ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts
if there are active ctf_str_num_refs.
(ctf_strraw): Move to ctf-string.c.
(ctf_strptr): Likewise.
* ctf-string.c: New file, strtab manipulation.
* Makefile.am (libctf_a_SOURCES): Add it.
* Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
|
|
|
extern void *ctf_realloc (ctf_file_t *, void *, size_t);
|
2019-04-24 04:45:30 +08:00
|
|
|
extern char *ctf_str_append (char *, const char *);
|
2019-09-17 13:57:00 +08:00
|
|
|
extern char *ctf_str_append_noerr (char *, const char *);
|
2019-04-24 04:45:30 +08:00
|
|
|
extern const char *ctf_strerror (int);
|
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
extern ctf_id_t ctf_type_resolve_unsliced (ctf_file_t *, ctf_id_t);
|
|
|
|
extern int ctf_type_kind_unsliced (ctf_file_t *, ctf_id_t);
|
|
|
|
|
2019-04-24 01:55:27 +08:00
|
|
|
_libctf_printflike_ (1, 2)
|
|
|
|
extern void ctf_dprintf (const char *, ...);
|
|
|
|
extern void libctf_init_debug (void);
|
|
|
|
|
2019-04-24 04:45:30 +08:00
|
|
|
extern Elf64_Sym *ctf_sym_to_elf64 (const Elf32_Sym *src, Elf64_Sym *dst);
|
2019-04-24 18:15:33 +08:00
|
|
|
extern const char *ctf_lookup_symbol_name (ctf_file_t *fp, unsigned long symidx);
|
2019-04-24 04:45:30 +08:00
|
|
|
|
2019-04-24 05:24:13 +08:00
|
|
|
/* Variables, all underscore-prepended. */
|
|
|
|
|
libctf: ELF file opening via BFD
These functions let you open an ELF file with a customarily-named CTF
section in it, automatically opening the CTF file or archive and
associating the symbol and string tables in the ELF file with the CTF
container, so that you can look up the types of symbols in the ELF file
via ctf_lookup_by_symbol(), and so that strings can be shared between
the ELF file and CTF container, to save space.
It uses BFD machinery to do so. This has now been lightly tested and
seems to work. In particular, if you already have a bfd you can pass
it in to ctf_bfdopen(), and if you want a bfd made for you you can
call ctf_open() or ctf_fdopen(), optionally specifying a target (or
try once without a target and then again with one if you get
ECTF_BFD_AMBIGUOUS back).
We use a forward declaration for the struct bfd in ctf-api.h, so that
ctf-api.h users are not required to pull in <bfd.h>. (This is mostly
for the sake of readelf.)
libctf/
* ctf-open-bfd.c: New file.
* ctf-open.c (ctf_close): New.
* ctf-impl.h: Include bfd.h.
(ctf_file): New members ctf_data_mmapped, ctf_data_mmapped_len.
(ctf_archive_internal): New members ctfi_abfd, ctfi_data,
ctfi_bfd_close.
(ctf_bfdopen_ctfsect): New declaration.
(_CTF_SECTION): likewise.
include/
* ctf-api.h (struct bfd): New forward.
(ctf_fdopen): New.
(ctf_bfdopen): Likewise.
(ctf_open): Likewise.
(ctf_arc_open): Likewise.
2019-04-24 17:46:39 +08:00
|
|
|
extern const char _CTF_SECTION[]; /* name of CTF ELF section */
|
2019-04-24 05:24:13 +08:00
|
|
|
extern const char _CTF_NULLSTR[]; /* empty string */
|
|
|
|
|
2019-04-24 18:26:42 +08:00
|
|
|
extern int _libctf_version; /* library client version */
|
2019-04-24 01:55:27 +08:00
|
|
|
extern int _libctf_debug; /* debugging messages enabled */
|
|
|
|
|
|
|
|
#ifdef __cplusplus
|
|
|
|
}
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#endif /* _CTF_IMPL_H */
|