binutils-gdb/libctf/ctf-impl.h

744 lines
30 KiB
C
Raw Normal View History

/* Implementation header.
Copyright (C) 2019-2020 Free Software Foundation, Inc.
This file is part of libctf.
libctf is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 3, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; see the file COPYING. If not see
<http://www.gnu.org/licenses/>. */
#ifndef _CTF_IMPL_H
#define _CTF_IMPL_H
#include "config.h"
libctf: fix a number of build problems found on Solaris and NetBSD - Use of nonportable <endian.h> - Use of qsort_r - Use of zlib without appropriate magic to pull in the binutils zlib - Use of off64_t without checking (fixed by dropping the unused fields that need off64_t entirely) - signedness problems due to long being too short a type on 32-bit platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be used only for functions that return ctf_id_t - One lingering use of bzero() and of <sys/errno.h> All fixed, using code from gnulib where possible. Relatedly, set cts_size in a couple of places it was missed (string table and symbol table loading upon ctf_bfdopen()). binutils/ * objdump.c (make_ctfsect): Drop cts_type, cts_flags, and cts_offset. * readelf.c (shdr_to_ctf_sect): Likewise. include/ * ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset. (ctf_id_t): This is now an unsigned type. (CTF_ERR): Cast it to ctf_id_t. Note that it should only be used for ctf_id_t-returning functions. libctf/ * Makefile.am (ZLIB): New. (ZLIBINC): Likewise. (AM_CFLAGS): Use them. (libctf_a_LIBADD): New, for LIBOBJS. * configure.ac: Check for zlib, endian.h, and qsort_r. * ctf-endian.h: New, providing htole64 and le64toh. * swap.h: Code style fixes. (bswap_identity_64): New. * qsort_r.c: New, from gnulib (with one added #include). * ctf-decls.h: New, providing a conditional qsort_r declaration, and unconditional definitions of MIN and MAX. * ctf-impl.h: Use it. Do not use <sys/errno.h>. (ctf_set_errno): Now returns unsigned long. * ctf-util.c (ctf_set_errno): Adjust here too. * ctf-archive.c: Use ctf-endian.h. (ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type, cts_flags and cts_offset. (ctf_arc_write): Drop debugging dependent on the size of off_t. * ctf-create.c: Provide a definition of roundup if not defined. (ctf_create): Drop cts_type, cts_flags and cts_offset. (ctf_add_reftype): Do not check if type IDs are below zero. (ctf_add_slice): Likewise. (ctf_add_typedef): Likewise. (ctf_add_member_offset): Cast error-returning ssize_t's to size_t when known error-free. Drop CTF_ERR usage for functions returning int. (ctf_add_member_encoded): Drop CTF_ERR usage for functions returning int. (ctf_add_variable): Likewise. (enumcmp): Likewise. (enumadd): Likewise. (membcmp): Likewise. (ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t when known error-free. * ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions returning int: use CTF_ERR for functions returning ctf_type_id. (ctf_dump_label): Likewise. (ctf_dump_objts): Likewise. * ctf-labels.c (ctf_label_topmost): Likewise. (ctf_label_iter): Likewise. (ctf_label_info): Likewise. * ctf-lookup.c (ctf_func_args): Likewise. * ctf-open.c (upgrade_types): Cast to size_t where appropriate. (ctf_bufopen): Likewise. Use zlib types as needed. * ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions returning int. (ctf_enum_iter): Likewise. (ctf_type_size): Likewise. (ctf_type_align): Likewise. Cast to size_t where appropriate. (ctf_type_kind_unsliced): Likewise. (ctf_type_kind): Likewise. (ctf_type_encoding): Likewise. (ctf_member_info): Likewise. (ctf_array_info): Likewise. (ctf_enum_value): Likewise. (ctf_type_rvisit): Likewise. * ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and cts_offset. (ctf_simple_open): Likewise. (ctf_bfdopen_ctfsect): Likewise. Set cts_size properly. * Makefile.in: Regenerate. * aclocal.m4: Likewise. * config.h: Likewise. * configure: Likewise.
2019-05-31 17:10:51 +08:00
#include <errno.h>
#include <sys/param.h>
libctf: fix a number of build problems found on Solaris and NetBSD - Use of nonportable <endian.h> - Use of qsort_r - Use of zlib without appropriate magic to pull in the binutils zlib - Use of off64_t without checking (fixed by dropping the unused fields that need off64_t entirely) - signedness problems due to long being too short a type on 32-bit platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be used only for functions that return ctf_id_t - One lingering use of bzero() and of <sys/errno.h> All fixed, using code from gnulib where possible. Relatedly, set cts_size in a couple of places it was missed (string table and symbol table loading upon ctf_bfdopen()). binutils/ * objdump.c (make_ctfsect): Drop cts_type, cts_flags, and cts_offset. * readelf.c (shdr_to_ctf_sect): Likewise. include/ * ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset. (ctf_id_t): This is now an unsigned type. (CTF_ERR): Cast it to ctf_id_t. Note that it should only be used for ctf_id_t-returning functions. libctf/ * Makefile.am (ZLIB): New. (ZLIBINC): Likewise. (AM_CFLAGS): Use them. (libctf_a_LIBADD): New, for LIBOBJS. * configure.ac: Check for zlib, endian.h, and qsort_r. * ctf-endian.h: New, providing htole64 and le64toh. * swap.h: Code style fixes. (bswap_identity_64): New. * qsort_r.c: New, from gnulib (with one added #include). * ctf-decls.h: New, providing a conditional qsort_r declaration, and unconditional definitions of MIN and MAX. * ctf-impl.h: Use it. Do not use <sys/errno.h>. (ctf_set_errno): Now returns unsigned long. * ctf-util.c (ctf_set_errno): Adjust here too. * ctf-archive.c: Use ctf-endian.h. (ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type, cts_flags and cts_offset. (ctf_arc_write): Drop debugging dependent on the size of off_t. * ctf-create.c: Provide a definition of roundup if not defined. (ctf_create): Drop cts_type, cts_flags and cts_offset. (ctf_add_reftype): Do not check if type IDs are below zero. (ctf_add_slice): Likewise. (ctf_add_typedef): Likewise. (ctf_add_member_offset): Cast error-returning ssize_t's to size_t when known error-free. Drop CTF_ERR usage for functions returning int. (ctf_add_member_encoded): Drop CTF_ERR usage for functions returning int. (ctf_add_variable): Likewise. (enumcmp): Likewise. (enumadd): Likewise. (membcmp): Likewise. (ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t when known error-free. * ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions returning int: use CTF_ERR for functions returning ctf_type_id. (ctf_dump_label): Likewise. (ctf_dump_objts): Likewise. * ctf-labels.c (ctf_label_topmost): Likewise. (ctf_label_iter): Likewise. (ctf_label_info): Likewise. * ctf-lookup.c (ctf_func_args): Likewise. * ctf-open.c (upgrade_types): Cast to size_t where appropriate. (ctf_bufopen): Likewise. Use zlib types as needed. * ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions returning int. (ctf_enum_iter): Likewise. (ctf_type_size): Likewise. (ctf_type_align): Likewise. Cast to size_t where appropriate. (ctf_type_kind_unsliced): Likewise. (ctf_type_kind): Likewise. (ctf_type_encoding): Likewise. (ctf_member_info): Likewise. (ctf_array_info): Likewise. (ctf_enum_value): Likewise. (ctf_type_rvisit): Likewise. * ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and cts_offset. (ctf_simple_open): Likewise. (ctf_bfdopen_ctfsect): Likewise. Set cts_size properly. * Makefile.in: Regenerate. * aclocal.m4: Likewise. * config.h: Likewise. * configure: Likewise.
2019-05-31 17:10:51 +08:00
#include "ctf-decls.h"
#include <ctf-api.h>
#include "ctf-sha1.h"
#include <sys/types.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdint.h>
#include <limits.h>
#include <ctype.h>
#include <elf.h>
#include <bfd.h>
libctf, hash: introduce the ctf_dynset There are many places in the deduplicator which use hashtables as tiny sets: keys with no value (and usually, but not always, no freeing function) often with only one or a few members. For each of these, even after the last change to not store the freeing functions, we are storing a little malloced block for each item just to track the key/value pair, and a little malloced block for the hash table itself just to track the freeing function because we can't use libiberty hashtab's freeing function because we are using that to free the little malloced per-item block. If we only have a key, we don't need any of that: we can ditch the per-malloced block because we don't have a value, and we can ditch the per-hashtab structure because we don't need to independently track the freeing functions since libiberty hashtab is doing it for us. That means we don't need an owner field in the (now nonexistent) item block either. Roughly speaking, this datatype saves about 25% in time and 20% in peak memory usage for normal links, even fairly big ones. So this might seem redundant, but it's really worth it. Instead of a _lookup function, a dynset has two distinct functions: ctf_dynset_exists, which returns true or false and an optional pointer to the set member, and ctf_dynhash_lookup_any, which is used if all members of the set are expected to be equivalent and we just want *any* member and we don't care which one. There is no iterator in this set of functions, not because we don't iterate over dynset members -- we do, a lot -- but because the iterator here is a member of an entirely new family of much more convenient iteration functions, introduced in the next commit. libctf/ * ctf-hash.c (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (DYNSET_EMPTY_ENTRY_REPLACEMENT): New. (DYNSET_DELETED_ENTRY_REPLACEMENT): New. (key_to_internal): New. (internal_to_key): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. (ctf_hash_insert_type): Coding style. (ctf_hash_define_type): Likewise. * ctf-impl.h (ctf_dynset_t): New. (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. * ctf-inlines.h (ctf_dynset_cinsert): New.
2020-06-03 05:26:38 +08:00
#include "hashtab.h"
#ifdef __cplusplus
extern "C"
{
#endif
/* Compiler attributes. */
#if defined (__GNUC__)
/* GCC. We assume that all compilers claiming to be GCC support sufficiently
many GCC attributes that the code below works. If some non-GCC compilers
masquerading as GCC in fact do not implement these attributes, version checks
may be required. */
/* We use the _libctf_*_ pattern to avoid clashes with any future attribute
macros glibc may introduce, which have names of the pattern
__attribute_blah__. */
#define _libctf_printflike_(string_index,first_to_check) \
__attribute__ ((__format__ (__printf__, (string_index), (first_to_check))))
#define _libctf_unlikely_(x) __builtin_expect ((x), 0)
#define _libctf_unused_ __attribute__ ((__unused__))
#define _libctf_malloc_ __attribute__((__malloc__))
#else
#define _libctf_printflike_(string_index,first_to_check)
#define _libctf_unlikely_(x) (x)
#define _libctf_unused_
#define _libctf_malloc_
#define __extension__
#endif
#if defined (ENABLE_LIBCTF_HASH_DEBUGGING) && !defined (NDEBUG)
#include <assert.h>
#define ctf_assert(fp, expr) (assert (expr), 1)
#else
libctf, ld, binutils: add textual error/warning reporting for libctf This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. * ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (land_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
2020-06-04 22:07:54 +08:00
#define ctf_assert(fp, expr) \
_libctf_unlikely_ (ctf_assert_internal (fp, __FILE__, __LINE__, \
#expr, !!(expr)))
#endif
libctf, ld, binutils: add textual error/warning reporting for libctf This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. * ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (land_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
2020-06-04 22:07:54 +08:00
/* libctf in-memory state. */
typedef struct ctf_fixed_hash ctf_hash_t; /* Private to ctf-hash.c. */
typedef struct ctf_dynhash ctf_dynhash_t; /* Private to ctf-hash.c. */
libctf, hash: introduce the ctf_dynset There are many places in the deduplicator which use hashtables as tiny sets: keys with no value (and usually, but not always, no freeing function) often with only one or a few members. For each of these, even after the last change to not store the freeing functions, we are storing a little malloced block for each item just to track the key/value pair, and a little malloced block for the hash table itself just to track the freeing function because we can't use libiberty hashtab's freeing function because we are using that to free the little malloced per-item block. If we only have a key, we don't need any of that: we can ditch the per-malloced block because we don't have a value, and we can ditch the per-hashtab structure because we don't need to independently track the freeing functions since libiberty hashtab is doing it for us. That means we don't need an owner field in the (now nonexistent) item block either. Roughly speaking, this datatype saves about 25% in time and 20% in peak memory usage for normal links, even fairly big ones. So this might seem redundant, but it's really worth it. Instead of a _lookup function, a dynset has two distinct functions: ctf_dynset_exists, which returns true or false and an optional pointer to the set member, and ctf_dynhash_lookup_any, which is used if all members of the set are expected to be equivalent and we just want *any* member and we don't care which one. There is no iterator in this set of functions, not because we don't iterate over dynset members -- we do, a lot -- but because the iterator here is a member of an entirely new family of much more convenient iteration functions, introduced in the next commit. libctf/ * ctf-hash.c (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (DYNSET_EMPTY_ENTRY_REPLACEMENT): New. (DYNSET_DELETED_ENTRY_REPLACEMENT): New. (key_to_internal): New. (internal_to_key): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. (ctf_hash_insert_type): Coding style. (ctf_hash_define_type): Likewise. * ctf-impl.h (ctf_dynset_t): New. (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. * ctf-inlines.h (ctf_dynset_cinsert): New.
2020-06-03 05:26:38 +08:00
typedef struct ctf_dynset ctf_dynset_t; /* Private to ctf-hash.c. */
typedef struct ctf_strs
{
const char *cts_strs; /* Base address of string table. */
size_t cts_len; /* Size of string table in bytes. */
} ctf_strs_t;
libctf: deduplicate and sort the string table ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
typedef struct ctf_strs_writable
{
char *cts_strs; /* Base address of string table. */
size_t cts_len; /* Size of string table in bytes. */
} ctf_strs_writable_t;
typedef struct ctf_dmodel
{
const char *ctd_name; /* Data model name. */
int ctd_code; /* Data model code. */
size_t ctd_pointer; /* Size of void * in bytes. */
size_t ctd_char; /* Size of char in bytes. */
size_t ctd_short; /* Size of short in bytes. */
size_t ctd_int; /* Size of int in bytes. */
size_t ctd_long; /* Size of long in bytes. */
} ctf_dmodel_t;
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
typedef struct ctf_names
{
ctf_hash_t *ctn_readonly; /* Hash table when readonly. */
ctf_dynhash_t *ctn_writable; /* Hash table when writable. */
} ctf_names_t;
typedef struct ctf_lookup
{
const char *ctl_prefix; /* String prefix for this lookup. */
size_t ctl_len; /* Length of prefix string in bytes. */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
ctf_names_t *ctl_hash; /* Pointer to hash table for lookup. */
} ctf_lookup_t;
typedef struct ctf_fileops
{
uint32_t (*ctfo_get_kind) (uint32_t);
uint32_t (*ctfo_get_root) (uint32_t);
uint32_t (*ctfo_get_vlen) (uint32_t);
ssize_t (*ctfo_get_ctt_size) (const ctf_file_t *, const ctf_type_t *,
ssize_t *, ssize_t *);
ssize_t (*ctfo_get_vbytes) (unsigned short, ssize_t, size_t);
} ctf_fileops_t;
typedef struct ctf_list
{
struct ctf_list *l_prev; /* Previous pointer or tail pointer. */
struct ctf_list *l_next; /* Next pointer or head pointer. */
} ctf_list_t;
typedef enum
{
CTF_PREC_BASE,
CTF_PREC_POINTER,
CTF_PREC_ARRAY,
CTF_PREC_FUNCTION,
CTF_PREC_MAX
} ctf_decl_prec_t;
typedef struct ctf_decl_node
{
ctf_list_t cd_list; /* Linked list pointers. */
ctf_id_t cd_type; /* Type identifier. */
uint32_t cd_kind; /* Type kind. */
uint32_t cd_n; /* Type dimension if array. */
} ctf_decl_node_t;
typedef struct ctf_decl
{
ctf_list_t cd_nodes[CTF_PREC_MAX]; /* Declaration node stacks. */
int cd_order[CTF_PREC_MAX]; /* Storage order of decls. */
ctf_decl_prec_t cd_qualp; /* Qualifier precision. */
ctf_decl_prec_t cd_ordp; /* Ordered precision. */
char *cd_buf; /* Buffer for output. */
int cd_err; /* Saved error value. */
int cd_enomem; /* Nonzero if OOM during printing. */
} ctf_decl_t;
typedef struct ctf_dmdef
{
ctf_list_t dmd_list; /* List forward/back pointers. */
char *dmd_name; /* Name of this member. */
ctf_id_t dmd_type; /* Type of this member (for sou). */
unsigned long dmd_offset; /* Offset of this member in bits (for sou). */
int dmd_value; /* Value of this member (for enum). */
} ctf_dmdef_t;
typedef struct ctf_dtdef
{
ctf_list_t dtd_list; /* List forward/back pointers. */
ctf_id_t dtd_type; /* Type identifier for this definition. */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
ctf_type_t dtd_data; /* Type node, including name. */
union
{
ctf_list_t dtu_members; /* struct, union, or enum */
ctf_arinfo_t dtu_arr; /* array */
ctf_encoding_t dtu_enc; /* integer or float */
uint32_t *dtu_argv; /* function */
ctf_slice_t dtu_slice; /* slice */
} dtd_u;
} ctf_dtdef_t;
typedef struct ctf_dvdef
{
ctf_list_t dvd_list; /* List forward/back pointers. */
char *dvd_name; /* Name associated with variable. */
ctf_id_t dvd_type; /* Type of variable. */
unsigned long dvd_snapshots; /* Snapshot count when inserted. */
} ctf_dvdef_t;
typedef struct ctf_bundle
{
ctf_file_t *ctb_file; /* CTF container handle. */
ctf_id_t ctb_type; /* CTF type identifier. */
ctf_dtdef_t *ctb_dtd; /* CTF dynamic type definition (if any). */
} ctf_bundle_t;
libctf, ld, binutils: add textual error/warning reporting for libctf This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. * ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (land_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
2020-06-04 22:07:54 +08:00
typedef struct ctf_err_warning
{
ctf_list_t cew_list; /* List forward/back pointers. */
int cew_is_warning; /* 1 if warning, 0 if error. */
char *cew_text; /* Error/warning text. */
} ctf_err_warning_t;
libctf: deduplicate and sort the string table ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
/* Atoms associate strings with a list of the CTF items that reference that
string, so that ctf_update() can instantiate all the strings using the
ctf_str_atoms and then reassociate them with the real string later.
Strings can be interned into ctf_str_atom without having refs associated
with them, for values that are returned to callers, etc. Items are only
removed from this table on ctf_close(), but on every ctf_update(), all the
csa_refs in all entries are purged. */
typedef struct ctf_str_atom
{
const char *csa_str; /* Backpointer to string (hash key). */
ctf_list_t csa_refs; /* This string's refs. */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
uint32_t csa_offset; /* Strtab offset, if any. */
uint32_t csa_external_offset; /* External strtab offset, if any. */
libctf: deduplicate and sort the string table ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
unsigned long csa_snapshot_id; /* Snapshot ID at time of creation. */
} ctf_str_atom_t;
/* The refs of a single string in the atoms table. */
typedef struct ctf_str_atom_ref
{
ctf_list_t caf_list; /* List forward/back pointers. */
uint32_t *caf_ref; /* A single ref to this string. */
} ctf_str_atom_ref_t;
/* The structure used as the key in a ctf_link_type_mapping. The value is a
type index, not a type ID. */
libctf: map from old to corresponding newly-added types in ctf_add_type This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id) and get told what type ID the corresponding type has in the target ctf_file_t. This works even if it was added by a recursive call, and because it is stored in the target ctf_file_t it works even if we had to add one type to multiple ctf_file_t's as part of conflicting type handling. We empty out this mapping after every archive is linked: because it maps input to output fps, and we only visit each input fp once, its contents are rendered entirely useless every time the source fp changes. v3: add several missing mapping additions. Add ctf_dynhash_empty, and empty after every input archive. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping. (struct ctf_link_type_mapping_key): New. (ctf_hash_type_mapping_key): Likewise. (ctf_hash_eq_type_mapping_key): Likewise. (ctf_add_type_mapping): Likewise. (ctf_type_mapping): Likewise. (ctf_dynhash_empty): Likewise. * ctf-open.c (ctf_file_close): Update accordingly. * ctf-create.c (ctf_update): Likewise. (ctf_add_type): Populate the mapping. * ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key. (ctf_hash_eq_type_mapping_key): Check the key for equality. (ctf_dynhash_insert): Fix comment typo. (ctf_dynhash_empty): New. * ctf-link.c (ctf_add_type_mapping): New. (ctf_type_mapping): Likewise. (empty_link_type_mapping): New. (ctf_link_one_input_archive): Call it.
2019-07-14 04:31:26 +08:00
typedef struct ctf_link_type_key
libctf: map from old to corresponding newly-added types in ctf_add_type This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id) and get told what type ID the corresponding type has in the target ctf_file_t. This works even if it was added by a recursive call, and because it is stored in the target ctf_file_t it works even if we had to add one type to multiple ctf_file_t's as part of conflicting type handling. We empty out this mapping after every archive is linked: because it maps input to output fps, and we only visit each input fp once, its contents are rendered entirely useless every time the source fp changes. v3: add several missing mapping additions. Add ctf_dynhash_empty, and empty after every input archive. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping. (struct ctf_link_type_mapping_key): New. (ctf_hash_type_mapping_key): Likewise. (ctf_hash_eq_type_mapping_key): Likewise. (ctf_add_type_mapping): Likewise. (ctf_type_mapping): Likewise. (ctf_dynhash_empty): Likewise. * ctf-open.c (ctf_file_close): Update accordingly. * ctf-create.c (ctf_update): Likewise. (ctf_add_type): Populate the mapping. * ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key. (ctf_hash_eq_type_mapping_key): Check the key for equality. (ctf_dynhash_insert): Fix comment typo. (ctf_dynhash_empty): New. * ctf-link.c (ctf_add_type_mapping): New. (ctf_type_mapping): Likewise. (empty_link_type_mapping): New. (ctf_link_one_input_archive): Call it.
2019-07-14 04:31:26 +08:00
{
ctf_file_t *cltk_fp;
ctf_id_t cltk_idx;
} ctf_link_type_key_t;
/* The structure used as the key in a cd_id_to_file_t on 32-bit platforms. */
typedef struct ctf_type_id_key
{
int ctii_input_num;
ctf_id_t ctii_type;
} ctf_type_id_key_t;
/* Deduplicator state.
The dedup state below uses three terms consistently. A "hash" is a
ctf_dynhash_t; a "hash value" is the hash value of a type as returned by
ctf_dedup_hash_type; a "global type ID" or "global ID" is a packed-together
reference to a single ctf_file_t (by array index in an array of inputs) and
ctf_id_t, i.e. a single instance of some hash value in some input.
The deduplication algorithm takes a bunch of inputs and yields a single
shared "output" and possibly many outputs corresponding to individual inputs
that still contain types after sharing of unconflicted types. Almost all
deduplicator state is stored in the struct ctf_dedup in the output, though a
(very) few things are stored in inputs for simplicity's sake, usually if they
are linking together things within the scope of a single TU.
Flushed at the end of every ctf_dedup run. */
typedef struct ctf_dedup
{
/* The CTF linker flags in force for this dedup run. */
int cd_link_flags;
/* On 32-bit platforms only, a hash of global type IDs, in the form of
a ctf_link_type_id_key_t. */
ctf_dynhash_t *cd_id_to_file_t;
/* Atoms tables of decorated names: maps undecorated name to decorated name.
(The actual allocations are in the CTF file for the former and the real
atoms table for the latter). Uses the same namespaces as ctf_lookups,
below, but has no need for null-termination. */
ctf_dynhash_t *cd_decorated_names[4];
/* Map type names to a hash from type hash value -> number of times each value
has appeared. */
ctf_dynhash_t *cd_name_counts;
/* Map global type IDs to type hash values. Used to determine if types are
already hashed without having to recompute their hash values again, and to
link types together at later stages. Forwards that are peeked through to
structs and unions are not represented in here, so lookups that might be
such a type (in practice, all lookups) must go via cd_replaced_types first
to take this into account. Discarded before each rehashing. */
ctf_dynhash_t *cd_type_hashes;
/* Maps from the names of structs/unions/enums to a a single GID which is the
only appearance of that type in any input: if it appears in more than one
input, a value which is a GID with an input_num of -1 appears. Used in
share-duplicated link mode link modes to determine whether structs/unions
can be cited from multiple TUs. Only populated in that link mode. */
ctf_dynhash_t *cd_struct_origin;
/* Maps type hash values to a set of hash values of the types that cite them:
i.e., pointing backwards up the type graph. Used for recursive conflict
marking. Citations from tagged structures, unions, and forwards do not
appear in this graph. */
ctf_dynhash_t *cd_citers;
/* Maps type hash values to input global type IDs. The value is a set (a
hash) of global type IDs. Discarded before each rehashing. The result of
the ctf_dedup function. */
ctf_dynhash_t *cd_output_mapping;
/* A map giving the GID of the first appearance of each type for each type
hash value. */
ctf_dynhash_t *cd_output_first_gid;
/* Used to ensure that we never try to map a single type ID to more than one
hash. */
ctf_dynhash_t *cd_output_mapping_guard;
/* Maps the global type IDs of structures in input TUs whose members still
need emission to the global type ID of the already-emitted target type
(which has no members yet) in the appropriate target. Uniquely, the latter
ID represents a *target* ID (i.e. the cd_output_mapping of some specified
input): we encode the shared (parent) dict with an ID of -1. */
ctf_dynhash_t *cd_emission_struct_members;
/* A set (a hash) of hash values of conflicting types. */
ctf_dynset_t *cd_conflicting_types;
/* Maps type hashes to ctf_id_t's in this dictionary. Populated only at
emission time, in the dictionary where emission is taking place. */
ctf_dynhash_t *cd_output_emission_hashes;
/* Maps the decorated names of conflicted cross-TU forwards that were forcibly
emitted in this TU to their emitted ctf_id_ts. Populated only at emission
time, in the dictionary where emission is taking place. */
ctf_dynhash_t *cd_output_emission_conflicted_forwards;
/* Points to the output counterpart of this input dictionary, at emission
time. */
ctf_file_t *cd_output;
} ctf_dedup_t;
libctf: map from old to corresponding newly-added types in ctf_add_type This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id) and get told what type ID the corresponding type has in the target ctf_file_t. This works even if it was added by a recursive call, and because it is stored in the target ctf_file_t it works even if we had to add one type to multiple ctf_file_t's as part of conflicting type handling. We empty out this mapping after every archive is linked: because it maps input to output fps, and we only visit each input fp once, its contents are rendered entirely useless every time the source fp changes. v3: add several missing mapping additions. Add ctf_dynhash_empty, and empty after every input archive. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping. (struct ctf_link_type_mapping_key): New. (ctf_hash_type_mapping_key): Likewise. (ctf_hash_eq_type_mapping_key): Likewise. (ctf_add_type_mapping): Likewise. (ctf_type_mapping): Likewise. (ctf_dynhash_empty): Likewise. * ctf-open.c (ctf_file_close): Update accordingly. * ctf-create.c (ctf_update): Likewise. (ctf_add_type): Populate the mapping. * ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key. (ctf_hash_eq_type_mapping_key): Check the key for equality. (ctf_dynhash_insert): Fix comment typo. (ctf_dynhash_empty): New. * ctf-link.c (ctf_add_type_mapping): New. (ctf_type_mapping): Likewise. (empty_link_type_mapping): New. (ctf_link_one_input_archive): Call it.
2019-07-14 04:31:26 +08:00
/* The ctf_file is the structure used to represent a CTF container to library
clients, who see it only as an opaque pointer. Modifications can therefore
be made freely to this structure without regard to client versioning. The
ctf_file_t typedef appears in <ctf-api.h> and declares a forward tag.
NOTE: ctf_update() requires that everything inside of ctf_file either be an
immediate value, a pointer to dynamically allocated data *outside* of the
ctf_file itself, or a pointer to statically allocated data. If you add a
pointer to ctf_file that points to something within the ctf_file itself,
you must make corresponding changes to ctf_update(). */
struct ctf_file
{
const ctf_fileops_t *ctf_fileops; /* Version-specific file operations. */
libctf: allow the header to change between versions libctf supports dynamic upgrading of the type table as file format versions change, but before now has not supported changes to the CTF header. Doing this is complicated by the baroque storage method used: the CTF header is kept prepended to the rest of the CTF data, just as when read from the file, and written out from there, and is endian-flipped in place. This makes accessing it needlessly hard and makes it almost impossible to make the header larger if we add fields. The general storage machinery around the malloced ctf pointer (the 'ctf_base') is also overcomplicated: the pointer is sometimes malloced locally and sometimes assigned from a parameter, so freeing it requires checking to see if that parameter was used, needlessly coupling ctf_bufopen and ctf_file_close together. So split the header out into a new ctf_file_t.ctf_header, which is written out explicitly: squeeze it out of the CTF buffer whenever we reallocate it, and use ctf_file_t.ctf_buf to skip past the header when we do not need to reallocate (when no upgrading or endian-flipping is required). We now track whether the CTF base can be freed explicitly via a new ctf_dynbase pointer which is non-NULL only when freeing is possible. With all this done, we can upgrade the header on the fly and add new fields as desired, via a new upgrade_header function in ctf-open. As with other forms of upgrading, libctf upgrades older headers automatically to the latest supported version at open time. For a first use of this field, we add a new string field cth_cuname, and a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this is used by debuggers to determine whether a CTF section's types relate to a single compilation unit, or to all compilation units in the program. (Types with ambiguous definitions in different CUs have only one of these types placed in the top-level shared .ctf container: the rest are placed in much smaller per-CU containers, which have the shared container as their parent. Since CTF must be useful in the absence of DWARF, we store the names of the relevant CUs ourselves, so the debugger can look them up.) v5: fix tabdamage. include/ * ctf-api.h (ctf_cuname): New function. (ctf_cuname_set): Likewise. * ctf.h: Improve comment around upgrading, no longer implying that v2 is the target of upgrades (it is v3 now). (ctf_header_v2_t): New, old-format header for backward compatibility. (ctf_header_t): Add cth_cuname: this is the first of several header changes in format v3. libctf/ * ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase, ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const. * ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and ctf_base: do not assume that it is always sizeof (ctf_header_t). Print out ctf_cuname: only print out ctf_parname if set. (ctf_free_base): Removed, ctf_base is no longer freed: free ctf_dynbase instead. (ctf_set_version): Fix spacing. (upgrade_header): New, in-place header upgrading. (upgrade_types): Rename to... (upgrade_types_v1): ... this. Free ctf_dynbase, not ctf_base. No longer track old and new headers separately. No longer allow for header sizes explicitly: squeeze the headers out on upgrade (they are preserved in fp->ctf_header). Set ctf_dynbase, ctf_base and ctf_buf explicitly. Use ctf_free, not ctf_free_base. (upgrade_types): New, also handle ctf_parmax updating. (flip_header): Flip ctf_cuname. (flip_types): Flip BUF explicitly rather than deriving BUF from BASE. (ctf_bufopen): Store the header in fp->ctf_header. Correct minimum required alignment of objtoff and funcoff. No longer store it in the ctf_buf unless that buf is derived unmodified from the input. Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals that duplicate fields in ctf_file: move allocation of ctf_file further up instead. Call upgrade_header as needed. Move version-specific ctf_parmax initialization into upgrade_types. More concise error handling. (ctf_file_close): No longer test for null pointers before freeing. Free ctf_dyncuname, ctf_dynbase, and ctf_header. Do not call ctf_free_base. (ctf_cuname): New. (ctf_cuname_set): New. * ctf-create.c (ctf_update): Populate ctf_cuname. (ctf_gzwrite): Write out the header explicitly. Remove obsolescent comment. (ctf_write): Likewise. (ctf_compress_write): Get the header from ctf_header, not ctf_base. Fix the compression length: fp->ctf_size never counted the CTF header. Simplify the compress call accordingly.
2019-07-07 00:36:21 +08:00
struct ctf_header *ctf_header; /* The header from this CTF file. */
unsigned char ctf_openflags; /* Flags the file had when opened. */
ctf_sect_t ctf_data; /* CTF data from object file. */
ctf_sect_t ctf_symtab; /* Symbol table from object file. */
ctf_sect_t ctf_strtab; /* String table from object file. */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
ctf_dynhash_t *ctf_prov_strtab; /* Maps provisional-strtab offsets
to names. */
libctf: support getting strings from the ELF strtab The CTF file format has always supported "external strtabs", which internally are strtab offsets with their MSB on: such refs get their strings from the strtab passed in at CTF file open time: this is usually intended to be the ELF strtab, and that's what this implementation is meant to support, though in theory the external strtab could come from anywhere. This commit adds support for these external strings in the ctf-string.c strtab tracking layer. It's quite easy: we just add a field csa_offset to the atoms table that tracks all strings: this field tracks the offset of the string in the ELF strtab (with its MSB already on, courtesy of a new macro CTF_SET_STID), and adds a new function that sets the csa_offset to the specified offset (plus MSB). Then we just need to avoid writing out strings to the internal strtab if they have csa_offset set, and note that the internal strtab is shorter than it might otherwise be. (We could in theory save a little more time here by eschewing sorting such strings, since we never actually write the strings out anywhere, but that would mean storing them separately and it's just not worth the complexity cost until profiling shows it's worth doing.) We also have to go through a bit of extra effort at variable-sorting time. This was previously using direct references to the internal strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab is not yet ready to put in its usual field (in a ctf_file_t that hasn't even been allocated yet at this stage): but now we're using the external strtab, this will no longer do because it'll be looking things up in the wrong strtab, with disastrous results. Instead, pass the new internal strtab in to a new ctf_strraw_explicit function which is just like ctf_strraw except you can specify a ne winternal strtab to use. But even now that it is using a new internal strtab, this is not quite enough: it can't look up strings in the external strtab because ld hasn't written it out yet, and when it does will write it straight to disk. Instead, when we write the internal strtab, note all the offset -> string mappings that we have noted belong in the *external* strtab to a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look in there at ctf_strraw time if it is set. This uses minimal extra memory (because only strings in the external strtab that we actually use are stored, and even those come straight out of the atoms table), but let both variable sorting and name interning when ctf_bufopen is next called work fine. (This also means that we don't need to filter out spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to the caller, once we wrap ctf_bufopen so that we have a new internal variant of ctf_bufopen etc that we can pass the synthetic external strtab to. That error has been filtered out since the days of Solaris libctf, which didn't try to handle the problem of getting external strtabs right at construction time at all.) v3: add the synthetic strtab and all associated machinery. v5: fix tabdamage. include/ * ctf.h (CTF_SET_STID): New. libctf/ * ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field. (ctf_file_t) <ctf_syn_ext_strtab>: Likewise. (ctf_str_add_ref): Name the last arg. (ctf_str_add_external) New. (ctf_str_add_strraw_explicit): Likewise. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. * ctf-string.c (ctf_strraw_explicit): Split from... (ctf_strraw): ... here, with new support for ctf_syn_ext_strtab. (ctf_str_add_ref_internal): Return the atom, not the string. (ctf_str_add): Adjust accordingly. (ctf_str_add_ref): Likewise. Move up in the file. (ctf_str_add_external): New: update the csa_offset. (ctf_str_count_strtab): Only account for strings with no csa_offset in the internal strtab length. (ctf_str_write_strtab): If the csa_offset is set, update the string's refs without writing the string out, and update the ctf_syn_ext_strtab. Make OOM handling less ugly. * ctf-create.c (struct ctf_sort_var_arg_cb): New. (ctf_update): Handle failure to populate the strtab. Pass in the new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition. Call ctf_simple_open_internal, not ctf_simple_open. (ctf_sort_var): Call ctf_strraw_explicit rather than looking up strings by hand. * ctf-hash.c (ctf_hash_insert_type): Likewise (but using ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless. * ctf-open.c (init_types): No longer filter out ECTF_STRTAB. (ctf_file_close): Destroy the ctf_syn_ext_strtab. (ctf_simple_open): Rename to, and reimplement as a wrapper around... (ctf_simple_open_internal): ... this new function, which calls ctf_bufopen_internal. (ctf_bufopen): Rename to, and reimplement as a wrapper around... (ctf_bufopen_internal): ... this new function, which sets ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
ctf_dynhash_t *ctf_syn_ext_strtab; /* Maps ext-strtab offsets to names. */
void *ctf_data_mmapped; /* CTF data we mmapped, to free later. */
size_t ctf_data_mmapped_len; /* Length of CTF data we mmapped. */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
ctf_names_t ctf_structs; /* Hash table of struct types. */
ctf_names_t ctf_unions; /* Hash table of union types. */
ctf_names_t ctf_enums; /* Hash table of enum types. */
ctf_names_t ctf_names; /* Hash table of remaining type names. */
ctf_lookup_t ctf_lookups[5]; /* Pointers to nametabs for name lookup. */
ctf_strs_t ctf_str[2]; /* Array of string table base and bounds. */
libctf: deduplicate and sort the string table ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
ctf_dynhash_t *ctf_str_atoms; /* Hash table of ctf_str_atoms_t. */
uint64_t ctf_str_num_refs; /* Number of refs to cts_str_atoms. */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
uint32_t ctf_str_prov_offset; /* Latest provisional offset assigned so far. */
libctf: allow the header to change between versions libctf supports dynamic upgrading of the type table as file format versions change, but before now has not supported changes to the CTF header. Doing this is complicated by the baroque storage method used: the CTF header is kept prepended to the rest of the CTF data, just as when read from the file, and written out from there, and is endian-flipped in place. This makes accessing it needlessly hard and makes it almost impossible to make the header larger if we add fields. The general storage machinery around the malloced ctf pointer (the 'ctf_base') is also overcomplicated: the pointer is sometimes malloced locally and sometimes assigned from a parameter, so freeing it requires checking to see if that parameter was used, needlessly coupling ctf_bufopen and ctf_file_close together. So split the header out into a new ctf_file_t.ctf_header, which is written out explicitly: squeeze it out of the CTF buffer whenever we reallocate it, and use ctf_file_t.ctf_buf to skip past the header when we do not need to reallocate (when no upgrading or endian-flipping is required). We now track whether the CTF base can be freed explicitly via a new ctf_dynbase pointer which is non-NULL only when freeing is possible. With all this done, we can upgrade the header on the fly and add new fields as desired, via a new upgrade_header function in ctf-open. As with other forms of upgrading, libctf upgrades older headers automatically to the latest supported version at open time. For a first use of this field, we add a new string field cth_cuname, and a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this is used by debuggers to determine whether a CTF section's types relate to a single compilation unit, or to all compilation units in the program. (Types with ambiguous definitions in different CUs have only one of these types placed in the top-level shared .ctf container: the rest are placed in much smaller per-CU containers, which have the shared container as their parent. Since CTF must be useful in the absence of DWARF, we store the names of the relevant CUs ourselves, so the debugger can look them up.) v5: fix tabdamage. include/ * ctf-api.h (ctf_cuname): New function. (ctf_cuname_set): Likewise. * ctf.h: Improve comment around upgrading, no longer implying that v2 is the target of upgrades (it is v3 now). (ctf_header_v2_t): New, old-format header for backward compatibility. (ctf_header_t): Add cth_cuname: this is the first of several header changes in format v3. libctf/ * ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase, ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const. * ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and ctf_base: do not assume that it is always sizeof (ctf_header_t). Print out ctf_cuname: only print out ctf_parname if set. (ctf_free_base): Removed, ctf_base is no longer freed: free ctf_dynbase instead. (ctf_set_version): Fix spacing. (upgrade_header): New, in-place header upgrading. (upgrade_types): Rename to... (upgrade_types_v1): ... this. Free ctf_dynbase, not ctf_base. No longer track old and new headers separately. No longer allow for header sizes explicitly: squeeze the headers out on upgrade (they are preserved in fp->ctf_header). Set ctf_dynbase, ctf_base and ctf_buf explicitly. Use ctf_free, not ctf_free_base. (upgrade_types): New, also handle ctf_parmax updating. (flip_header): Flip ctf_cuname. (flip_types): Flip BUF explicitly rather than deriving BUF from BASE. (ctf_bufopen): Store the header in fp->ctf_header. Correct minimum required alignment of objtoff and funcoff. No longer store it in the ctf_buf unless that buf is derived unmodified from the input. Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals that duplicate fields in ctf_file: move allocation of ctf_file further up instead. Call upgrade_header as needed. Move version-specific ctf_parmax initialization into upgrade_types. More concise error handling. (ctf_file_close): No longer test for null pointers before freeing. Free ctf_dyncuname, ctf_dynbase, and ctf_header. Do not call ctf_free_base. (ctf_cuname): New. (ctf_cuname_set): New. * ctf-create.c (ctf_update): Populate ctf_cuname. (ctf_gzwrite): Write out the header explicitly. Remove obsolescent comment. (ctf_write): Likewise. (ctf_compress_write): Get the header from ctf_header, not ctf_base. Fix the compression length: fp->ctf_size never counted the CTF header. Simplify the compress call accordingly.
2019-07-07 00:36:21 +08:00
unsigned char *ctf_base; /* CTF file pointer. */
unsigned char *ctf_dynbase; /* Freeable CTF file pointer. */
unsigned char *ctf_buf; /* Uncompressed CTF data buffer. */
size_t ctf_size; /* Size of CTF header + uncompressed data. */
uint32_t *ctf_sxlate; /* Translation table for symtab entries. */
unsigned long ctf_nsyms; /* Number of entries in symtab xlate table. */
uint32_t *ctf_txlate; /* Translation table for type IDs. */
uint32_t *ctf_ptrtab; /* Translation table for pointer-to lookups. */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
size_t ctf_ptrtab_len; /* Num types storable in ptrtab currently. */
struct ctf_varent *ctf_vars; /* Sorted variable->type mapping. */
unsigned long ctf_nvars; /* Number of variables in ctf_vars. */
unsigned long ctf_typemax; /* Maximum valid type ID number. */
const ctf_dmodel_t *ctf_dmodel; /* Data model pointer (see above). */
libctf: allow the header to change between versions libctf supports dynamic upgrading of the type table as file format versions change, but before now has not supported changes to the CTF header. Doing this is complicated by the baroque storage method used: the CTF header is kept prepended to the rest of the CTF data, just as when read from the file, and written out from there, and is endian-flipped in place. This makes accessing it needlessly hard and makes it almost impossible to make the header larger if we add fields. The general storage machinery around the malloced ctf pointer (the 'ctf_base') is also overcomplicated: the pointer is sometimes malloced locally and sometimes assigned from a parameter, so freeing it requires checking to see if that parameter was used, needlessly coupling ctf_bufopen and ctf_file_close together. So split the header out into a new ctf_file_t.ctf_header, which is written out explicitly: squeeze it out of the CTF buffer whenever we reallocate it, and use ctf_file_t.ctf_buf to skip past the header when we do not need to reallocate (when no upgrading or endian-flipping is required). We now track whether the CTF base can be freed explicitly via a new ctf_dynbase pointer which is non-NULL only when freeing is possible. With all this done, we can upgrade the header on the fly and add new fields as desired, via a new upgrade_header function in ctf-open. As with other forms of upgrading, libctf upgrades older headers automatically to the latest supported version at open time. For a first use of this field, we add a new string field cth_cuname, and a corresponding setter/getter pair ctf_cuname_set and ctf_cuname: this is used by debuggers to determine whether a CTF section's types relate to a single compilation unit, or to all compilation units in the program. (Types with ambiguous definitions in different CUs have only one of these types placed in the top-level shared .ctf container: the rest are placed in much smaller per-CU containers, which have the shared container as their parent. Since CTF must be useful in the absence of DWARF, we store the names of the relevant CUs ourselves, so the debugger can look them up.) v5: fix tabdamage. include/ * ctf-api.h (ctf_cuname): New function. (ctf_cuname_set): Likewise. * ctf.h: Improve comment around upgrading, no longer implying that v2 is the target of upgrades (it is v3 now). (ctf_header_v2_t): New, old-format header for backward compatibility. (ctf_header_t): Add cth_cuname: this is the first of several header changes in format v3. libctf/ * ctf-impl.h (ctf_file_t): New fields ctf_header, ctf_dynbase, ctf_cuname, ctf_dyncuname: ctf_base and ctf_buf are no longer const. * ctf-open.c (ctf_set_base): Preserve the gap between ctf_buf and ctf_base: do not assume that it is always sizeof (ctf_header_t). Print out ctf_cuname: only print out ctf_parname if set. (ctf_free_base): Removed, ctf_base is no longer freed: free ctf_dynbase instead. (ctf_set_version): Fix spacing. (upgrade_header): New, in-place header upgrading. (upgrade_types): Rename to... (upgrade_types_v1): ... this. Free ctf_dynbase, not ctf_base. No longer track old and new headers separately. No longer allow for header sizes explicitly: squeeze the headers out on upgrade (they are preserved in fp->ctf_header). Set ctf_dynbase, ctf_base and ctf_buf explicitly. Use ctf_free, not ctf_free_base. (upgrade_types): New, also handle ctf_parmax updating. (flip_header): Flip ctf_cuname. (flip_types): Flip BUF explicitly rather than deriving BUF from BASE. (ctf_bufopen): Store the header in fp->ctf_header. Correct minimum required alignment of objtoff and funcoff. No longer store it in the ctf_buf unless that buf is derived unmodified from the input. Set ctf_dynbase where ctf_base is dynamically allocated. Drop locals that duplicate fields in ctf_file: move allocation of ctf_file further up instead. Call upgrade_header as needed. Move version-specific ctf_parmax initialization into upgrade_types. More concise error handling. (ctf_file_close): No longer test for null pointers before freeing. Free ctf_dyncuname, ctf_dynbase, and ctf_header. Do not call ctf_free_base. (ctf_cuname): New. (ctf_cuname_set): New. * ctf-create.c (ctf_update): Populate ctf_cuname. (ctf_gzwrite): Write out the header explicitly. Remove obsolescent comment. (ctf_write): Likewise. (ctf_compress_write): Get the header from ctf_header, not ctf_base. Fix the compression length: fp->ctf_size never counted the CTF header. Simplify the compress call accordingly.
2019-07-07 00:36:21 +08:00
const char *ctf_cuname; /* Compilation unit name (if any). */
char *ctf_dyncuname; /* Dynamically allocated name of CU. */
struct ctf_file *ctf_parent; /* Parent CTF container (if any). */
libctf: sort out potential refcount loops When you link TUs that contain conflicting types together, the resulting CTF section is an archive containing many CTF dicts. These dicts appear in ctf_link_outputs of the shared dict, with each ctf_import'ing that shared dict. ctf_importing a dict bumps its refcount to stop it going away while it's in use -- but if the shared dict (whose refcount is bumped) has the child dict (doing the bumping) in its ctf_link_outputs, we have a refcount loop, since the child dict only un-ctf_imports and drops the parent's refcount when it is freed, but the child is only freed when the parent's refcount falls to zero. (In the future, this will be able to go wrong on the inputs too, when an ld -r'ed deduplicated output with conflicts is relinked. Right now this cannot happen because we don't ctf_import such dicts at all. This will be fixed in a later commit in this series.) Fix this by introducing an internal-use-only ctf_import_unref function that imports a parent dict *witthout* bumping the parent's refcount, and using it when we create per-CU outputs. This function is only safe to use if you know the parent cannot go away while the child exists: but if the parent *owns* the child, as here, this is necessarily true. Record in the ctf_file_t whether a parent was imported via ctf_import or ctf_import_unref, so that if you do another ctf_import later on (or a ctf_import_unref) it can decide whether to drop the refcount of the existing parent being replaced depending on which function you used to import that one. Adjust ctf_serialize so that rather than doing a ctf_import (which is wrong if the original import was ctf_import_unref'fed), we just copy the parent field and refcount over and forcibly flip the unref flag on on the old copy we are going to discard. ctf_file_close also needs a bit of tweaking to only close the parent if it was not imported with ctf_import_unref: while we're at it, guard against repeated closes with a refcount of zero and stop them causing double-frees, even if destruction of things freed *inside* ctf_file_close cause such recursion. Verified no leaks or accesses to freed memory after all of this with valgrind. (It was leak-happy before.) libctf/ * ctf-impl.c (ctf_file_t) <ctf_parent_unreffed>: New. (ctf_import_unref): New. * ctf-open.c (ctf_file_close) Drop the refcount all the way to zero. Don't recurse back in if the refcount is already zero. (ctf_import): Check ctf_parent_unreffed before deciding whether to close a pre-existing parent. Set it to zero. (ctf_import_unreffed): New, as above, setting ctf_parent_unreffed to 1. * ctf-create.c (ctf_serialize): Do not ctf_import into the new child: use direct assignment, and set unreffed on the new and old children. * ctf-link.c (ctf_create_per_cu): Import the parent using ctf_import_unreffed.
2020-06-05 00:30:01 +08:00
int ctf_parent_unreffed; /* Parent set by ctf_import_unref? */
const char *ctf_parlabel; /* Label in parent container (if any). */
const char *ctf_parname; /* Basename of parent (if any). */
char *ctf_dynparname; /* Dynamically allocated name of parent. */
uint32_t ctf_parmax; /* Highest type ID of a parent type. */
uint32_t ctf_refcnt; /* Reference count (for parent links). */
uint32_t ctf_flags; /* Libctf flags (see below). */
int ctf_errno; /* Error code for most recent error. */
int ctf_version; /* CTF data version. */
ctf_dynhash_t *ctf_dthash; /* Hash of dynamic type definitions. */
ctf_list_t ctf_dtdefs; /* List of dynamic type definitions. */
ctf_dynhash_t *ctf_dvhash; /* Hash of dynamic variable mappings. */
ctf_list_t ctf_dvdefs; /* List of dynamic variable definitions. */
unsigned long ctf_dtoldid; /* Oldest id that has been committed. */
unsigned long ctf_snapshots; /* ctf_snapshot() plus ctf_update() count. */
unsigned long ctf_snapshot_lu; /* ctf_snapshot() call count at last update. */
ctf_archive_t *ctf_archive; /* Archive this ctf_file_t came from. */
libctf, ld, binutils: add textual error/warning reporting for libctf This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. * ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (land_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
2020-06-04 22:07:54 +08:00
ctf_list_t ctf_errs_warnings; /* CTF errors and warnings. */
libctf: add the ctf_link machinery This is the start of work on the core of the linking mechanism for CTF sections. This commit handles the type and string sections. The linker calls these functions in sequence: ctf_link_add_ctf: to add each CTF section in the input in turn to a newly-created ctf_file_t (which will appear in the output, and which itself will become the shared parent that contains types that all TUs have in common (in all link modes) and all types that do not have conflicting definitions between types (by default). Input files that are themselves products of ld -r are supported, though this is not heavily tested yet. ctf_link: called once all input files are added to merge the types in all the input containers into the output container, eliminating duplicates. ctf_link_add_strtab: called once the ELF string table is finalized and all its offsets are known, this calls a callback provided by the linker which returns the string content and offset of every string in the ELF strtab in turn: all these strings which appear in the input CTF strtab are eliminated from it in favour of the ELF strtab: equally, any strings that only appear in the input strtab will reappear in the internal CTF strtab of the output. ctf_link_shuffle_syms (not yet implemented): called once the ELF symtab is finalized, this calls a callback provided by the linker which returns information on every symbol in turn as a ctf_link_sym_t. This is then used to shuffle the function info and data object sections in the CTF section into symbol table order, eliminating the index sections which map those sections to symbol names before that point. Currently just returns ECTF_NOTYET. ctf_link_write: Returns a buffer containing either a serialized ctf_file_t (if there are no types with conflicting definitions in the object files in the link) or a ctf_archive_t containing a large ctf_file_t (the common types) and a bunch of small ones named after individual CUs in which conflicting types are found (containing the conflicting types, and all types that reference them). A threshold size above which compression takes place is passed as one parameter. (Currently, only gzip compression is supported, but I hope to add lzma as well.) Lifetime rules for this are simple: don't close the input CTF files until you've called ctf_link for the last time. We do not assume that symbols or strings passed in by the callback outlast the call to ctf_link_add_strtab or ctf_link_shuffle_syms. Right now, the duplicate elimination mechanism is the one already present as part of the ctf_add_type function, and is not particularly good: it misses numerous actual duplicates, and the conflicting-types detection hardly ever reports that types conflict, even when they do (one of them just tends to get silently dropped): it is also very slow. This will all be fixed in the next few weeks, but the fix hardly touches any of this code, and the linker does work without it, just not as well as it otherwise might. (And when no CTF section is present, there is no effect on performance, of course. So only people using a trunk GCC with not-yet-committed patches will even notice. By the time it gets upstream, things should be better.) v3: Fix error handling. v4: check for strdup failure. v5: fix tabdamage. include/ * ctf-api.h (struct ctf_link_sym): New, a symbol in flight to the libctf linking machinery. (CTF_LINK_SHARE_UNCONFLICTED): New. (CTF_LINK_SHARE_DUPLICATED): New. (ECTF_LINKADDEDLATE): New, replacing ECTF_UNUSED. (ECTF_NOTYET): New, a 'not yet implemented' message. (ctf_link_add_ctf): New, add an input file's CTF to the link. (ctf_link): New, merge the type and string sections. (ctf_link_strtab_string_f): New, callback for feeding strtab info. (ctf_link_iter_symbol_f): New, callback for feeding symtab info. (ctf_link_add_strtab): New, tell the CTF linker about the ELF strtab's strings. (ctf_link_shuffle_syms): New, ask the CTF linker to shuffle its symbols into symtab order. (ctf_link_write): New, ask the CTF linker to write the CTF out. libctf/ * ctf-link.c: New file, linking of the string and type sections. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate. * ctf-impl.h (ctf_file_t): New fields ctf_link_inputs, ctf_link_outputs. * ctf-create.c (ctf_update): Update accordingly. * ctf-open.c (ctf_file_close): Likewise. * ctf-error.c (_ctf_errlist): Updated with new errors.
2019-07-14 04:06:55 +08:00
ctf_dynhash_t *ctf_link_inputs; /* Inputs to this link. */
ctf_dynhash_t *ctf_link_outputs; /* Additional outputs from this link. */
libctf, link: redo cu-mapping handling Now a bunch of stuff that doesn't apply to ld or any normal use of libctf, piled into one commit so that it's easier to ignore. The cu-mapping machinery associates incoming compilation unit names with outgoing names of CTF dictionaries that should correspond to them, for non-gdb CTF consumers that would like to group multiple TUs into a single child dict if conflicting types are found in it (the existing use case is one kernel module, one child CTF dict, even if the kernel module is composed of multiple CUs). The upcoming deduplicator needs to track not only the mapping from incoming CU name to outgoing dict name, but the inverse mapping from outgoing dict name to incoming CU name, so it can work over every CTF dict we might see in the output and link into it. So rejig the ctf-link machinery to do that. Simultaneously (because they are closely associated and were written at the same time), we add a new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the ctf_link machinery to create empty child dicts for each outgoing CU mapping even if no CUs that correspond to it exist in the link. This is a bit (OK, quite a lot) of a waste of space, but some existing consumers require it. (Nobody else should use it.) Its value is not consecutive with existing CTF_LINK flag values because we're about to add more flags that are conceptually closer to the existing ones than this one is. include/ * ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New. libctf/ * ctf-impl.h (ctf_file_t): Improve comments. <ctf_link_cu_mapping>: Split into... <ctf_link_in_cu_mapping>: ... this... <ctf_link_out_cu_mapping>: ... and this. * ctf-create.c (ctf_serialize): Adjust. * ctf-open.c (ctf_file_close): Likewise. * ctf-link.c (ctf_create_per_cu): Look things up in the in_cu_mapping instead of the cu_mapping. (ctf_link_add_cu_mapping): The deduplicating link will define what happens if many FROMs share a TO. (ctf_link_add_cu_mapping): Create in_cu_mapping and out_cu_mapping. Do not create ctf_link_outputs here any more, or create per-CU dicts here: they are already created when needed. (ctf_link_one_variable): Log a debug message if we skip a variable due to its type being concealed in a CU-mapped link. (This is probably too common a case to make into a warning.) (ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
/* Map input types to output types: populated in each output dict.
Key is a ctf_link_type_key_t: value is a type ID. Used by
nondeduplicating links and ad-hoc ctf_add_type calls only. */
ctf_dynhash_t *ctf_link_type_mapping;
/* Map input CU names to output CTF dict names: populated in the top-level
output dict.
Key and value are dynamically-allocated strings. */
ctf_dynhash_t *ctf_link_in_cu_mapping;
/* Map output CTF dict names to input CU names: populated in the top-level
output dict. A hash of string to hash (set) of strings. Key and
individual value members are shared with ctf_link_in_cu_mapping. */
ctf_dynhash_t *ctf_link_out_cu_mapping;
libctf, link: add lazy linking: clean up input members: err/warn cleanup This rather large and intertwined pile of changes does three things: First, it transitions from dprintf to ctf_err_warn for things the user might care about: this one file is the major impetus for the ctf_err_warn infrastructure, because things like file names are crucial in linker error messages, and errno values are utterly incapable of communicating them Second, it stabilizes the ctf_link APIs: you can now call ctf_link_add_ctf without a CTF argument (only a NAME), to lazily ctf_open the file with the given NAME when needed, and close it as soon as possible, to save memory. This is not an API change because a null CTF argument was prohibited before now. Since getting CTF directly from files uses ctf_open, passing in only a NAME requires use of libctf, not libctf-nobfd. The linker's behaviour is unchanged, as it still passes in a ctf_archive_t as before. This also let us fix a leak: we were opening ctf_archives and their containing ctf_files, then only closing the files and leaving the archives open. Third, this commit restructures the ctf_link_in_member argument used by the CTF linking machinery and adjusts its users accordingly. We drop two members: - arcname, which is difficult to construct and then only used in error messages (that were only dprintf()ed, so never seen!) - share_mode, since we store the flags passed to ctf_link (including the share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold of it We rename others whose existing names were fairly dreadful: - done_main_member -> done_parent, using consistent terminology for .ctf as the parent of all archive members - main_input_fp -> in_fp_parent, likewise - file_name -> in_file_name, likewise We add one new member, cu_mapped. Finally, we move the various frees of things like mapping table data to the top-level ctf_link, since deduplicating links will want to do that too. include/ * ctf-api.h (ECTF_NEEDSBFD): New. (ECTF_NERR): Adjust. (ctf_link): Rename share_mode arg to flags. libctf/ * Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere. * Makefile.in: Regenerated. * ctf-impl.h (ctf_link_input_name): New. (ctf_file_t) <ctf_link_flags>: New. * ctf-create.c (ctf_serialize): Adjust accordingly. * ctf-link.c: Define ctf_open as weak when PIC. (ctf_arc_close_thunk): Remove unnecessary thunk. (ctf_file_close_thunk): Likewise. (ctf_link_input_name): New. (ctf_link_input_t): New value of the ctf_file_t.ctf_link_input. (ctf_link_input_close): Adjust accordingly. (ctf_link_add_ctf_internal): New, split from... (ctf_link_add_ctf): ... here. Return error if lazy loading of CTF is not possible. Change to just call... (ctf_link_add): ... this new function. (ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the ctf_file_close_thunk. (ctf_link_in_member_cb_arg_t) <file_name> Rename to... <in_file_name>: ... this. <arcname>: Drop. <share_mode>: Likewise (migrated to ctf_link_flags). <done_main_member>: Rename to... <done_parent>: ... this. <main_input_fp>: Rename to... <in_fp_parent>: ... this. <cu_mapped>: New. (ctf_link_one_type): Adjuwt accordingly. Transition to ctf_err_warn, removing a TODO. (ctf_link_one_variable): Note a case too common to warn about. Report in the debug stream if a cu-mapped link prevents addition of a conflicting variable. (ctf_link_one_input_archive_member): Adjust. (ctf_link_lazy_open): New, open a CTF archive for linking when needed. (ctf_link_close_one_input_archive): New, close it again. (ctf_link_one_input_archive): Adjust for lazy opening, member renames, and ctf_err_warn transition. Move the empty_link_type_mapping call to... (ctf_link): ... here. Adjut for renamings and thunk removal. Don't spuriously fail if some input contains no CTF data. (ctf_link_write): ctf_err_warn transition. * libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
/* CTF linker flags. */
int ctf_link_flags;
libctf, link: redo cu-mapping handling Now a bunch of stuff that doesn't apply to ld or any normal use of libctf, piled into one commit so that it's easier to ignore. The cu-mapping machinery associates incoming compilation unit names with outgoing names of CTF dictionaries that should correspond to them, for non-gdb CTF consumers that would like to group multiple TUs into a single child dict if conflicting types are found in it (the existing use case is one kernel module, one child CTF dict, even if the kernel module is composed of multiple CUs). The upcoming deduplicator needs to track not only the mapping from incoming CU name to outgoing dict name, but the inverse mapping from outgoing dict name to incoming CU name, so it can work over every CTF dict we might see in the output and link into it. So rejig the ctf-link machinery to do that. Simultaneously (because they are closely associated and were written at the same time), we add a new CTF_LINK_EMPTY_CU_MAPPINGS flag to ctf_link, which tells the ctf_link machinery to create empty child dicts for each outgoing CU mapping even if no CUs that correspond to it exist in the link. This is a bit (OK, quite a lot) of a waste of space, but some existing consumers require it. (Nobody else should use it.) Its value is not consecutive with existing CTF_LINK flag values because we're about to add more flags that are conceptually closer to the existing ones than this one is. include/ * ctf-api.h (CTF_LINK_EMPTY_CU_MAPPINGS): New. libctf/ * ctf-impl.h (ctf_file_t): Improve comments. <ctf_link_cu_mapping>: Split into... <ctf_link_in_cu_mapping>: ... this... <ctf_link_out_cu_mapping>: ... and this. * ctf-create.c (ctf_serialize): Adjust. * ctf-open.c (ctf_file_close): Likewise. * ctf-link.c (ctf_create_per_cu): Look things up in the in_cu_mapping instead of the cu_mapping. (ctf_link_add_cu_mapping): The deduplicating link will define what happens if many FROMs share a TO. (ctf_link_add_cu_mapping): Create in_cu_mapping and out_cu_mapping. Do not create ctf_link_outputs here any more, or create per-CU dicts here: they are already created when needed. (ctf_link_one_variable): Log a debug message if we skip a variable due to its type being concealed in a CU-mapped link. (This is probably too common a case to make into a warning.) (ctf_link): Create empty per-CU dicts if requested.
2020-06-06 00:36:16 +08:00
/* Allow the caller to change the name of link archive members. */
libctf: add CU-mapping machinery Once the deduplicator is capable of actually detecting conflicting types with the same name (i.e., not yet) we will place such conflicting types, and types that depend on them, into CTF dictionaries that are the child of the main dictionary we usually emit: currently, this will lead to the .ctf section becoming a CTF archive rather than a single dictionary, with the default-named archive member (_CTF_SECTION, or NULL) being the main shared dictionary with most of the types in it. By default, the sections are named after the compilation unit they come from (complete path and all), with the cuname field in the CTF header providing further evidence of the name without requiring the caller to engage in tiresome parsing. But some callers may not wish the mapping from input CU to output sub-dictionary to be purely CU-based. The machinery here allows this to be freely changed, in two ways: - callers can call ctf_link_add_cu_mapping to specify that a single input compilation unit should have its types placed in some other CU if they conflict: the CU will always be created, even if empty, so the consuming program can depend on its existence. You can map multiple input CUs to one output CU to force all their types to be merged together: if some of *those* types conflict, the behaviour is currently unspecified (the new deduplicator will specify it). - callers can call ctf_link_set_memb_name_changer to provide a function which is passed every CTF sub-dictionary name in turn (including _CTF_SECTION) and can return a new name, or NULL if no change is desired. The mapping from input to output names should not map two input names to the same output name: if this happens, the two are not merged but will result in an archive with two members with the same name (technically valid, but it's hard to access the second same-named member: you have to do an iteration over archive members). This is used by the kernel's ctfarchive machinery (not yet upstream) to encode CTF under member names like {module name}.ctf rather than .ctf.CU, but it is anticipated that other large projects may wish to have their own storage for CTF outside of .ctf sections and may wish to have new naming schemes that suit their special-purpose consumers. New in v3. v4: check for strdup failure. v5: fix tabdamage. include/ * ctf-api.h (ctf_link_add_cu_mapping): New. (ctf_link_memb_name_changer_f): New. (ctf_link_set_memb_name_changer): New. libctf/ * ctf-impl.h (ctf_file_t) <ctf_link_cu_mappping>: New. <ctf_link_memb_name_changer>: Likewise. <ctf_link_memb_name_changer_arg>: Likewise. * ctf-create.c (ctf_update): Update accordingly. * ctf-open.c (ctf_file_close): Likewise. * ctf-link.c (ctf_create_per_cu): Apply the cu mapping. (ctf_link_add_cu_mapping): New. (ctf_link_set_memb_name_changer): Likewise. (ctf_change_parent_name): New. (ctf_name_list_accum_cb_arg_t) <dynames>: New, storage for names allocated by the caller's ctf_link_memb_name_changer. <ndynames>: Likewise. (ctf_accumulate_archive_names): Call the ctf_link_memb_name_changer. (ctf_link_write): Likewise (for _CTF_SECTION only): also call ctf_change_parent_name. Free any resulting names.
2019-07-20 21:44:44 +08:00
ctf_link_memb_name_changer_f *ctf_link_memb_name_changer;
void *ctf_link_memb_name_changer_arg; /* Argument for it. */
/* Allow the caller to filter out variables they don't care about. */
ctf_link_variable_filter_f *ctf_link_variable_filter;
void *ctf_link_variable_filter_arg; /* Argument for it. */
libctf: properly handle ctf_add_type of forwards and self-reffing structs The code to handle structures (and unions) that refer to themselves in ctf_add_type is extremely dodgy. It works by looking through the list of not-yet-committed types for a structure with the same name as the structure in question and assuming, if it finds it, that this must be a reference to the same type. This is a linear search that gets ever slower as the dictionary grows, requiring you to call ctf_update at intervals to keep performance tolerable: but if you do that, you run into the problem that if a forward declared before the ctf_update is changed to a structure afterwards, ctf_update explodes. The last commit fixed most of this: this commit can use it, adding a new ctf_add_processing hash that tracks source type IDs that are currently being processed and uses it to avoid infinite recursion rather than the dynamic type list: we split ctf_add_type into a ctf_add_type_internal, so that ctf_add_type itself can become a wrapper that empties out this being-processed hash once the entire recursive type addition is over. Structure additions themselves avoid adding their dependent types quite so much by checking the type mapping and avoiding re-adding types we already know we have added. We also add support for adding forwards to dictionaries that already contain the thing they are a forward to: we just silently return the original type. v4: return existing struct/union/enum types properly, rather than using an uninitialized variable: shrinks sizes of CTF sections back down to roughly where they were in v1/v2 of this patch series. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_file_t) <ctf_add_processing>: New. * ctf-open.c (ctf_file_close): Free it. * ctf-create.c (ctf_serialize): Adjust. (membcmp): When reporting a conflict due to an error, report the error. (ctf_add_type): Turn into a ctf_add_processing wrapper. Rename to... (ctf_add_type_internal): ... this. Hand back types we are already in the middle of adding immediately. Hand back structs/unions with the same number of members immediately. Do not walk the dynamic list. Call ctf_add_type_internal, not ctf_add_type. Handle forwards promoted to other types and the inverse case identically. Add structs to the mapping as soon as we intern them, before they gain any members.
2019-08-08 01:01:08 +08:00
ctf_dynhash_t *ctf_add_processing; /* Types ctf_add_type is working on now. */
/* Atoms table for dedup string storage. All strings in the ctf_dedup_t are
stored here. Only the _alloc copy is allocated or freed: the
ctf_dedup_atoms may be pointed to some other CTF dict, to share its atoms.
We keep the atoms table outside the ctf_dedup so that atoms can be
preserved across multiple similar links, such as when doing cu-mapped
links. */
ctf_dynset_t *ctf_dedup_atoms;
ctf_dynset_t *ctf_dedup_atoms_alloc;
ctf_dedup_t ctf_dedup; /* Deduplicator state. */
char *ctf_tmp_typeslice; /* Storage for slicing up type names. */
size_t ctf_tmp_typeslicelen; /* Size of the typeslice. */
void *ctf_specific; /* Data for ctf_get/setspecific(). */
};
libctf: mmappable archives If you need to store a large number of CTF containers somewhere, this provides a dedicated facility for doing so: an mmappable archive format like a very simple tar or ar without all the system-dependent format horrors or need for heavy file copying, with built-in compression of files above a particular size threshold. libctf automatically mmap()s uncompressed elements of these archives, or uncompresses them, as needed. (If the platform does not support mmap(), copying into dynamically-allocated buffers is used.) Archive iteration operations are partitioned into raw and non-raw forms. Raw operations pass thhe raw archive contents to the callback: non-raw forms open each member with ctf_bufopen() and pass the resulting ctf_file_t to the iterator instead. This lets you manipulate the raw data in the archive, or the contents interpreted as a CTF file, as needed. It is not yet known whether we will store CTF archives in a linked ELF object in one of these (akin to debugdata) or whether they'll get one section per TU plus one parent container for types shared between them. (In the case of ELF objects with very large numbers of TUs, an archive of all of them would seem preferable, so we might just use an archive, and add lzma support so you can assume that .gnu_debugdata and .ctf are compressed using the same algorithm if both are present.) To make usage easier, the ctf_archive_t is not the on-disk representation but an abstraction over both ctf_file_t's and archives of many ctf_file_t's: users see both CTF archives and raw CTF files as ctf_archive_t's upon opening, the only difference being that a raw CTF file has only a single "archive member", named ".ctf" (the default if a null pointer is passed in as the name). The next commit will make use of this facility, in addition to providing the public interface to actually open archives. (In the future, it should be possible to have all CTF sections in an ELF file appear as an "archive" in the same fashion.) This machinery is also used to allow library-internal creators of ctf_archive_t's (such as the next commit) to stash away an ELF string and symbol table, so that all opens of members in a given archive will use them. This lets CTF archives exploit the ELF string and symbol table just like raw CTF files can. (All this leads to somewhat confusing type naming. The ctf_archive_t is a typedef for the opaque internal type, struct ctf_archive_internal: the non-internal "struct ctf_archive" is the on-disk structure meant for other libraries manipulating CTF files. It is probably clearest to use the struct name for struct ctf_archive_internal inside the program, and the typedef names outside.) libctf/ * ctf-archive.c: New. * ctf-impl.h (ctf_archive_internal): New type. (ctf_arc_open_internal): New declaration. (ctf_arc_bufopen): Likewise. (ctf_arc_close_internal): Likewise. include/ * ctf.h (CTFA_MAGIC): New. (struct ctf_archive): New. (struct ctf_archive_modent): Likewise. * ctf-api.h (ctf_archive_member_f): New. (ctf_archive_raw_member_f): Likewise. (ctf_arc_write): Likewise. (ctf_arc_close): Likewise. (ctf_arc_open_by_name): Likewise. (ctf_archive_iter): Likewise. (ctf_archive_raw_iter): Likewise. (ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
/* An abstraction over both a ctf_file_t and a ctf_archive_t. */
struct ctf_archive_internal
{
int ctfi_is_archive;
int ctfi_unmap_on_close;
libctf: mmappable archives If you need to store a large number of CTF containers somewhere, this provides a dedicated facility for doing so: an mmappable archive format like a very simple tar or ar without all the system-dependent format horrors or need for heavy file copying, with built-in compression of files above a particular size threshold. libctf automatically mmap()s uncompressed elements of these archives, or uncompresses them, as needed. (If the platform does not support mmap(), copying into dynamically-allocated buffers is used.) Archive iteration operations are partitioned into raw and non-raw forms. Raw operations pass thhe raw archive contents to the callback: non-raw forms open each member with ctf_bufopen() and pass the resulting ctf_file_t to the iterator instead. This lets you manipulate the raw data in the archive, or the contents interpreted as a CTF file, as needed. It is not yet known whether we will store CTF archives in a linked ELF object in one of these (akin to debugdata) or whether they'll get one section per TU plus one parent container for types shared between them. (In the case of ELF objects with very large numbers of TUs, an archive of all of them would seem preferable, so we might just use an archive, and add lzma support so you can assume that .gnu_debugdata and .ctf are compressed using the same algorithm if both are present.) To make usage easier, the ctf_archive_t is not the on-disk representation but an abstraction over both ctf_file_t's and archives of many ctf_file_t's: users see both CTF archives and raw CTF files as ctf_archive_t's upon opening, the only difference being that a raw CTF file has only a single "archive member", named ".ctf" (the default if a null pointer is passed in as the name). The next commit will make use of this facility, in addition to providing the public interface to actually open archives. (In the future, it should be possible to have all CTF sections in an ELF file appear as an "archive" in the same fashion.) This machinery is also used to allow library-internal creators of ctf_archive_t's (such as the next commit) to stash away an ELF string and symbol table, so that all opens of members in a given archive will use them. This lets CTF archives exploit the ELF string and symbol table just like raw CTF files can. (All this leads to somewhat confusing type naming. The ctf_archive_t is a typedef for the opaque internal type, struct ctf_archive_internal: the non-internal "struct ctf_archive" is the on-disk structure meant for other libraries manipulating CTF files. It is probably clearest to use the struct name for struct ctf_archive_internal inside the program, and the typedef names outside.) libctf/ * ctf-archive.c: New. * ctf-impl.h (ctf_archive_internal): New type. (ctf_arc_open_internal): New declaration. (ctf_arc_bufopen): Likewise. (ctf_arc_close_internal): Likewise. include/ * ctf.h (CTFA_MAGIC): New. (struct ctf_archive): New. (struct ctf_archive_modent): Likewise. * ctf-api.h (ctf_archive_member_f): New. (ctf_archive_raw_member_f): Likewise. (ctf_arc_write): Likewise. (ctf_arc_close): Likewise. (ctf_arc_open_by_name): Likewise. (ctf_archive_iter): Likewise. (ctf_archive_raw_iter): Likewise. (ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
ctf_file_t *ctfi_file;
struct ctf_archive *ctfi_archive;
ctf_sect_t ctfi_symsect;
ctf_sect_t ctfi_strsect;
libctf, binutils: support CTF archives like objdump objdump and readelf have one major CTF-related behavioural difference: objdump can read .ctf sections that contain CTF archives and extract and dump their members, while readelf cannot. Since the linker often emits CTF archives, this means that readelf intermittently and (from the user's perspective) randomly fails to read CTF in files that ld emits, with a confusing error message wrongly claiming that the CTF content is corrupt. This is purely because the archive-opening code in libctf was needlessly tangled up with the BFD code, so readelf couldn't use it. Here, we disentangle it, moving ctf_new_archive_internal from ctf-open-bfd.c into ctf-archive.c and merging it with the helper function in ctf-archive.c it was already using. We add a new public API function ctf_arc_bufopen, that looks very like ctf_bufopen but returns an archive given suitable section data rather than a ctf_file_t: the archive is a ctf_archive_t, so it can be called on raw CTF dictionaries (with no archive present) and will return a single-member synthetic "archive". There is a tiny lifetime tweak here: before now, the archive code could assume that the symbol section in the ctf_archive_internal wrapper structure was always owned by BFD if it was present and should always be freed: now, the caller can pass one in via ctf_arc_bufopen, wihch has the usual lifetime rules for such sections (caller frees): so we add an extra field to track whether this is an internal call from ctf-open-bfd, in which case we still free the symbol section. include/ * ctf-api.h (ctf_arc_bufopen): New. libctf/ * ctf-impl.h (ctf_new_archive_internal): Declare. (ctf_arc_bufopen): Remove. (ctf_archive_internal) <ctfi_free_symsect>: New. * ctf-archive.c (ctf_arc_close): Use it. (ctf_arc_bufopen): Fuse into... (ctf_new_archive_internal): ... this, moved across from... * ctf-open-bfd.c: ... here. (ctf_bfdopen_ctfsect): Use ctf_arc_bufopen. * libctf.ver: Add it. binutils/ * readelf.c (dump_section_as_ctf): Support .ctf archives using ctf_arc_bufopen. Automatically load the .ctf member of such archives as the parent of all other members, unless specifically overridden via --ctf-parent. Split out dumping code into... (dump_ctf_archive_member): ... here, as in objdump, and call it once per archive member. (dump_ctf_indent_lines): Code style fix.
2019-12-13 20:01:12 +08:00
int ctfi_free_symsect;
int ctfi_free_strsect;
libctf: mmappable archives If you need to store a large number of CTF containers somewhere, this provides a dedicated facility for doing so: an mmappable archive format like a very simple tar or ar without all the system-dependent format horrors or need for heavy file copying, with built-in compression of files above a particular size threshold. libctf automatically mmap()s uncompressed elements of these archives, or uncompresses them, as needed. (If the platform does not support mmap(), copying into dynamically-allocated buffers is used.) Archive iteration operations are partitioned into raw and non-raw forms. Raw operations pass thhe raw archive contents to the callback: non-raw forms open each member with ctf_bufopen() and pass the resulting ctf_file_t to the iterator instead. This lets you manipulate the raw data in the archive, or the contents interpreted as a CTF file, as needed. It is not yet known whether we will store CTF archives in a linked ELF object in one of these (akin to debugdata) or whether they'll get one section per TU plus one parent container for types shared between them. (In the case of ELF objects with very large numbers of TUs, an archive of all of them would seem preferable, so we might just use an archive, and add lzma support so you can assume that .gnu_debugdata and .ctf are compressed using the same algorithm if both are present.) To make usage easier, the ctf_archive_t is not the on-disk representation but an abstraction over both ctf_file_t's and archives of many ctf_file_t's: users see both CTF archives and raw CTF files as ctf_archive_t's upon opening, the only difference being that a raw CTF file has only a single "archive member", named ".ctf" (the default if a null pointer is passed in as the name). The next commit will make use of this facility, in addition to providing the public interface to actually open archives. (In the future, it should be possible to have all CTF sections in an ELF file appear as an "archive" in the same fashion.) This machinery is also used to allow library-internal creators of ctf_archive_t's (such as the next commit) to stash away an ELF string and symbol table, so that all opens of members in a given archive will use them. This lets CTF archives exploit the ELF string and symbol table just like raw CTF files can. (All this leads to somewhat confusing type naming. The ctf_archive_t is a typedef for the opaque internal type, struct ctf_archive_internal: the non-internal "struct ctf_archive" is the on-disk structure meant for other libraries manipulating CTF files. It is probably clearest to use the struct name for struct ctf_archive_internal inside the program, and the typedef names outside.) libctf/ * ctf-archive.c: New. * ctf-impl.h (ctf_archive_internal): New type. (ctf_arc_open_internal): New declaration. (ctf_arc_bufopen): Likewise. (ctf_arc_close_internal): Likewise. include/ * ctf.h (CTFA_MAGIC): New. (struct ctf_archive): New. (struct ctf_archive_modent): Likewise. * ctf-api.h (ctf_archive_member_f): New. (ctf_archive_raw_member_f): Likewise. (ctf_arc_write): Likewise. (ctf_arc_close): Likewise. (ctf_arc_open_by_name): Likewise. (ctf_archive_iter): Likewise. (ctf_archive_raw_iter): Likewise. (ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
void *ctfi_data;
bfd *ctfi_abfd; /* Optional source of section data. */
void (*ctfi_bfd_close) (struct ctf_archive_internal *);
libctf: mmappable archives If you need to store a large number of CTF containers somewhere, this provides a dedicated facility for doing so: an mmappable archive format like a very simple tar or ar without all the system-dependent format horrors or need for heavy file copying, with built-in compression of files above a particular size threshold. libctf automatically mmap()s uncompressed elements of these archives, or uncompresses them, as needed. (If the platform does not support mmap(), copying into dynamically-allocated buffers is used.) Archive iteration operations are partitioned into raw and non-raw forms. Raw operations pass thhe raw archive contents to the callback: non-raw forms open each member with ctf_bufopen() and pass the resulting ctf_file_t to the iterator instead. This lets you manipulate the raw data in the archive, or the contents interpreted as a CTF file, as needed. It is not yet known whether we will store CTF archives in a linked ELF object in one of these (akin to debugdata) or whether they'll get one section per TU plus one parent container for types shared between them. (In the case of ELF objects with very large numbers of TUs, an archive of all of them would seem preferable, so we might just use an archive, and add lzma support so you can assume that .gnu_debugdata and .ctf are compressed using the same algorithm if both are present.) To make usage easier, the ctf_archive_t is not the on-disk representation but an abstraction over both ctf_file_t's and archives of many ctf_file_t's: users see both CTF archives and raw CTF files as ctf_archive_t's upon opening, the only difference being that a raw CTF file has only a single "archive member", named ".ctf" (the default if a null pointer is passed in as the name). The next commit will make use of this facility, in addition to providing the public interface to actually open archives. (In the future, it should be possible to have all CTF sections in an ELF file appear as an "archive" in the same fashion.) This machinery is also used to allow library-internal creators of ctf_archive_t's (such as the next commit) to stash away an ELF string and symbol table, so that all opens of members in a given archive will use them. This lets CTF archives exploit the ELF string and symbol table just like raw CTF files can. (All this leads to somewhat confusing type naming. The ctf_archive_t is a typedef for the opaque internal type, struct ctf_archive_internal: the non-internal "struct ctf_archive" is the on-disk structure meant for other libraries manipulating CTF files. It is probably clearest to use the struct name for struct ctf_archive_internal inside the program, and the typedef names outside.) libctf/ * ctf-archive.c: New. * ctf-impl.h (ctf_archive_internal): New type. (ctf_arc_open_internal): New declaration. (ctf_arc_bufopen): Likewise. (ctf_arc_close_internal): Likewise. include/ * ctf.h (CTFA_MAGIC): New. (struct ctf_archive): New. (struct ctf_archive_modent): Likewise. * ctf-api.h (ctf_archive_member_f): New. (ctf_archive_raw_member_f): Likewise. (ctf_arc_write): Likewise. (ctf_arc_close): Likewise. (ctf_arc_open_by_name): Likewise. (ctf_archive_iter): Likewise. (ctf_archive_raw_iter): Likewise. (ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
};
libctf, next: introduce new class of easier-to-use iterators The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This *works*, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly. The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it. Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The *_next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict: ctf_next_t *i = NULL; int *hidden; ctf_id_t id; while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR) { /* do something with 'hidden' and 'id' */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Here we are walking through the members of a struct with CTF ID 'struct_type': ctf_next_t *i = NULL; ssize_t offset; const char *name; ctf_id_t membtype; while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0 { /* do something with offset, name, and membtype */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback. ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteartion ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all). Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way: while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0) If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back. Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down. Iterators available in the public API so far (all tested in actual use in the deduplicator): /* Iterate over the members of a STRUCT or UNION, returning each member's offset and optionally name and member type in turn. On end-of-iteration, returns -1. */ ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype); /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or NULL at end of iteration or error, and optionally passing back the enumerand's integer VALue. */ const char * ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val); /* Iterate over every type in the given CTF container (not including parents), optionally including non-user-visible types, returning each type ID and optionally the hidden flag in turn. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden); /* Iterate over every variable in the given CTF container, in arbitrary order, returning the name and type of each variable in turn. The NAME argument is not optional. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name); /* Iterate over all CTF files in an archive, returning each dict in turn as a ctf_file_t, and NULL on error or end of iteration. It is the caller's responsibility to close it. Parent dicts may be skipped. Regardless of whether they are skipped or not, the caller must ctf_import the parent if need be. */ ctf_file_t * ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp); ctf_label_next is prototyped but not implemented yet. include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New. * ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
2020-06-03 22:13:24 +08:00
/* Iterator state for the *_next() functions. */
/* A single hash key/value pair. */
typedef struct ctf_next_hkv
{
void *hkv_key;
void *hkv_value;
} ctf_next_hkv_t;
libctf, next: introduce new class of easier-to-use iterators The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This *works*, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly. The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it. Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The *_next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict: ctf_next_t *i = NULL; int *hidden; ctf_id_t id; while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR) { /* do something with 'hidden' and 'id' */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Here we are walking through the members of a struct with CTF ID 'struct_type': ctf_next_t *i = NULL; ssize_t offset; const char *name; ctf_id_t membtype; while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0 { /* do something with offset, name, and membtype */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback. ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteartion ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all). Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way: while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0) If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back. Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down. Iterators available in the public API so far (all tested in actual use in the deduplicator): /* Iterate over the members of a STRUCT or UNION, returning each member's offset and optionally name and member type in turn. On end-of-iteration, returns -1. */ ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype); /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or NULL at end of iteration or error, and optionally passing back the enumerand's integer VALue. */ const char * ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val); /* Iterate over every type in the given CTF container (not including parents), optionally including non-user-visible types, returning each type ID and optionally the hidden flag in turn. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden); /* Iterate over every variable in the given CTF container, in arbitrary order, returning the name and type of each variable in turn. The NAME argument is not optional. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name); /* Iterate over all CTF files in an archive, returning each dict in turn as a ctf_file_t, and NULL on error or end of iteration. It is the caller's responsibility to close it. Parent dicts may be skipped. Regardless of whether they are skipped or not, the caller must ctf_import the parent if need be. */ ctf_file_t * ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp); ctf_label_next is prototyped but not implemented yet. include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New. * ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
2020-06-03 22:13:24 +08:00
struct ctf_next
{
void (*ctn_iter_fun) (void);
ctf_id_t ctn_type;
ssize_t ctn_size;
ssize_t ctn_increment;
uint32_t ctn_n;
/* We can save space on this side of things by noting that a container is
either dynamic or not, as a whole, and a given iterator can only iterate
over one kind of thing at once: so we can overlap the DTD and non-DTD
members, and the structure, variable and enum members, etc. */
union
{
const ctf_member_t *ctn_mp;
const ctf_lmember_t *ctn_lmp;
const ctf_dmdef_t *ctn_dmd;
const ctf_enum_t *ctn_en;
const ctf_dvdef_t *ctn_dvd;
ctf_next_hkv_t *ctn_sorted_hkv;
void **ctn_hash_slot;
libctf, next: introduce new class of easier-to-use iterators The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This *works*, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly. The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it. Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The *_next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict: ctf_next_t *i = NULL; int *hidden; ctf_id_t id; while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR) { /* do something with 'hidden' and 'id' */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Here we are walking through the members of a struct with CTF ID 'struct_type': ctf_next_t *i = NULL; ssize_t offset; const char *name; ctf_id_t membtype; while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0 { /* do something with offset, name, and membtype */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback. ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteartion ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all). Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way: while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0) If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back. Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down. Iterators available in the public API so far (all tested in actual use in the deduplicator): /* Iterate over the members of a STRUCT or UNION, returning each member's offset and optionally name and member type in turn. On end-of-iteration, returns -1. */ ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype); /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or NULL at end of iteration or error, and optionally passing back the enumerand's integer VALue. */ const char * ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val); /* Iterate over every type in the given CTF container (not including parents), optionally including non-user-visible types, returning each type ID and optionally the hidden flag in turn. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden); /* Iterate over every variable in the given CTF container, in arbitrary order, returning the name and type of each variable in turn. The NAME argument is not optional. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name); /* Iterate over all CTF files in an archive, returning each dict in turn as a ctf_file_t, and NULL on error or end of iteration. It is the caller's responsibility to close it. Parent dicts may be skipped. Regardless of whether they are skipped or not, the caller must ctf_import the parent if need be. */ ctf_file_t * ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp); ctf_label_next is prototyped but not implemented yet. include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New. * ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
2020-06-03 22:13:24 +08:00
} u;
/* This union is of various sorts of container we can iterate over:
currently dictionaries and archives, dynhashes, and dynsets. */
libctf, next: introduce new class of easier-to-use iterators The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This *works*, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly. The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it. Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The *_next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict: ctf_next_t *i = NULL; int *hidden; ctf_id_t id; while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR) { /* do something with 'hidden' and 'id' */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Here we are walking through the members of a struct with CTF ID 'struct_type': ctf_next_t *i = NULL; ssize_t offset; const char *name; ctf_id_t membtype; while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0 { /* do something with offset, name, and membtype */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback. ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteartion ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all). Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way: while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0) If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back. Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down. Iterators available in the public API so far (all tested in actual use in the deduplicator): /* Iterate over the members of a STRUCT or UNION, returning each member's offset and optionally name and member type in turn. On end-of-iteration, returns -1. */ ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype); /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or NULL at end of iteration or error, and optionally passing back the enumerand's integer VALue. */ const char * ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val); /* Iterate over every type in the given CTF container (not including parents), optionally including non-user-visible types, returning each type ID and optionally the hidden flag in turn. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden); /* Iterate over every variable in the given CTF container, in arbitrary order, returning the name and type of each variable in turn. The NAME argument is not optional. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name); /* Iterate over all CTF files in an archive, returning each dict in turn as a ctf_file_t, and NULL on error or end of iteration. It is the caller's responsibility to close it. Parent dicts may be skipped. Regardless of whether they are skipped or not, the caller must ctf_import the parent if need be. */ ctf_file_t * ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp); ctf_label_next is prototyped but not implemented yet. include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New. * ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
2020-06-03 22:13:24 +08:00
union
{
const ctf_file_t *ctn_fp;
const ctf_archive_t *ctn_arc;
const ctf_dynhash_t *ctn_h;
const ctf_dynset_t *ctn_s;
libctf, next: introduce new class of easier-to-use iterators The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This *works*, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly. The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it. Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The *_next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict: ctf_next_t *i = NULL; int *hidden; ctf_id_t id; while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR) { /* do something with 'hidden' and 'id' */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Here we are walking through the members of a struct with CTF ID 'struct_type': ctf_next_t *i = NULL; ssize_t offset; const char *name; ctf_id_t membtype; while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0 { /* do something with offset, name, and membtype */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback. ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteartion ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all). Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way: while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0) If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back. Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down. Iterators available in the public API so far (all tested in actual use in the deduplicator): /* Iterate over the members of a STRUCT or UNION, returning each member's offset and optionally name and member type in turn. On end-of-iteration, returns -1. */ ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype); /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or NULL at end of iteration or error, and optionally passing back the enumerand's integer VALue. */ const char * ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val); /* Iterate over every type in the given CTF container (not including parents), optionally including non-user-visible types, returning each type ID and optionally the hidden flag in turn. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden); /* Iterate over every variable in the given CTF container, in arbitrary order, returning the name and type of each variable in turn. The NAME argument is not optional. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name); /* Iterate over all CTF files in an archive, returning each dict in turn as a ctf_file_t, and NULL on error or end of iteration. It is the caller's responsibility to close it. Parent dicts may be skipped. Regardless of whether they are skipped or not, the caller must ctf_import the parent if need be. */ ctf_file_t * ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp); ctf_label_next is prototyped but not implemented yet. include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New. * ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
2020-06-03 22:13:24 +08:00
} cu;
};
/* Return x rounded up to an alignment boundary.
eg, P2ROUNDUP(0x1234, 0x100) == 0x1300 (0x13*align)
eg, P2ROUNDUP(0x5600, 0x100) == 0x5600 (0x56*align) */
#define P2ROUNDUP(x, align) (-(-(x) & -(align)))
/* * If an offs is not aligned already then round it up and align it. */
#define LCTF_ALIGN_OFFS(offs, align) ((offs + (align - 1)) & ~(align - 1))
#define LCTF_TYPE_ISPARENT(fp, id) ((id) <= fp->ctf_parmax)
#define LCTF_TYPE_ISCHILD(fp, id) ((id) > fp->ctf_parmax)
#define LCTF_TYPE_TO_INDEX(fp, id) ((id) & (fp->ctf_parmax))
#define LCTF_INDEX_TO_TYPE(fp, id, child) (child ? ((id) | (fp->ctf_parmax+1)) : \
(id))
#define LCTF_INDEX_TO_TYPEPTR(fp, i) \
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
((fp->ctf_flags & LCTF_RDWR) ? \
&(ctf_dtd_lookup (fp, LCTF_INDEX_TO_TYPE \
(fp, i, fp->ctf_flags & LCTF_CHILD))->dtd_data) : \
(ctf_type_t *)((uintptr_t)(fp)->ctf_buf + (fp)->ctf_txlate[(i)]))
#define LCTF_INFO_KIND(fp, info) ((fp)->ctf_fileops->ctfo_get_kind(info))
#define LCTF_INFO_ISROOT(fp, info) ((fp)->ctf_fileops->ctfo_get_root(info))
#define LCTF_INFO_VLEN(fp, info) ((fp)->ctf_fileops->ctfo_get_vlen(info))
#define LCTF_VBYTES(fp, kind, size, vlen) \
((fp)->ctf_fileops->ctfo_get_vbytes(kind, size, vlen))
#define LCTF_CHILD 0x0001 /* CTF container is a child */
#define LCTF_RDWR 0x0002 /* CTF container is writable */
#define LCTF_DIRTY 0x0004 /* CTF container has been modified */
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
extern ctf_names_t *ctf_name_table (ctf_file_t *, int);
extern const ctf_type_t *ctf_lookup_by_id (ctf_file_t **, ctf_id_t);
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
extern ctf_id_t ctf_lookup_by_rawname (ctf_file_t *, int, const char *);
extern ctf_id_t ctf_lookup_by_rawhash (ctf_file_t *, ctf_names_t *, const char *);
extern void ctf_set_ctl_hashes (ctf_file_t *);
libctf, next: introduce new class of easier-to-use iterators The libctf machinery currently only provides one way to iterate over its data structures: ctf_*_iter functions that take a callback and an arg and repeatedly call it. This *works*, but if you are doing a lot of iteration it is really quite inconvenient: you have to package up your local variables into structures over and over again and spawn lots of little functions even if it would be clearer in a single run of code. Look at ctf-string.c for an extreme example of how unreadable this can get, with three-line-long functions proliferating wildly. The deduplicator takes this to the Nth level. It iterates over a whole bunch of things: if we'd had to use _iter-class iterators for all of them there would be twenty additional functions in the deduplicator alone, for no other reason than that the iterator API requires it. Let's do something better. strtok_r gives us half the design: generators in a number of other languages give us the other half. The *_next API allows you to iterate over CTF-like entities in a single function using a normal while loop. e.g. here we are iterating over all the types in a dict: ctf_next_t *i = NULL; int *hidden; ctf_id_t id; while ((id = ctf_type_next (fp, &i, &hidden, 1)) != CTF_ERR) { /* do something with 'hidden' and 'id' */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Here we are walking through the members of a struct with CTF ID 'struct_type': ctf_next_t *i = NULL; ssize_t offset; const char *name; ctf_id_t membtype; while ((offset = ctf_member_next (fp, struct_type, &i, &name, &membtype)) >= 0 { /* do something with offset, name, and membtype */ } if (ctf_errno (fp) != ECTF_NEXT_END) /* iteration error */ Like every other while loop, this means you have access to all the local variables outside the loop while inside it, with no need to tiresomely package things up in structures, move the body of the loop into a separate function, etc, as you would with an iterator taking a callback. ctf_*_next allocates 'i' for you on first entry (when it must be NULL), and frees and NULLs it and returns a _next-dependent flag value when the iteration is over: the fp errno is set to ECTF_NEXT_END when the iteartion ends normally. If you want to exit early, call ctf_next_destroy on the iterator. You can copy iterators using ctf_next_copy, which copies their current iteration position so you can remember loop positions and go back to them later (or ctf_next_destroy them if you don't need them after all). Each _next function returns an always-likely-to-be-useful property of the thing being iterated over, and takes pointers to parameters for the others: with very few exceptions all those parameters can be NULLs if you're not interested in them, so e.g. you can iterate over only the offsets of members of a structure this way: while ((offset = ctf_member_next (fp, struct_id, &i, NULL, NULL)) >= 0) If you pass an iterator in use by one iteration function to another one, you get the new error ECTF_NEXT_WRONGFUN back; if you try to change ctf_file_t in mid-iteration, you get ECTF_NEXT_WRONGFP back. Internally the ctf_next_t remembers the iteration function in use, various sizes and increments useful for almost all iterations, then uses unions to overlap the actual entities being iterated over to keep ctf_next_t size down. Iterators available in the public API so far (all tested in actual use in the deduplicator): /* Iterate over the members of a STRUCT or UNION, returning each member's offset and optionally name and member type in turn. On end-of-iteration, returns -1. */ ssize_t ctf_member_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, const char **name, ctf_id_t *membtype); /* Iterate over the members of an enum TYPE, returning each enumerand's NAME or NULL at end of iteration or error, and optionally passing back the enumerand's integer VALue. */ const char * ctf_enum_next (ctf_file_t *fp, ctf_id_t type, ctf_next_t **it, int *val); /* Iterate over every type in the given CTF container (not including parents), optionally including non-user-visible types, returning each type ID and optionally the hidden flag in turn. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_type_next (ctf_file_t *fp, ctf_next_t **it, int *flag, int want_hidden); /* Iterate over every variable in the given CTF container, in arbitrary order, returning the name and type of each variable in turn. The NAME argument is not optional. Returns CTF_ERR on end of iteration or error. */ ctf_id_t ctf_variable_next (ctf_file_t *fp, ctf_next_t **it, const char **name); /* Iterate over all CTF files in an archive, returning each dict in turn as a ctf_file_t, and NULL on error or end of iteration. It is the caller's responsibility to close it. Parent dicts may be skipped. Regardless of whether they are skipped or not, the caller must ctf_import the parent if need be. */ ctf_file_t * ctf_archive_next (const ctf_archive_t *wrapper, ctf_next_t **it, const char **name, int skip_parent, int *errp); ctf_label_next is prototyped but not implemented yet. include/ * ctf-api.h (ECTF_NEXT_END): New error. (ECTF_NEXT_WRONGFUN): Likewise. (ECTF_NEXT_WRONGFP): Likewise. (ECTF_NERR): Adjust. (ctf_next_t): New. (ctf_next_create): New prototype. (ctf_next_destroy): Likewise. (ctf_next_copy): Likewise. (ctf_member_next): Likewise. (ctf_enum_next): Likewise. (ctf_type_next): Likewise. (ctf_label_next): Likewise. (ctf_variable_next): Likewise. libctf/ * ctf-impl.h (ctf_next): New. (ctf_get_dict): New prototype. * ctf-lookup.c (ctf_get_dict): New, split out of... (ctf_lookup_by_id): ... here. * ctf-util.c (ctf_next_create): New. (ctf_next_destroy): New. (ctf_next_copy): New. * ctf-types.c (includes): Add <assert.h>. (ctf_member_next): New. (ctf_enum_next): New. (ctf_type_iter): Document the lack of iteration over parent types. (ctf_type_next): New. (ctf_variable_next): New. * ctf-archive.c (ctf_archive_next): New. * libctf.ver: Add new public functions.
2020-06-03 22:13:24 +08:00
extern ctf_file_t *ctf_get_dict (ctf_file_t *fp, ctf_id_t type);
typedef unsigned int (*ctf_hash_fun) (const void *ptr);
extern unsigned int ctf_hash_integer (const void *ptr);
extern unsigned int ctf_hash_string (const void *ptr);
extern unsigned int ctf_hash_type_key (const void *ptr);
extern unsigned int ctf_hash_type_id_key (const void *ptr);
typedef int (*ctf_hash_eq_fun) (const void *, const void *);
extern int ctf_hash_eq_integer (const void *, const void *);
extern int ctf_hash_eq_string (const void *, const void *);
extern int ctf_hash_eq_type_key (const void *, const void *);
extern int ctf_hash_eq_type_id_key (const void *, const void *);
libctf, hash: introduce the ctf_dynset There are many places in the deduplicator which use hashtables as tiny sets: keys with no value (and usually, but not always, no freeing function) often with only one or a few members. For each of these, even after the last change to not store the freeing functions, we are storing a little malloced block for each item just to track the key/value pair, and a little malloced block for the hash table itself just to track the freeing function because we can't use libiberty hashtab's freeing function because we are using that to free the little malloced per-item block. If we only have a key, we don't need any of that: we can ditch the per-malloced block because we don't have a value, and we can ditch the per-hashtab structure because we don't need to independently track the freeing functions since libiberty hashtab is doing it for us. That means we don't need an owner field in the (now nonexistent) item block either. Roughly speaking, this datatype saves about 25% in time and 20% in peak memory usage for normal links, even fairly big ones. So this might seem redundant, but it's really worth it. Instead of a _lookup function, a dynset has two distinct functions: ctf_dynset_exists, which returns true or false and an optional pointer to the set member, and ctf_dynhash_lookup_any, which is used if all members of the set are expected to be equivalent and we just want *any* member and we don't care which one. There is no iterator in this set of functions, not because we don't iterate over dynset members -- we do, a lot -- but because the iterator here is a member of an entirely new family of much more convenient iteration functions, introduced in the next commit. libctf/ * ctf-hash.c (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (DYNSET_EMPTY_ENTRY_REPLACEMENT): New. (DYNSET_DELETED_ENTRY_REPLACEMENT): New. (key_to_internal): New. (internal_to_key): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. (ctf_hash_insert_type): Coding style. (ctf_hash_define_type): Likewise. * ctf-impl.h (ctf_dynset_t): New. (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. * ctf-inlines.h (ctf_dynset_cinsert): New.
2020-06-03 05:26:38 +08:00
extern int ctf_dynset_eq_string (const void *, const void *);
typedef void (*ctf_hash_free_fun) (void *);
typedef void (*ctf_hash_iter_f) (void *key, void *value, void *arg);
typedef int (*ctf_hash_iter_remove_f) (void *key, void *value, void *arg);
typedef int (*ctf_hash_iter_find_f) (void *key, void *value, void *arg);
typedef int (*ctf_hash_sort_f) (const ctf_next_hkv_t *, const ctf_next_hkv_t *,
void *arg);
extern ctf_hash_t *ctf_hash_create (unsigned long, ctf_hash_fun, ctf_hash_eq_fun);
extern int ctf_hash_insert_type (ctf_hash_t *, ctf_file_t *, uint32_t, uint32_t);
extern int ctf_hash_define_type (ctf_hash_t *, ctf_file_t *, uint32_t, uint32_t);
extern ctf_id_t ctf_hash_lookup_type (ctf_hash_t *, ctf_file_t *, const char *);
extern uint32_t ctf_hash_size (const ctf_hash_t *);
extern void ctf_hash_destroy (ctf_hash_t *);
extern ctf_dynhash_t *ctf_dynhash_create (ctf_hash_fun, ctf_hash_eq_fun,
ctf_hash_free_fun, ctf_hash_free_fun);
extern int ctf_dynhash_insert (ctf_dynhash_t *, void *, void *);
extern void ctf_dynhash_remove (ctf_dynhash_t *, const void *);
extern size_t ctf_dynhash_elements (ctf_dynhash_t *);
libctf: map from old to corresponding newly-added types in ctf_add_type This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id) and get told what type ID the corresponding type has in the target ctf_file_t. This works even if it was added by a recursive call, and because it is stored in the target ctf_file_t it works even if we had to add one type to multiple ctf_file_t's as part of conflicting type handling. We empty out this mapping after every archive is linked: because it maps input to output fps, and we only visit each input fp once, its contents are rendered entirely useless every time the source fp changes. v3: add several missing mapping additions. Add ctf_dynhash_empty, and empty after every input archive. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping. (struct ctf_link_type_mapping_key): New. (ctf_hash_type_mapping_key): Likewise. (ctf_hash_eq_type_mapping_key): Likewise. (ctf_add_type_mapping): Likewise. (ctf_type_mapping): Likewise. (ctf_dynhash_empty): Likewise. * ctf-open.c (ctf_file_close): Update accordingly. * ctf-create.c (ctf_update): Likewise. (ctf_add_type): Populate the mapping. * ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key. (ctf_hash_eq_type_mapping_key): Check the key for equality. (ctf_dynhash_insert): Fix comment typo. (ctf_dynhash_empty): New. * ctf-link.c (ctf_add_type_mapping): New. (ctf_type_mapping): Likewise. (empty_link_type_mapping): New. (ctf_link_one_input_archive): Call it.
2019-07-14 04:31:26 +08:00
extern void ctf_dynhash_empty (ctf_dynhash_t *);
extern void *ctf_dynhash_lookup (ctf_dynhash_t *, const void *);
extern int ctf_dynhash_lookup_kv (ctf_dynhash_t *, const void *key,
const void **orig_key, void **value);
extern void ctf_dynhash_destroy (ctf_dynhash_t *);
extern void ctf_dynhash_iter (ctf_dynhash_t *, ctf_hash_iter_f, void *);
extern void ctf_dynhash_iter_remove (ctf_dynhash_t *, ctf_hash_iter_remove_f,
void *);
extern void *ctf_dynhash_iter_find (ctf_dynhash_t *, ctf_hash_iter_find_f,
void *);
extern int ctf_dynhash_next (ctf_dynhash_t *, ctf_next_t **,
void **key, void **value);
extern int ctf_dynhash_next_sorted (ctf_dynhash_t *, ctf_next_t **,
void **key, void **value, ctf_hash_sort_f,
void *);
libctf, hash: introduce the ctf_dynset There are many places in the deduplicator which use hashtables as tiny sets: keys with no value (and usually, but not always, no freeing function) often with only one or a few members. For each of these, even after the last change to not store the freeing functions, we are storing a little malloced block for each item just to track the key/value pair, and a little malloced block for the hash table itself just to track the freeing function because we can't use libiberty hashtab's freeing function because we are using that to free the little malloced per-item block. If we only have a key, we don't need any of that: we can ditch the per-malloced block because we don't have a value, and we can ditch the per-hashtab structure because we don't need to independently track the freeing functions since libiberty hashtab is doing it for us. That means we don't need an owner field in the (now nonexistent) item block either. Roughly speaking, this datatype saves about 25% in time and 20% in peak memory usage for normal links, even fairly big ones. So this might seem redundant, but it's really worth it. Instead of a _lookup function, a dynset has two distinct functions: ctf_dynset_exists, which returns true or false and an optional pointer to the set member, and ctf_dynhash_lookup_any, which is used if all members of the set are expected to be equivalent and we just want *any* member and we don't care which one. There is no iterator in this set of functions, not because we don't iterate over dynset members -- we do, a lot -- but because the iterator here is a member of an entirely new family of much more convenient iteration functions, introduced in the next commit. libctf/ * ctf-hash.c (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (DYNSET_EMPTY_ENTRY_REPLACEMENT): New. (DYNSET_DELETED_ENTRY_REPLACEMENT): New. (key_to_internal): New. (internal_to_key): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. (ctf_hash_insert_type): Coding style. (ctf_hash_define_type): Likewise. * ctf-impl.h (ctf_dynset_t): New. (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. * ctf-inlines.h (ctf_dynset_cinsert): New.
2020-06-03 05:26:38 +08:00
extern ctf_dynset_t *ctf_dynset_create (htab_hash, htab_eq, ctf_hash_free_fun);
extern int ctf_dynset_insert (ctf_dynset_t *, void *);
extern void ctf_dynset_remove (ctf_dynset_t *, const void *);
extern void ctf_dynset_destroy (ctf_dynset_t *);
extern void *ctf_dynset_lookup (ctf_dynset_t *, const void *);
extern int ctf_dynset_exists (ctf_dynset_t *, const void *key,
const void **orig_key);
extern int ctf_dynset_next (ctf_dynset_t *, ctf_next_t **, void **key);
libctf, hash: introduce the ctf_dynset There are many places in the deduplicator which use hashtables as tiny sets: keys with no value (and usually, but not always, no freeing function) often with only one or a few members. For each of these, even after the last change to not store the freeing functions, we are storing a little malloced block for each item just to track the key/value pair, and a little malloced block for the hash table itself just to track the freeing function because we can't use libiberty hashtab's freeing function because we are using that to free the little malloced per-item block. If we only have a key, we don't need any of that: we can ditch the per-malloced block because we don't have a value, and we can ditch the per-hashtab structure because we don't need to independently track the freeing functions since libiberty hashtab is doing it for us. That means we don't need an owner field in the (now nonexistent) item block either. Roughly speaking, this datatype saves about 25% in time and 20% in peak memory usage for normal links, even fairly big ones. So this might seem redundant, but it's really worth it. Instead of a _lookup function, a dynset has two distinct functions: ctf_dynset_exists, which returns true or false and an optional pointer to the set member, and ctf_dynhash_lookup_any, which is used if all members of the set are expected to be equivalent and we just want *any* member and we don't care which one. There is no iterator in this set of functions, not because we don't iterate over dynset members -- we do, a lot -- but because the iterator here is a member of an entirely new family of much more convenient iteration functions, introduced in the next commit. libctf/ * ctf-hash.c (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (DYNSET_EMPTY_ENTRY_REPLACEMENT): New. (DYNSET_DELETED_ENTRY_REPLACEMENT): New. (key_to_internal): New. (internal_to_key): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. (ctf_hash_insert_type): Coding style. (ctf_hash_define_type): Likewise. * ctf-impl.h (ctf_dynset_t): New. (ctf_dynset_eq_string): New. (ctf_dynset_create): New. (ctf_dynset_insert): New. (ctf_dynset_remove): New. (ctf_dynset_destroy): New. (ctf_dynset_lookup): New. (ctf_dynset_exists): New. (ctf_dynset_lookup_any): New. * ctf-inlines.h (ctf_dynset_cinsert): New.
2020-06-03 05:26:38 +08:00
extern void *ctf_dynset_lookup_any (ctf_dynset_t *);
extern void ctf_sha1_init (ctf_sha1_t *);
extern void ctf_sha1_add (ctf_sha1_t *, const void *, size_t);
extern char *ctf_sha1_fini (ctf_sha1_t *, char *);
#define ctf_list_prev(elem) ((void *)(((ctf_list_t *)(elem))->l_prev))
#define ctf_list_next(elem) ((void *)(((ctf_list_t *)(elem))->l_next))
extern void ctf_list_append (ctf_list_t *, void *);
extern void ctf_list_prepend (ctf_list_t *, void *);
extern void ctf_list_delete (ctf_list_t *, void *);
libctf, link: tie in the deduplicating linker This fairly intricate commit connects up the CTF linker machinery (which operates in terms of ctf_archive_t's on ctf_link_inputs -> ctf_link_outputs) to the deduplicator (which operates in terms of arrays of ctf_file_t's, all the archives exploded). The nondeduplicating linker is retained, but is not called unless the CTF_LINK_NONDEDUP flag is passed in (which ld never does), or the environment variable LD_NO_CTF_DEDUP is set. Eventually, once we have confidence in the much-more-complex deduplicating linker, I hope the nondeduplicating linker can be removed. In brief, what this does is traverses each input archive in ctf_link_inputs, opening every member (if not already open) and tying child dicts to their parents, shoving them into an array and constructing a corresponding parents array that tells the deduplicator which dict is the parent of which child. We then call ctf_dedup and ctf_dedup_emit with that array of inputs, taking the outputs that result and putting them into ctf_link_outputs where the rest of the CTF linker expects to find them, then linking in the variables just as is done by the nondeduplicating linker. It also implements much of the CU-mapping side of things. The problem CU-mapping introduces is that if you map many input CUs into one output, this is saying that you want many translation units to produce at most one child dict if conflicting types are found in any of them. This means you can suddenly have multiple distinct types with the same name in the same dict, which libctf cannot really represent because it's not something you can do with C translation units. The deduplicator machinery already committed does as best it can with these, hiding types with conflicting names rather than making child dicts out of them: but we still need to call it. This is done similarly to the main link, taking the inputs (one CU output at a time), deduplicating them, taking the output and making it an input to the final link. Two (significant) optimizations are done: we share atoms tables between all these links and the final link (so e.g. all type hash values are shared, all decorated type names, etc); and any CU-mapped links with only one input (and no child dicts) doesn't need to do anything other than renaming the CU: the CU-mapped link phase can be skipped for it. Put together, large CU-mapped links can save 50% of their memory usage and about as much time (and the memory usage for CU-mapped links is significant, because all those output CUs have to have all their types stored in memory all at once). include/ * ctf-api.h (CTF_LINK_NONDEDUP): New, turn off the deduplicator. libctf/ * ctf-impl.h (ctf_list_splice): New. * ctf-util.h (ctf_list_splice): Likewise. * ctf-link.c (link_sort_inputs_cb_arg_t): Likewise. (ctf_link_sort_inputs): Likewise. (ctf_link_deduplicating_count_inputs): Likewise. (ctf_link_deduplicating_open_inputs): Likewise. (ctf_link_deduplicating_close_inputs): Likewise. (ctf_link_deduplicating_variables): Likewise. (ctf_link_deduplicating_per_cu): Likewise. (ctf_link_deduplicating): Likewise. (ctf_link): Call it.
2020-06-06 05:57:06 +08:00
extern void ctf_list_splice (ctf_list_t *, ctf_list_t *);
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
extern int ctf_list_empty_p (ctf_list_t *lp);
extern int ctf_dtd_insert (ctf_file_t *, ctf_dtdef_t *, int flag, int kind);
extern void ctf_dtd_delete (ctf_file_t *, ctf_dtdef_t *);
extern ctf_dtdef_t *ctf_dtd_lookup (const ctf_file_t *, ctf_id_t);
extern ctf_dtdef_t *ctf_dynamic_type (const ctf_file_t *, ctf_id_t);
extern int ctf_dvd_insert (ctf_file_t *, ctf_dvdef_t *);
extern void ctf_dvd_delete (ctf_file_t *, ctf_dvdef_t *);
extern ctf_dvdef_t *ctf_dvd_lookup (const ctf_file_t *, const char *);
extern ctf_id_t ctf_add_encoded (ctf_file_t *, uint32_t, const char *,
const ctf_encoding_t *, uint32_t kind);
extern ctf_id_t ctf_add_reftype (ctf_file_t *, uint32_t, ctf_id_t,
uint32_t kind);
libctf: map from old to corresponding newly-added types in ctf_add_type This lets you call ctf_type_mapping (dest_fp, src_fp, src_type_id) and get told what type ID the corresponding type has in the target ctf_file_t. This works even if it was added by a recursive call, and because it is stored in the target ctf_file_t it works even if we had to add one type to multiple ctf_file_t's as part of conflicting type handling. We empty out this mapping after every archive is linked: because it maps input to output fps, and we only visit each input fp once, its contents are rendered entirely useless every time the source fp changes. v3: add several missing mapping additions. Add ctf_dynhash_empty, and empty after every input archive. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_file_t): New field ctf_link_type_mapping. (struct ctf_link_type_mapping_key): New. (ctf_hash_type_mapping_key): Likewise. (ctf_hash_eq_type_mapping_key): Likewise. (ctf_add_type_mapping): Likewise. (ctf_type_mapping): Likewise. (ctf_dynhash_empty): Likewise. * ctf-open.c (ctf_file_close): Update accordingly. * ctf-create.c (ctf_update): Likewise. (ctf_add_type): Populate the mapping. * ctf-hash.c (ctf_hash_type_mapping_key): Hash a type mapping key. (ctf_hash_eq_type_mapping_key): Check the key for equality. (ctf_dynhash_insert): Fix comment typo. (ctf_dynhash_empty): New. * ctf-link.c (ctf_add_type_mapping): New. (ctf_type_mapping): Likewise. (empty_link_type_mapping): New. (ctf_link_one_input_archive): Call it.
2019-07-14 04:31:26 +08:00
extern void ctf_add_type_mapping (ctf_file_t *src_fp, ctf_id_t src_type,
ctf_file_t *dst_fp, ctf_id_t dst_type);
extern ctf_id_t ctf_type_mapping (ctf_file_t *src_fp, ctf_id_t src_type,
ctf_file_t **dst_fp);
extern int ctf_dedup_atoms_init (ctf_file_t *);
extern int ctf_dedup (ctf_file_t *, ctf_file_t **, uint32_t ninputs,
uint32_t *parents, int cu_mapped);
extern void ctf_dedup_fini (ctf_file_t *, ctf_file_t **, uint32_t);
extern ctf_file_t **ctf_dedup_emit (ctf_file_t *, ctf_file_t **,
uint32_t ninputs, uint32_t *parents,
uint32_t *noutputs, int cu_mapped);
libctf: core type lookup Finally we get to the functions used to actually look up and enumerate properties of types in a container (names, sizes, members, what type a pointer or cv-qual references, determination of whether two types are assignment-compatible, etc). With a very few exceptions these do not work for types newly added via ctf_add_*(): they only work on types in read-only containers, or types added before the most recent call to ctf_update(). This also adds support for lookup of "variables" (string -> type ID mappings) and for generation of C type names corresponding to a type ID. libctf/ * ctf-decl.c: New file. * ctf-types.c: Likewise. * ctf-impl.h: New declarations. include/ * ctf-api.h (ctf_visit_f): New definition. (ctf_member_f): Likewise. (ctf_enum_f): Likewise. (ctf_variable_f): Likewise. (ctf_type_f): Likewise. (ctf_type_isparent): Likewise. (ctf_type_ischild): Likewise. (ctf_type_resolve): Likewise. (ctf_type_aname): Likewise. (ctf_type_lname): Likewise. (ctf_type_name): Likewise. (ctf_type_sizee): Likewise. (ctf_type_align): Likewise. (ctf_type_kind): Likewise. (ctf_type_reference): Likewise. (ctf_type_pointer): Likewise. (ctf_type_encoding): Likewise. (ctf_type_visit): Likewise. (ctf_type_cmp): Likewise. (ctf_type_compat): Likewise. (ctf_member_info): Likewise. (ctf_array_info): Likewise. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_member_iter): Likewise. (ctf_enum_iter): Likewise. (ctf_type_iter): Likewise. (ctf_variable_iter): Likewise.
2019-04-24 18:03:37 +08:00
extern void ctf_decl_init (ctf_decl_t *);
extern void ctf_decl_fini (ctf_decl_t *);
extern void ctf_decl_push (ctf_decl_t *, ctf_file_t *, ctf_id_t);
_libctf_printflike_ (2, 3)
extern void ctf_decl_sprintf (ctf_decl_t *, const char *, ...);
extern char *ctf_decl_buf (ctf_decl_t *cd);
extern const char *ctf_strptr (ctf_file_t *, uint32_t);
libctf: support getting strings from the ELF strtab The CTF file format has always supported "external strtabs", which internally are strtab offsets with their MSB on: such refs get their strings from the strtab passed in at CTF file open time: this is usually intended to be the ELF strtab, and that's what this implementation is meant to support, though in theory the external strtab could come from anywhere. This commit adds support for these external strings in the ctf-string.c strtab tracking layer. It's quite easy: we just add a field csa_offset to the atoms table that tracks all strings: this field tracks the offset of the string in the ELF strtab (with its MSB already on, courtesy of a new macro CTF_SET_STID), and adds a new function that sets the csa_offset to the specified offset (plus MSB). Then we just need to avoid writing out strings to the internal strtab if they have csa_offset set, and note that the internal strtab is shorter than it might otherwise be. (We could in theory save a little more time here by eschewing sorting such strings, since we never actually write the strings out anywhere, but that would mean storing them separately and it's just not worth the complexity cost until profiling shows it's worth doing.) We also have to go through a bit of extra effort at variable-sorting time. This was previously using direct references to the internal strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab is not yet ready to put in its usual field (in a ctf_file_t that hasn't even been allocated yet at this stage): but now we're using the external strtab, this will no longer do because it'll be looking things up in the wrong strtab, with disastrous results. Instead, pass the new internal strtab in to a new ctf_strraw_explicit function which is just like ctf_strraw except you can specify a ne winternal strtab to use. But even now that it is using a new internal strtab, this is not quite enough: it can't look up strings in the external strtab because ld hasn't written it out yet, and when it does will write it straight to disk. Instead, when we write the internal strtab, note all the offset -> string mappings that we have noted belong in the *external* strtab to a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look in there at ctf_strraw time if it is set. This uses minimal extra memory (because only strings in the external strtab that we actually use are stored, and even those come straight out of the atoms table), but let both variable sorting and name interning when ctf_bufopen is next called work fine. (This also means that we don't need to filter out spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to the caller, once we wrap ctf_bufopen so that we have a new internal variant of ctf_bufopen etc that we can pass the synthetic external strtab to. That error has been filtered out since the days of Solaris libctf, which didn't try to handle the problem of getting external strtabs right at construction time at all.) v3: add the synthetic strtab and all associated machinery. v5: fix tabdamage. include/ * ctf.h (CTF_SET_STID): New. libctf/ * ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field. (ctf_file_t) <ctf_syn_ext_strtab>: Likewise. (ctf_str_add_ref): Name the last arg. (ctf_str_add_external) New. (ctf_str_add_strraw_explicit): Likewise. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. * ctf-string.c (ctf_strraw_explicit): Split from... (ctf_strraw): ... here, with new support for ctf_syn_ext_strtab. (ctf_str_add_ref_internal): Return the atom, not the string. (ctf_str_add): Adjust accordingly. (ctf_str_add_ref): Likewise. Move up in the file. (ctf_str_add_external): New: update the csa_offset. (ctf_str_count_strtab): Only account for strings with no csa_offset in the internal strtab length. (ctf_str_write_strtab): If the csa_offset is set, update the string's refs without writing the string out, and update the ctf_syn_ext_strtab. Make OOM handling less ugly. * ctf-create.c (struct ctf_sort_var_arg_cb): New. (ctf_update): Handle failure to populate the strtab. Pass in the new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition. Call ctf_simple_open_internal, not ctf_simple_open. (ctf_sort_var): Call ctf_strraw_explicit rather than looking up strings by hand. * ctf-hash.c (ctf_hash_insert_type): Likewise (but using ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless. * ctf-open.c (init_types): No longer filter out ECTF_STRTAB. (ctf_file_close): Destroy the ctf_syn_ext_strtab. (ctf_simple_open): Rename to, and reimplement as a wrapper around... (ctf_simple_open_internal): ... this new function, which calls ctf_bufopen_internal. (ctf_bufopen): Rename to, and reimplement as a wrapper around... (ctf_bufopen_internal): ... this new function, which sets ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
extern const char *ctf_strraw (ctf_file_t *, uint32_t);
extern const char *ctf_strraw_explicit (ctf_file_t *, uint32_t,
ctf_strs_t *);
libctf: deduplicate and sort the string table ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
extern int ctf_str_create_atoms (ctf_file_t *);
extern void ctf_str_free_atoms (ctf_file_t *);
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
extern uint32_t ctf_str_add (ctf_file_t *, const char *);
extern uint32_t ctf_str_add_ref (ctf_file_t *, const char *, uint32_t *ref);
extern int ctf_str_add_external (ctf_file_t *, const char *, uint32_t offset);
extern void ctf_str_remove_ref (ctf_file_t *, const char *, uint32_t *ref);
libctf: deduplicate and sort the string table ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
extern void ctf_str_rollback (ctf_file_t *, ctf_snapshot_id_t);
extern void ctf_str_purge_refs (ctf_file_t *);
extern ctf_strs_writable_t ctf_str_write_strtab (ctf_file_t *);
extern struct ctf_archive_internal *
ctf_new_archive_internal (int is_archive, int unmap_on_close,
struct ctf_archive *, ctf_file_t *,
const ctf_sect_t *symsect,
const ctf_sect_t *strsect, int *errp);
libctf: mmappable archives If you need to store a large number of CTF containers somewhere, this provides a dedicated facility for doing so: an mmappable archive format like a very simple tar or ar without all the system-dependent format horrors or need for heavy file copying, with built-in compression of files above a particular size threshold. libctf automatically mmap()s uncompressed elements of these archives, or uncompresses them, as needed. (If the platform does not support mmap(), copying into dynamically-allocated buffers is used.) Archive iteration operations are partitioned into raw and non-raw forms. Raw operations pass thhe raw archive contents to the callback: non-raw forms open each member with ctf_bufopen() and pass the resulting ctf_file_t to the iterator instead. This lets you manipulate the raw data in the archive, or the contents interpreted as a CTF file, as needed. It is not yet known whether we will store CTF archives in a linked ELF object in one of these (akin to debugdata) or whether they'll get one section per TU plus one parent container for types shared between them. (In the case of ELF objects with very large numbers of TUs, an archive of all of them would seem preferable, so we might just use an archive, and add lzma support so you can assume that .gnu_debugdata and .ctf are compressed using the same algorithm if both are present.) To make usage easier, the ctf_archive_t is not the on-disk representation but an abstraction over both ctf_file_t's and archives of many ctf_file_t's: users see both CTF archives and raw CTF files as ctf_archive_t's upon opening, the only difference being that a raw CTF file has only a single "archive member", named ".ctf" (the default if a null pointer is passed in as the name). The next commit will make use of this facility, in addition to providing the public interface to actually open archives. (In the future, it should be possible to have all CTF sections in an ELF file appear as an "archive" in the same fashion.) This machinery is also used to allow library-internal creators of ctf_archive_t's (such as the next commit) to stash away an ELF string and symbol table, so that all opens of members in a given archive will use them. This lets CTF archives exploit the ELF string and symbol table just like raw CTF files can. (All this leads to somewhat confusing type naming. The ctf_archive_t is a typedef for the opaque internal type, struct ctf_archive_internal: the non-internal "struct ctf_archive" is the on-disk structure meant for other libraries manipulating CTF files. It is probably clearest to use the struct name for struct ctf_archive_internal inside the program, and the typedef names outside.) libctf/ * ctf-archive.c: New. * ctf-impl.h (ctf_archive_internal): New type. (ctf_arc_open_internal): New declaration. (ctf_arc_bufopen): Likewise. (ctf_arc_close_internal): Likewise. include/ * ctf.h (CTFA_MAGIC): New. (struct ctf_archive): New. (struct ctf_archive_modent): Likewise. * ctf-api.h (ctf_archive_member_f): New. (ctf_archive_raw_member_f): Likewise. (ctf_arc_write): Likewise. (ctf_arc_close): Likewise. (ctf_arc_open_by_name): Likewise. (ctf_archive_iter): Likewise. (ctf_archive_raw_iter): Likewise. (ctf_get_arc): Likewise.
2019-04-24 18:30:17 +08:00
extern struct ctf_archive *ctf_arc_open_internal (const char *, int *);
extern void ctf_arc_close_internal (struct ctf_archive *);
extern void *ctf_set_open_errno (int *, int);
libctf: fix a number of build problems found on Solaris and NetBSD - Use of nonportable <endian.h> - Use of qsort_r - Use of zlib without appropriate magic to pull in the binutils zlib - Use of off64_t without checking (fixed by dropping the unused fields that need off64_t entirely) - signedness problems due to long being too short a type on 32-bit platforms: ctf_id_t is now 'unsigned long', and CTF_ERR must be used only for functions that return ctf_id_t - One lingering use of bzero() and of <sys/errno.h> All fixed, using code from gnulib where possible. Relatedly, set cts_size in a couple of places it was missed (string table and symbol table loading upon ctf_bfdopen()). binutils/ * objdump.c (make_ctfsect): Drop cts_type, cts_flags, and cts_offset. * readelf.c (shdr_to_ctf_sect): Likewise. include/ * ctf-api.h (ctf_sect_t): Drop cts_type, cts_flags, and cts_offset. (ctf_id_t): This is now an unsigned type. (CTF_ERR): Cast it to ctf_id_t. Note that it should only be used for ctf_id_t-returning functions. libctf/ * Makefile.am (ZLIB): New. (ZLIBINC): Likewise. (AM_CFLAGS): Use them. (libctf_a_LIBADD): New, for LIBOBJS. * configure.ac: Check for zlib, endian.h, and qsort_r. * ctf-endian.h: New, providing htole64 and le64toh. * swap.h: Code style fixes. (bswap_identity_64): New. * qsort_r.c: New, from gnulib (with one added #include). * ctf-decls.h: New, providing a conditional qsort_r declaration, and unconditional definitions of MIN and MAX. * ctf-impl.h: Use it. Do not use <sys/errno.h>. (ctf_set_errno): Now returns unsigned long. * ctf-util.c (ctf_set_errno): Adjust here too. * ctf-archive.c: Use ctf-endian.h. (ctf_arc_open_by_offset): Use memset, not bzero. Drop cts_type, cts_flags and cts_offset. (ctf_arc_write): Drop debugging dependent on the size of off_t. * ctf-create.c: Provide a definition of roundup if not defined. (ctf_create): Drop cts_type, cts_flags and cts_offset. (ctf_add_reftype): Do not check if type IDs are below zero. (ctf_add_slice): Likewise. (ctf_add_typedef): Likewise. (ctf_add_member_offset): Cast error-returning ssize_t's to size_t when known error-free. Drop CTF_ERR usage for functions returning int. (ctf_add_member_encoded): Drop CTF_ERR usage for functions returning int. (ctf_add_variable): Likewise. (enumcmp): Likewise. (enumadd): Likewise. (membcmp): Likewise. (ctf_add_type): Likewise. Cast error-returning ssize_t's to size_t when known error-free. * ctf-dump.c (ctf_is_slice): Drop CTF_ERR usage for functions returning int: use CTF_ERR for functions returning ctf_type_id. (ctf_dump_label): Likewise. (ctf_dump_objts): Likewise. * ctf-labels.c (ctf_label_topmost): Likewise. (ctf_label_iter): Likewise. (ctf_label_info): Likewise. * ctf-lookup.c (ctf_func_args): Likewise. * ctf-open.c (upgrade_types): Cast to size_t where appropriate. (ctf_bufopen): Likewise. Use zlib types as needed. * ctf-types.c (ctf_member_iter): Drop CTF_ERR usage for functions returning int. (ctf_enum_iter): Likewise. (ctf_type_size): Likewise. (ctf_type_align): Likewise. Cast to size_t where appropriate. (ctf_type_kind_unsliced): Likewise. (ctf_type_kind): Likewise. (ctf_type_encoding): Likewise. (ctf_member_info): Likewise. (ctf_array_info): Likewise. (ctf_enum_value): Likewise. (ctf_type_rvisit): Likewise. * ctf-open-bfd.c (ctf_bfdopen): Drop cts_type, cts_flags and cts_offset. (ctf_simple_open): Likewise. (ctf_bfdopen_ctfsect): Likewise. Set cts_size properly. * Makefile.in: Regenerate. * aclocal.m4: Likewise. * config.h: Likewise. * configure: Likewise.
2019-05-31 17:10:51 +08:00
extern unsigned long ctf_set_errno (ctf_file_t *, int);
libctf: support getting strings from the ELF strtab The CTF file format has always supported "external strtabs", which internally are strtab offsets with their MSB on: such refs get their strings from the strtab passed in at CTF file open time: this is usually intended to be the ELF strtab, and that's what this implementation is meant to support, though in theory the external strtab could come from anywhere. This commit adds support for these external strings in the ctf-string.c strtab tracking layer. It's quite easy: we just add a field csa_offset to the atoms table that tracks all strings: this field tracks the offset of the string in the ELF strtab (with its MSB already on, courtesy of a new macro CTF_SET_STID), and adds a new function that sets the csa_offset to the specified offset (plus MSB). Then we just need to avoid writing out strings to the internal strtab if they have csa_offset set, and note that the internal strtab is shorter than it might otherwise be. (We could in theory save a little more time here by eschewing sorting such strings, since we never actually write the strings out anywhere, but that would mean storing them separately and it's just not worth the complexity cost until profiling shows it's worth doing.) We also have to go through a bit of extra effort at variable-sorting time. This was previously using direct references to the internal strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab is not yet ready to put in its usual field (in a ctf_file_t that hasn't even been allocated yet at this stage): but now we're using the external strtab, this will no longer do because it'll be looking things up in the wrong strtab, with disastrous results. Instead, pass the new internal strtab in to a new ctf_strraw_explicit function which is just like ctf_strraw except you can specify a ne winternal strtab to use. But even now that it is using a new internal strtab, this is not quite enough: it can't look up strings in the external strtab because ld hasn't written it out yet, and when it does will write it straight to disk. Instead, when we write the internal strtab, note all the offset -> string mappings that we have noted belong in the *external* strtab to a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look in there at ctf_strraw time if it is set. This uses minimal extra memory (because only strings in the external strtab that we actually use are stored, and even those come straight out of the atoms table), but let both variable sorting and name interning when ctf_bufopen is next called work fine. (This also means that we don't need to filter out spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to the caller, once we wrap ctf_bufopen so that we have a new internal variant of ctf_bufopen etc that we can pass the synthetic external strtab to. That error has been filtered out since the days of Solaris libctf, which didn't try to handle the problem of getting external strtabs right at construction time at all.) v3: add the synthetic strtab and all associated machinery. v5: fix tabdamage. include/ * ctf.h (CTF_SET_STID): New. libctf/ * ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field. (ctf_file_t) <ctf_syn_ext_strtab>: Likewise. (ctf_str_add_ref): Name the last arg. (ctf_str_add_external) New. (ctf_str_add_strraw_explicit): Likewise. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. * ctf-string.c (ctf_strraw_explicit): Split from... (ctf_strraw): ... here, with new support for ctf_syn_ext_strtab. (ctf_str_add_ref_internal): Return the atom, not the string. (ctf_str_add): Adjust accordingly. (ctf_str_add_ref): Likewise. Move up in the file. (ctf_str_add_external): New: update the csa_offset. (ctf_str_count_strtab): Only account for strings with no csa_offset in the internal strtab length. (ctf_str_write_strtab): If the csa_offset is set, update the string's refs without writing the string out, and update the ctf_syn_ext_strtab. Make OOM handling less ugly. * ctf-create.c (struct ctf_sort_var_arg_cb): New. (ctf_update): Handle failure to populate the strtab. Pass in the new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition. Call ctf_simple_open_internal, not ctf_simple_open. (ctf_sort_var): Call ctf_strraw_explicit rather than looking up strings by hand. * ctf-hash.c (ctf_hash_insert_type): Likewise (but using ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless. * ctf-open.c (init_types): No longer filter out ECTF_STRTAB. (ctf_file_close): Destroy the ctf_syn_ext_strtab. (ctf_simple_open): Rename to, and reimplement as a wrapper around... (ctf_simple_open_internal): ... this new function, which calls ctf_bufopen_internal. (ctf_bufopen): Rename to, and reimplement as a wrapper around... (ctf_bufopen_internal): ... this new function, which sets ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
extern ctf_file_t *ctf_simple_open_internal (const char *, size_t, const char *,
size_t, size_t,
const char *, size_t,
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
ctf_dynhash_t *, int, int *);
libctf: support getting strings from the ELF strtab The CTF file format has always supported "external strtabs", which internally are strtab offsets with their MSB on: such refs get their strings from the strtab passed in at CTF file open time: this is usually intended to be the ELF strtab, and that's what this implementation is meant to support, though in theory the external strtab could come from anywhere. This commit adds support for these external strings in the ctf-string.c strtab tracking layer. It's quite easy: we just add a field csa_offset to the atoms table that tracks all strings: this field tracks the offset of the string in the ELF strtab (with its MSB already on, courtesy of a new macro CTF_SET_STID), and adds a new function that sets the csa_offset to the specified offset (plus MSB). Then we just need to avoid writing out strings to the internal strtab if they have csa_offset set, and note that the internal strtab is shorter than it might otherwise be. (We could in theory save a little more time here by eschewing sorting such strings, since we never actually write the strings out anywhere, but that would mean storing them separately and it's just not worth the complexity cost until profiling shows it's worth doing.) We also have to go through a bit of extra effort at variable-sorting time. This was previously using direct references to the internal strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab is not yet ready to put in its usual field (in a ctf_file_t that hasn't even been allocated yet at this stage): but now we're using the external strtab, this will no longer do because it'll be looking things up in the wrong strtab, with disastrous results. Instead, pass the new internal strtab in to a new ctf_strraw_explicit function which is just like ctf_strraw except you can specify a ne winternal strtab to use. But even now that it is using a new internal strtab, this is not quite enough: it can't look up strings in the external strtab because ld hasn't written it out yet, and when it does will write it straight to disk. Instead, when we write the internal strtab, note all the offset -> string mappings that we have noted belong in the *external* strtab to a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look in there at ctf_strraw time if it is set. This uses minimal extra memory (because only strings in the external strtab that we actually use are stored, and even those come straight out of the atoms table), but let both variable sorting and name interning when ctf_bufopen is next called work fine. (This also means that we don't need to filter out spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to the caller, once we wrap ctf_bufopen so that we have a new internal variant of ctf_bufopen etc that we can pass the synthetic external strtab to. That error has been filtered out since the days of Solaris libctf, which didn't try to handle the problem of getting external strtabs right at construction time at all.) v3: add the synthetic strtab and all associated machinery. v5: fix tabdamage. include/ * ctf.h (CTF_SET_STID): New. libctf/ * ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field. (ctf_file_t) <ctf_syn_ext_strtab>: Likewise. (ctf_str_add_ref): Name the last arg. (ctf_str_add_external) New. (ctf_str_add_strraw_explicit): Likewise. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. * ctf-string.c (ctf_strraw_explicit): Split from... (ctf_strraw): ... here, with new support for ctf_syn_ext_strtab. (ctf_str_add_ref_internal): Return the atom, not the string. (ctf_str_add): Adjust accordingly. (ctf_str_add_ref): Likewise. Move up in the file. (ctf_str_add_external): New: update the csa_offset. (ctf_str_count_strtab): Only account for strings with no csa_offset in the internal strtab length. (ctf_str_write_strtab): If the csa_offset is set, update the string's refs without writing the string out, and update the ctf_syn_ext_strtab. Make OOM handling less ugly. * ctf-create.c (struct ctf_sort_var_arg_cb): New. (ctf_update): Handle failure to populate the strtab. Pass in the new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition. Call ctf_simple_open_internal, not ctf_simple_open. (ctf_sort_var): Call ctf_strraw_explicit rather than looking up strings by hand. * ctf-hash.c (ctf_hash_insert_type): Likewise (but using ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless. * ctf-open.c (init_types): No longer filter out ECTF_STRTAB. (ctf_file_close): Destroy the ctf_syn_ext_strtab. (ctf_simple_open): Rename to, and reimplement as a wrapper around... (ctf_simple_open_internal): ... this new function, which calls ctf_bufopen_internal. (ctf_bufopen): Rename to, and reimplement as a wrapper around... (ctf_bufopen_internal): ... this new function, which sets ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
extern ctf_file_t *ctf_bufopen_internal (const ctf_sect_t *, const ctf_sect_t *,
const ctf_sect_t *, ctf_dynhash_t *,
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
int, int *);
libctf: sort out potential refcount loops When you link TUs that contain conflicting types together, the resulting CTF section is an archive containing many CTF dicts. These dicts appear in ctf_link_outputs of the shared dict, with each ctf_import'ing that shared dict. ctf_importing a dict bumps its refcount to stop it going away while it's in use -- but if the shared dict (whose refcount is bumped) has the child dict (doing the bumping) in its ctf_link_outputs, we have a refcount loop, since the child dict only un-ctf_imports and drops the parent's refcount when it is freed, but the child is only freed when the parent's refcount falls to zero. (In the future, this will be able to go wrong on the inputs too, when an ld -r'ed deduplicated output with conflicts is relinked. Right now this cannot happen because we don't ctf_import such dicts at all. This will be fixed in a later commit in this series.) Fix this by introducing an internal-use-only ctf_import_unref function that imports a parent dict *witthout* bumping the parent's refcount, and using it when we create per-CU outputs. This function is only safe to use if you know the parent cannot go away while the child exists: but if the parent *owns* the child, as here, this is necessarily true. Record in the ctf_file_t whether a parent was imported via ctf_import or ctf_import_unref, so that if you do another ctf_import later on (or a ctf_import_unref) it can decide whether to drop the refcount of the existing parent being replaced depending on which function you used to import that one. Adjust ctf_serialize so that rather than doing a ctf_import (which is wrong if the original import was ctf_import_unref'fed), we just copy the parent field and refcount over and forcibly flip the unref flag on on the old copy we are going to discard. ctf_file_close also needs a bit of tweaking to only close the parent if it was not imported with ctf_import_unref: while we're at it, guard against repeated closes with a refcount of zero and stop them causing double-frees, even if destruction of things freed *inside* ctf_file_close cause such recursion. Verified no leaks or accesses to freed memory after all of this with valgrind. (It was leak-happy before.) libctf/ * ctf-impl.c (ctf_file_t) <ctf_parent_unreffed>: New. (ctf_import_unref): New. * ctf-open.c (ctf_file_close) Drop the refcount all the way to zero. Don't recurse back in if the refcount is already zero. (ctf_import): Check ctf_parent_unreffed before deciding whether to close a pre-existing parent. Set it to zero. (ctf_import_unreffed): New, as above, setting ctf_parent_unreffed to 1. * ctf-create.c (ctf_serialize): Do not ctf_import into the new child: use direct assignment, and set unreffed on the new and old children. * ctf-link.c (ctf_create_per_cu): Import the parent using ctf_import_unreffed.
2020-06-05 00:30:01 +08:00
extern int ctf_import_unref (ctf_file_t *fp, ctf_file_t *pfp);
libctf: avoid the need to ever use ctf_update The method of operation of libctf when the dictionary is writable has before now been that types that are added land in the dynamic type section, which is a linked list and hash of IDs -> dynamic type definitions (and, recently a hash of names): the DTDs are a bit of CTF representing the ctf_type_t and ad hoc C structures representing the vlen. Historically, libctf was unable to do anything with these types, not even look them up by ID, let alone by name: if you wanted to do that say if you were adding a type that depended on one you just added) you called ctf_update, which serializes all the DTDs into a CTF file and reopens it, copying its guts over the fp it's called with. The ctf_updated types are then frozen in amber and unchangeable: all lookups will return the types in the static portion in preference to the dynamic portion, and we will refuse to re-add things that already exist in the static portion (and, of late, in the dynamic portion too). The libctf machinery remembers the boundary between static and dynamic types and looks in the right portion for each type. Lots of things still don't quite work with dynamic types (e.g. getting their size), but enough works to do a bunch of additions and then a ctf_update, most of the time. Except it doesn't, because ctf_add_type finds it necessary to walk the full dynamic type definition list looking for types with matching names, so it gets slower and slower with every type you add: fixing this requires calling ctf_update periodically for no other reason than to avoid massively slowing things down. This is all clunky and very slow but kind of works, until you consider that it is in fact possible and indeed necessary to modify one sort of type after it has been added: forwards. These are necessarily promoted to structs, unions or enums, and when they do so *their type ID does not change*. So all of a sudden we are changing types that already exist in the static portion. ctf_update gets massively confused by this and allocates space enough for the forward (with no members), but then emits the new dynamic type (with all the members) into it. You get an assertion failure after that, if you're lucky, or a coredump. So this commit rejigs things a bit and arranges to exclusively use the dynamic type definitions in writable dictionaries, and the static type definitions in readable dictionaries: we don't at any time have a mixture of static and dynamic types, and you don't need to call ctf_update to make things "appear". The ctf_dtbyname hash I introduced a few months ago, which maps things like "struct foo" to DTDs, is removed, replaced instead by a change of type of the four dictionaries which track names. Rather than just being (unresizable) ctf_hash_t's populated only at ctf_bufopen time, they are now a ctf_names_t structure, which is a pair of ctf_hash_t and ctf_dynhash_t, with the ctf_hash_t portion being used in readonly dictionaries, and the ctf_dynhash_t being used in writable ones. The decision as to which to use is centralized in the new functions ctf_lookup_by_rawname (which takes a type kind) and ctf_lookup_by_rawhash, which it calls (which takes a ctf_names_t *.) This change lets us switch from using static to dynamic name hashes on the fly across the entirety of libctf without complexifying anything: in fact, because we now centralize the knowledge about how to map from type kind to name hash, it actually simplifies things and lets us throw out quite a lot of now-unnecessary complexity, from ctf_dtnyname (replaced by the dynamic half of the name tables), through to ctf_dtnextid (now that a dictionary's static portion is never referenced if the dictionary is writable, we can just use ctf_typemax to indicate the maximum type: dynamic or non-dynamic does not matter, and we no longer need to track the boundary between the types). You can now ctf_rollback() as far as you like, even past a ctf_update or for that matter a full writeout; all the iteration functions work just as well on writable as on read-only dictionaries; ctf_add_type no longer needs expensive duplicated code to run over the dynamic types hunting for ones it might be interested in; and the linker no longer needs a hack to call ctf_update so that calling ctf_add_type is not impossibly expensive. There is still a bit more complexity: some new code paths in ctf-types.c need to know how to extract information from dynamic types. This complexity will go away again in a few months when libctf acquires a proper intermediate representation. You can still call ctf_update if you like (it's public API, after all), but its only effect now is to set the point to which ctf_discard rolls back. Obviously *something* still needs to serialize the CTF file before writeout, and this job is done by ctf_serialize, which does everything ctf_update used to except set the counter used by ctf_discard. It is automatically called by the various functions that do CTF writeout: nobody else ever needs to call it. With this in place, forwards that are promoted to non-forwards no longer crash the link, even if it happens tens of thousands of types later. v5: fix tabdamage. libctf/ * ctf-impl.h (ctf_names_t): New. (ctf_lookup_t) <ctf_hash>: Now a ctf_names_t, not a ctf_hash_t. (ctf_file_t) <ctf_structs>: Likewise. <ctf_unions>: Likewise. <ctf_enums>: Likewise. <ctf_names>: Likewise. <ctf_lookups>: Improve comment. <ctf_ptrtab_len>: New. <ctf_prov_strtab>: New. <ctf_str_prov_offset>: New. <ctf_dtbyname>: Remove, redundant to the names hashes. <ctf_dtnextid>: Remove, redundant to ctf_typemax. (ctf_dtdef_t) <dtd_name>: Remove. <dtd_data>: Note that the ctt_name is now populated. (ctf_str_atom_t) <csa_offset>: This is now the strtab offset for internal strings too. <csa_external_offset>: New, the external strtab offset. (CTF_INDEX_TO_TYPEPTR): Handle the LCTF_RDWR case. (ctf_name_table): New declaration. (ctf_lookup_by_rawname): Likewise. (ctf_lookup_by_rawhash): Likewise. (ctf_set_ctl_hashes): Likewise. (ctf_serialize): Likewise. (ctf_dtd_insert): Adjust. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. (ctf_list_empty_p): Likewise. (ctf_str_remove_ref): Likewise. (ctf_str_add): Returns uint32_t now. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Now returns a boolean (int). * ctf-string.c (ctf_strraw_explicit): Check the ctf_prov_strtab for strings in the appropriate range. (ctf_str_create_atoms): Create the ctf_prov_strtab. Detect OOM when adding the null string to the new strtab. (ctf_str_free_atoms): Destroy the ctf_prov_strtab. (ctf_str_add_ref_internal): Add make_provisional argument. If make_provisional, populate the offset and fill in the ctf_prov_strtab accordingly. (ctf_str_add): Return the offset, not the string. (ctf_str_add_ref): Likewise. (ctf_str_add_external): Return a success integer. (ctf_str_remove_ref): New, remove a single ref. (ctf_str_count_strtab): Do not count the initial null string's length or the existence or length of any unreferenced internal atoms. (ctf_str_populate_sorttab): Skip atoms with no refs. (ctf_str_write_strtab): Populate the nullstr earlier. Add one to the cts_len for the null string, since it is no longer done in ctf_str_count_strtab. Adjust for csa_external_offset rename. Populate the csa_offset for both internal and external cases. Flush the ctf_prov_strtab afterwards, and reset the ctf_str_prov_offset. * ctf-create.c (ctf_grow_ptrtab): New. (ctf_create): Call it. Initialize new fields rather than old ones. Tell ctf_bufopen_internal that this is a writable dictionary. Set the ctl hashes and data model. (ctf_update): Rename to... (ctf_serialize): ... this. Leave a compatibility function behind. Tell ctf_simple_open_internal that this is a writable dictionary. Pass the new fields along from the old dictionary. Drop ctf_dtnextid and ctf_dtbyname. Use ctf_strraw, not dtd_name. Do not zero out the DTD's ctt_name. (ctf_prefixed_name): Rename to... (ctf_name_table): ... this. No longer return a prefixed name: return the applicable name table instead. (ctf_dtd_insert): Use it, and use the right name table. Pass in the kind we're adding. Migrate away from dtd_name. (ctf_dtd_delete): Adjust similarly. Remove the ref to the deleted ctt_name. (ctf_dtd_lookup_type_by_name): Remove. (ctf_dynamic_type): Always return NULL on read-only dictionaries. No longer check ctf_dtnextid: check ctf_typemax instead. (ctf_snapshot): No longer use ctf_dtnextid: use ctf_typemax instead. (ctf_rollback): Likewise. No longer fail with ECTF_OVERROLLBACK. Use ctf_name_table and the right name table, and migrate away from dtd_name as in ctf_dtd_delete. (ctf_add_generic): Pass in the kind explicitly and pass it to ctf_dtd_insert. Use ctf_typemax, not ctf_dtnextid. Migrate away from dtd_name to using ctf_str_add_ref to populate the ctt_name. Grow the ptrtab if needed. (ctf_add_encoded): Pass in the kind. (ctf_add_slice): Likewise. (ctf_add_array): Likewise. (ctf_add_function): Likewise. (ctf_add_typedef): Likewise. (ctf_add_reftype): Likewise. Initialize the ctf_ptrtab, checking ctt_name rather than dtd_name. (ctf_add_struct_sized): Pass in the kind. Use ctf_lookup_by_rawname, not ctf_hash_lookup_type / ctf_dtd_lookup_type_by_name. (ctf_add_union_sized): Likewise. (ctf_add_enum): Likewise. (ctf_add_enum_encoded): Likewise. (ctf_add_forward): Likewise. (ctf_add_type): Likewise. (ctf_compress_write): Call ctf_serialize: adjust for ctf_size not being initialized until after the call. (ctf_write_mem): Likewise. (ctf_write): Likewise. * ctf-archive.c (arc_write_one_ctf): Likewise. * ctf-lookup.c (ctf_lookup_by_name): Use ctf_lookuup_by_rawhash, not ctf_hash_lookup_type. (ctf_lookup_by_id): No longer check the readonly types if the dictionary is writable. * ctf-open.c (init_types): Assert that this dictionary is not writable. Adjust to use the new name hashes, ctf_name_table, and ctf_ptrtab_len. GNU style fix for the final ptrtab scan. (ctf_bufopen_internal): New 'writable' parameter. Flip on LCTF_RDWR if set. Drop out early when dictionary is writable. Split the ctf_lookups initialization into... (ctf_set_cth_hashes): ... this new function. (ctf_simple_open_internal): Adjust. New 'writable' parameter. (ctf_simple_open): Adjust accordingly. (ctf_bufopen): Likewise. (ctf_file_close): Destroy the appropriate name hashes. No longer destroy ctf_dtbyname, which is gone. (ctf_getdatasect): Remove spurious "extern". * ctf-types.c (ctf_lookup_by_rawname): New, look up types in the specified name table, given a kind. (ctf_lookup_by_rawhash): Likewise, given a ctf_names_t *. (ctf_member_iter): Add support for iterating over the dynamic type list. (ctf_enum_iter): Likewise. (ctf_variable_iter): Likewise. (ctf_type_rvisit): Likewise. (ctf_member_info): Add support for types in the dynamic type list. (ctf_enum_name): Likewise. (ctf_enum_value): Likewise. (ctf_func_type_info): Likewise. (ctf_func_type_args): Likewise. * ctf-link.c (ctf_accumulate_archive_names): No longer call ctf_update. (ctf_link_write): Likewise. (ctf_link_intern_extern_string): Adjust for new ctf_str_add_external return value. (ctf_link_add_strtab): Likewise. * ctf-util.c (ctf_list_empty_p): New.
2019-08-08 00:55:09 +08:00
extern int ctf_serialize (ctf_file_t *);
libctf: support getting strings from the ELF strtab The CTF file format has always supported "external strtabs", which internally are strtab offsets with their MSB on: such refs get their strings from the strtab passed in at CTF file open time: this is usually intended to be the ELF strtab, and that's what this implementation is meant to support, though in theory the external strtab could come from anywhere. This commit adds support for these external strings in the ctf-string.c strtab tracking layer. It's quite easy: we just add a field csa_offset to the atoms table that tracks all strings: this field tracks the offset of the string in the ELF strtab (with its MSB already on, courtesy of a new macro CTF_SET_STID), and adds a new function that sets the csa_offset to the specified offset (plus MSB). Then we just need to avoid writing out strings to the internal strtab if they have csa_offset set, and note that the internal strtab is shorter than it might otherwise be. (We could in theory save a little more time here by eschewing sorting such strings, since we never actually write the strings out anywhere, but that would mean storing them separately and it's just not worth the complexity cost until profiling shows it's worth doing.) We also have to go through a bit of extra effort at variable-sorting time. This was previously using direct references to the internal strtab: it couldn't use ctf_strptr or ctf_strraw because the new strtab is not yet ready to put in its usual field (in a ctf_file_t that hasn't even been allocated yet at this stage): but now we're using the external strtab, this will no longer do because it'll be looking things up in the wrong strtab, with disastrous results. Instead, pass the new internal strtab in to a new ctf_strraw_explicit function which is just like ctf_strraw except you can specify a ne winternal strtab to use. But even now that it is using a new internal strtab, this is not quite enough: it can't look up strings in the external strtab because ld hasn't written it out yet, and when it does will write it straight to disk. Instead, when we write the internal strtab, note all the offset -> string mappings that we have noted belong in the *external* strtab to a new "synthetic external strtab" dynhash, ctf_syn_ext_strtab, and look in there at ctf_strraw time if it is set. This uses minimal extra memory (because only strings in the external strtab that we actually use are stored, and even those come straight out of the atoms table), but let both variable sorting and name interning when ctf_bufopen is next called work fine. (This also means that we don't need to filter out spurious ECTF_STRTAB warnings from ctf_bufopen but can pass them back to the caller, once we wrap ctf_bufopen so that we have a new internal variant of ctf_bufopen etc that we can pass the synthetic external strtab to. That error has been filtered out since the days of Solaris libctf, which didn't try to handle the problem of getting external strtabs right at construction time at all.) v3: add the synthetic strtab and all associated machinery. v5: fix tabdamage. include/ * ctf.h (CTF_SET_STID): New. libctf/ * ctf-impl.h (ctf_str_atom_t) <csa_offset>: New field. (ctf_file_t) <ctf_syn_ext_strtab>: Likewise. (ctf_str_add_ref): Name the last arg. (ctf_str_add_external) New. (ctf_str_add_strraw_explicit): Likewise. (ctf_simple_open_internal): Likewise. (ctf_bufopen_internal): Likewise. * ctf-string.c (ctf_strraw_explicit): Split from... (ctf_strraw): ... here, with new support for ctf_syn_ext_strtab. (ctf_str_add_ref_internal): Return the atom, not the string. (ctf_str_add): Adjust accordingly. (ctf_str_add_ref): Likewise. Move up in the file. (ctf_str_add_external): New: update the csa_offset. (ctf_str_count_strtab): Only account for strings with no csa_offset in the internal strtab length. (ctf_str_write_strtab): If the csa_offset is set, update the string's refs without writing the string out, and update the ctf_syn_ext_strtab. Make OOM handling less ugly. * ctf-create.c (struct ctf_sort_var_arg_cb): New. (ctf_update): Handle failure to populate the strtab. Pass in the new ctf_sort_var arg. Adjust for ctf_syn_ext_strtab addition. Call ctf_simple_open_internal, not ctf_simple_open. (ctf_sort_var): Call ctf_strraw_explicit rather than looking up strings by hand. * ctf-hash.c (ctf_hash_insert_type): Likewise (but using ctf_strraw). Adjust to diagnose ECTF_STRTAB nonetheless. * ctf-open.c (init_types): No longer filter out ECTF_STRTAB. (ctf_file_close): Destroy the ctf_syn_ext_strtab. (ctf_simple_open): Rename to, and reimplement as a wrapper around... (ctf_simple_open_internal): ... this new function, which calls ctf_bufopen_internal. (ctf_bufopen): Rename to, and reimplement as a wrapper around... (ctf_bufopen_internal): ... this new function, which sets ctf_syn_ext_strtab.
2019-07-14 03:33:01 +08:00
_libctf_malloc_
extern void *ctf_mmap (size_t length, size_t offset, int fd);
extern void ctf_munmap (void *, size_t);
extern ssize_t ctf_pread (int fd, void *buf, ssize_t count, off_t offset);
libctf: deduplicate and sort the string table ctf.h states: > [...] the CTF string table does not contain any duplicated strings. Unfortunately this is entirely untrue: libctf has before now made no attempt whatsoever to deduplicate the string table. It computes the string table's length on the fly as it adds new strings to the dynamic CTF file, and ctf_update() just writes each string to the table and notes the current write position as it traverses the dynamic CTF file's data structures and builds the final CTF buffer. There is no global view of the strings and no deduplication. Fix this by erasing the ctf_dtvstrlen dead-reckoning length, and adding a new dynhash table ctf_str_atoms that maps unique strings to a list of references to those strings: a reference is a simple uint32_t * to some value somewhere in the under-construction CTF buffer that needs updating to note the string offset when the strtab is laid out. Adding a string is now a simple matter of calling ctf_str_add_ref(), which adds a new atom to the atoms table, if one doesn't already exist, and adding the location of the reference to this atom to the refs list attached to the atom: this works reliably as long as one takes care to only call ctf_str_add_ref() once the final location of the offset is known (so you can't call it on a temporary structure and then memcpy() that structure into place in the CTF buffer, because the ref will still point to the old location: ctf_update() changes accordingly). Generating the CTF string table is a matter of calling ctf_str_write_strtab(), which counts the length and number of elements in the atoms table using the ctf_dynhash_iter() function we just added, populating an array of pointers into the atoms table and sorting it into order (to help compressors), then traversing this table and emitting it, updating the refs to each atom as we go. The only complexity here is arranging to keep the null string at offset zero, since a lot of code in libctf depends on being able to leave strtab references at 0 to indicate 'no name'. Once the table is constructed and the refs updated, we know how long it is, so we can realloc() the partial CTF buffer we allocated earlier and can copy the table on to the end of it (and purge the refs because they're not needed any more and have been invalidated by the realloc() call in any case). The net effect of all this is a reduction in uncompressed strtab sizes of about 30% (perhaps a quarter to a half of all strings across the Linux kernel are eliminated as duplicates). Of course, duplicated strings are highly redundant, so the space saving after compression is only about 20%: when the other non-strtab sections are factored in, CTF sizes shrink by about 10%. No change in externally-visible API or file format (other than the reduction in pointless redundancy). libctf/ * ctf-impl.h: (struct ctf_strs_writable): New, non-const version of struct ctf_strs. (struct ctf_dtdef): Note that dtd_data.ctt_name is unpopulated. (struct ctf_str_atom): New, disambiguated single string. (struct ctf_str_atom_ref): New, points to some other location that references this string's offset. (struct ctf_file): New members ctf_str_atoms and ctf_str_num_refs. Remove member ctf_dtvstrlen: we no longer track the total strlen as we add strings. (ctf_str_create_atoms): Declare new function in ctf-string.c. (ctf_str_free_atoms): Likewise. (ctf_str_add): Likewise. (ctf_str_add_ref): Likewise. (ctf_str_purge_refs): Likewise. (ctf_str_write_strtab): Likewise. (ctf_realloc): Declare new function in ctf-util.c. * ctf-open.c (ctf_bufopen): Create the atoms table. (ctf_file_close): Destroy it. * ctf-create.c (ctf_update): Copy-and-free it on update. No longer special-case the position of the parname string. Construct the strtab by calling ctf_str_add_ref and ctf_str_write_strtab after the rest of each buffer element is constructed, not via open-coding: realloc the CTF buffer and append the strtab to it. No longer maintain ctf_dtvstrlen. Sort the variable entry table later, after strtab construction. (ctf_copy_membnames): Remove: integrated into ctf_copy_{s,l,e}members. (ctf_copy_smembers): Drop the string offset: call ctf_str_add_ref after buffer element construction instead. (ctf_copy_lmembers): Likewise. (ctf_copy_emembers): Likewise. (ctf_create): No longer maintain the ctf_dtvstrlen. (ctf_dtd_delete): Likewise. (ctf_dvd_delete): Likewise. (ctf_add_generic): Likewise. (ctf_add_enumerator): Likewise. (ctf_add_member_offset): Likewise. (ctf_add_variable): Likewise. (membadd): Likewise. * ctf-util.c (ctf_realloc): New, wrapper around realloc that aborts if there are active ctf_str_num_refs. (ctf_strraw): Move to ctf-string.c. (ctf_strptr): Likewise. * ctf-string.c: New file, strtab manipulation. * Makefile.am (libctf_a_SOURCES): Add it. * Makefile.in: Regenerate.
2019-06-27 20:51:10 +08:00
extern void *ctf_realloc (ctf_file_t *, void *, size_t);
extern char *ctf_str_append (char *, const char *);
extern char *ctf_str_append_noerr (char *, const char *);
extern ctf_id_t ctf_type_resolve_unsliced (ctf_file_t *, ctf_id_t);
extern int ctf_type_kind_unsliced (ctf_file_t *, ctf_id_t);
_libctf_printflike_ (1, 2)
extern void ctf_dprintf (const char *, ...);
extern void libctf_init_debug (void);
libctf, ld, binutils: add textual error/warning reporting for libctf This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. * ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (land_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
2020-06-04 22:07:54 +08:00
_libctf_printflike_ (3, 4)
extern void ctf_err_warn (ctf_file_t *, int is_warning, const char *, ...);
extern void ctf_assert_fail_internal (ctf_file_t *, const char *,
size_t, const char *);
libctf, link: add lazy linking: clean up input members: err/warn cleanup This rather large and intertwined pile of changes does three things: First, it transitions from dprintf to ctf_err_warn for things the user might care about: this one file is the major impetus for the ctf_err_warn infrastructure, because things like file names are crucial in linker error messages, and errno values are utterly incapable of communicating them Second, it stabilizes the ctf_link APIs: you can now call ctf_link_add_ctf without a CTF argument (only a NAME), to lazily ctf_open the file with the given NAME when needed, and close it as soon as possible, to save memory. This is not an API change because a null CTF argument was prohibited before now. Since getting CTF directly from files uses ctf_open, passing in only a NAME requires use of libctf, not libctf-nobfd. The linker's behaviour is unchanged, as it still passes in a ctf_archive_t as before. This also let us fix a leak: we were opening ctf_archives and their containing ctf_files, then only closing the files and leaving the archives open. Third, this commit restructures the ctf_link_in_member argument used by the CTF linking machinery and adjusts its users accordingly. We drop two members: - arcname, which is difficult to construct and then only used in error messages (that were only dprintf()ed, so never seen!) - share_mode, since we store the flags passed to ctf_link (including the share mode) in a new ctf_file_t.ctf_link_flags to help dedup get hold of it We rename others whose existing names were fairly dreadful: - done_main_member -> done_parent, using consistent terminology for .ctf as the parent of all archive members - main_input_fp -> in_fp_parent, likewise - file_name -> in_file_name, likewise We add one new member, cu_mapped. Finally, we move the various frees of things like mapping table data to the top-level ctf_link, since deduplicating links will want to do that too. include/ * ctf-api.h (ECTF_NEEDSBFD): New. (ECTF_NERR): Adjust. (ctf_link): Rename share_mode arg to flags. libctf/ * Makefile.am: Set -DNOBFD=1 in libctf-nobfd, and =0 elsewhere. * Makefile.in: Regenerated. * ctf-impl.h (ctf_link_input_name): New. (ctf_file_t) <ctf_link_flags>: New. * ctf-create.c (ctf_serialize): Adjust accordingly. * ctf-link.c: Define ctf_open as weak when PIC. (ctf_arc_close_thunk): Remove unnecessary thunk. (ctf_file_close_thunk): Likewise. (ctf_link_input_name): New. (ctf_link_input_t): New value of the ctf_file_t.ctf_link_input. (ctf_link_input_close): Adjust accordingly. (ctf_link_add_ctf_internal): New, split from... (ctf_link_add_ctf): ... here. Return error if lazy loading of CTF is not possible. Change to just call... (ctf_link_add): ... this new function. (ctf_link_add_cu_mapping): Transition to ctf_err_warn. Drop the ctf_file_close_thunk. (ctf_link_in_member_cb_arg_t) <file_name> Rename to... <in_file_name>: ... this. <arcname>: Drop. <share_mode>: Likewise (migrated to ctf_link_flags). <done_main_member>: Rename to... <done_parent>: ... this. <main_input_fp>: Rename to... <in_fp_parent>: ... this. <cu_mapped>: New. (ctf_link_one_type): Adjuwt accordingly. Transition to ctf_err_warn, removing a TODO. (ctf_link_one_variable): Note a case too common to warn about. Report in the debug stream if a cu-mapped link prevents addition of a conflicting variable. (ctf_link_one_input_archive_member): Adjust. (ctf_link_lazy_open): New, open a CTF archive for linking when needed. (ctf_link_close_one_input_archive): New, close it again. (ctf_link_one_input_archive): Adjust for lazy opening, member renames, and ctf_err_warn transition. Move the empty_link_type_mapping call to... (ctf_link): ... here. Adjut for renamings and thunk removal. Don't spuriously fail if some input contains no CTF data. (ctf_link_write): ctf_err_warn transition. * libctf.ver: Remove not-yet-stable comment.
2020-06-05 02:28:52 +08:00
extern const char *ctf_link_input_name (ctf_file_t *);
libctf, ld, binutils: add textual error/warning reporting for libctf This commit adds a long-missing piece of infrastructure to libctf: the ability to report errors and warnings using all the power of printf, rather than being restricted to one errno value. Internally, libctf calls ctf_err_warn() to add errors and warnings to a list: a new iterator ctf_errwarning_next() then consumes this list one by one and hands it to the caller, which can free it. New errors and warnings are added until the list is consumed by the caller or the ctf_file_t is closed, so you can dump them at intervals. The caller can of course choose to print only those warnings it wants. (I am not sure whether we want objdump, readelf or ld to print warnings or not: right now I'm printing them, but maybe we only want to print errors? This entirely depends on whether warnings are voluminous things describing e.g. the inability to emit single types because of name clashes or something. There are no users of this infrastructure yet, so it's hard to say.) There is no internationalization here yet, but this at least adds a place where internationalization can be added, to one of ctf_errwarning_next or ctf_err_warn. We also provide a new ctf_assert() function which uses this infrastructure to provide non-fatal assertion failures while emitting an assert-like string to the caller: to save space and avoid needlessly duplicating unchanging strings, the assertion test is inlined but the print-things-out failure case is not. All assertions in libctf will be converted to use this machinery in future commits and propagate assertion-failure errors up, so that the linker in particular cannot be killed by libctf assertion failures when it could perfectly well just print warnings and drop the CTF section. include/ * ctf-api.h (ECTF_INTERNAL): Adjust error text. (ctf_errwarning_next): New. libctf/ * ctf-impl.h (ctf_assert): New. (ctf_err_warning_t): Likewise. (ctf_file_t) <ctf_errs_warnings>: Likewise. (ctf_err_warn): New prototype. (ctf_assert_fail_internal): Likewise. * ctf-inlines.h (ctf_assert_internal): Likewise. * ctf-open.c (ctf_file_close): Free ctf_errs_warnings. * ctf-create.c (ctf_serialize): Copy it on serialization. * ctf-subr.c (ctf_err_warn): New, add an error/warning. (ctf_errwarning_next): New iterator, free and pass back errors/warnings in succession. * libctf.ver (ctf_errwarning_next): Add. ld/ * ldlang.c (lang_ctf_errs_warnings): New, print CTF errors and warnings. Assert when libctf asserts. (lang_merge_ctf): Call it. (land_write_ctf): Likewise. binutils/ * objdump.c (ctf_archive_member): Print CTF errors and warnings. * readelf.c (dump_ctf_archive_member): Likewise.
2020-06-04 22:07:54 +08:00
extern Elf64_Sym *ctf_sym_to_elf64 (const Elf32_Sym *src, Elf64_Sym *dst);
extern const char *ctf_lookup_symbol_name (ctf_file_t *fp, unsigned long symidx);
/* Variables, all underscore-prepended. */
extern const char _CTF_SECTION[]; /* name of CTF ELF section */
extern const char _CTF_NULLSTR[]; /* empty string */
extern int _libctf_version; /* library client version */
extern int _libctf_debug; /* debugging messages enabled */
#include "ctf-inlines.h"
#ifdef __cplusplus
}
#endif
#endif /* _CTF_IMPL_H */