/* CTF dict creation.
   Copyright (C) 2019-2025 Free Software Foundation, Inc.

   This file is part of libctf.

   libctf is free software; you can redistribute it and/or modify it under
   the terms of the GNU General Public License as published by the Free
   Software Foundation; either version 3, or (at your option) any later
   version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
   See the GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; see the file COPYING. If not see
   <http://www.gnu.org/licenses/>. */

#include <ctf-impl.h>
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <zlib.h>

#include <elf.h>
#include "elf-bfd.h"

libctf: split up ctf_serialize
ctf_serialize and its various pieces may be split out into a separate
file now, but ctf_serialize is still far too long and disordered, mixing
header initialization, sizing of multiple CTF sections, sorting and
emission of multiple CTF sections, strtab construction and ctf_dict_t
copying into a single ugly organically-grown mess.
Fix the worst of this by migrating all section sizing and emission into
separate functions, two per section (or class of section in the case of
the symtypetabs). Only the variable section is now sized and emitted
directly in ctf_serialize (because it only takes about three lines to do
so).
The section sizes themselves are still maintained by ctf_serialize so
that it can work out the header offsets, but ctf_symtypetab_sect_sizes
and ctf_emit_symtypetab_sects share a lot of extra state: migrate that
into a shared structure, emit_symtypetab_state_t.
(Test results unchanged.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-serialize.c: General reshuffling, and...
(emit_symtypetab_state_t): New, migrated from
local variables in ctf_serialize.
(ctf_serialize): Split out most section sizing and
emission.
(ctf_symtypetab_sect_sizes): New (split out).
(ctf_emit_symtypetab_sects): Likewise.
(ctf_type_sect_size): Likewise.
(ctf_emit_type_sect): Likewise.
2021-03-18 20:37:52 +08:00
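
The commit message above describes a two-pass layout: one function sizes a section, a second emits it, and state computed while sizing is handed to the emitter through a shared structure. The following is only a minimal sketch of that pattern, not libctf code: toy_emit_state_t, toy_sect_size and toy_emit_sect are hypothetical names, and the "section" is assumed to be a flat array of 32-bit type IDs in which 0 means "no type".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state shared by the sizing and emission passes.  */
typedef struct toy_emit_state
{
  size_t nentries;		/* Entries counted by the sizing pass.  */
} toy_emit_state_t;

/* Pass 1: count the entries that will be emitted and return the number of
   bytes the section needs.  Entries with type 0 are skipped.  */
static size_t
toy_sect_size (const uint32_t *types, size_t ntypes, toy_emit_state_t *s)
{
  size_t i;

  s->nentries = 0;
  for (i = 0; i < ntypes; i++)
    if (types[i] != 0)
      s->nentries++;

  return s->nentries * sizeof (uint32_t);
}

/* Pass 2: emit into BUF, which the caller allocated using the size from
   pass 1.  Returns the number of bytes written.  */
static size_t
toy_emit_sect (const uint32_t *types, size_t ntypes,
	       const toy_emit_state_t *s, unsigned char *buf)
{
  size_t i, n = 0;

  for (i = 0; i < ntypes; i++)
    {
      if (types[i] == 0)
	continue;

      /* The emitter must never write more than the sizer promised.  */
      assert (n < s->nentries);
      memcpy (buf + n * sizeof (uint32_t), &types[i], sizeof (uint32_t));
      n++;
    }

  return n * sizeof (uint32_t);
}
```

A caller would run toy_sect_size for every section first, derive the header offsets from the returned sizes, allocate one buffer, and only then call the matching toy_emit_sect functions with the same state.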
/* Symtypetab sections. */

/* Symtypetab emission flags. */

#define CTF_SYMTYPETAB_EMIT_FUNCTION 0x1
#define CTF_SYMTYPETAB_EMIT_PAD 0x2
#define CTF_SYMTYPETAB_FORCE_INDEXED 0x4

/* Properties of symtypetab emission, shared by symtypetab section
   sizing and symtypetab emission itself. */

typedef struct emit_symtypetab_state
{
  /* True if linker-reported symbols are being filtered out. symfp is set if
     this is true: otherwise, indexing is forced and the symflags indicate as
     much. */
  int filter_syms;

  /* True if symbols are being sorted. */
  int sort_syms;

  /* Flags for symtypetab emission. */
  int symflags;

  /* The dict to which the linker has reported symbols. */
  ctf_dict_t *symfp;

  /* The maximum number of objects seen. */
  size_t maxobjt;

  /* The maximum number of func info entries seen. */
  size_t maxfunc;
} emit_symtypetab_state_t;

/* Determine if a symbol is "skippable" and should never appear in the
   symtypetab sections. */

int
ctf_symtab_skippable (ctf_link_sym_t *sym)
{
  /* Never skip symbols whose name is not yet known. */
  if (sym->st_nameidx_set)
    return 0;

  return (sym->st_name == NULL || sym->st_name[0] == 0
          || sym->st_shndx == SHN_UNDEF
          || strcmp (sym->st_name, "_START_") == 0
          || strcmp (sym->st_name, "_END_") == 0
          || (sym->st_type == STT_OBJECT && sym->st_shndx == SHN_EXTABS
              && sym->st_value == 0));
}

/* Get the number of symbols in a symbol hash, the count of symbols, the maximum
   seen, the eventual size, without any padding elements, of the func/data and
   (if generated) index sections, and the size of accumulated padding elements.
   The linker-reported set of symbols is found in SYMFP: it may be NULL if
   symbol filtering is not desired, in which case CTF_SYMTYPETAB_FORCE_INDEXED
   will always be set in the flags.

   Also figure out if any symbols need to be moved to the variable section, and
   add them (if not already present). */

_libctf_nonnull_ ((1,3,4,5,6,7,8))
static int
symtypetab_density (ctf_dict_t *fp, ctf_dict_t *symfp, ctf_dynhash_t *symhash,
                    size_t *count, size_t *max, size_t *unpadsize,
                    size_t *padsize, size_t *idxsize, int flags)
{
  ctf_next_t *i = NULL;
  const void *name;
  const void *ctf_sym;
  ctf_dynhash_t *linker_known = NULL;
  int err;
  int beyond_max = 0;

  *count = 0;
  *max = 0;
  *unpadsize = 0;
  *idxsize = 0;
  *padsize = 0;

  if (!(flags & CTF_SYMTYPETAB_FORCE_INDEXED))
    {
      /* Make a dynhash citing only symbols reported by the linker of the
         appropriate type, then traverse all potential-symbols we know the types
         of, removing them from linker_known as we go. Once this is done, the
         only symbols remaining in linker_known are symbols we don't know the
         types of: we must emit pads for those symbols that are below the
         maximum symbol we will emit (any beyond that are simply skipped).

         If there are none, this symtypetab will be empty: just report that. */

      if (!symfp->ctf_dynsyms)
        return 0;

      if ((linker_known = ctf_dynhash_create (ctf_hash_string, ctf_hash_eq_string,
                                              NULL, NULL)) == NULL)
        return (ctf_set_errno (fp, ENOMEM));

      while ((err = ctf_dynhash_cnext (symfp->ctf_dynsyms, &i,
                                       &name, &ctf_sym)) == 0)
        {
          ctf_link_sym_t *sym = (ctf_link_sym_t *) ctf_sym;

          if (((flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
               && sym->st_type != STT_FUNC)
              || (!(flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
                  && sym->st_type != STT_OBJECT))
            continue;

          if (ctf_symtab_skippable (sym))
            continue;

          /* This should only be true briefly before all the names are
             finalized, long before we get this far. */
          if (!ctf_assert (fp, !sym->st_nameidx_set))
            return -1; /* errno is set for us. */

          if (ctf_dynhash_cinsert (linker_known, name, ctf_sym) < 0)
            {
              ctf_dynhash_destroy (linker_known);
              return (ctf_set_errno (fp, ENOMEM));
            }
        }
      if (err != ECTF_NEXT_END)
        {
          ctf_err_warn (fp, 0, err, _("iterating over linker-known symbols during "
                                      "serialization"));
          ctf_dynhash_destroy (linker_known);
          return (ctf_set_errno (fp, err));
        }
    }

  while ((err = ctf_dynhash_cnext (symhash, &i, &name, NULL)) == 0)
    {
      ctf_link_sym_t *sym;

      if (!(flags & CTF_SYMTYPETAB_FORCE_INDEXED))
        {
          /* Linker did not report symbol in symtab. Remove it from the
             set of known data symbols and continue. */
          if ((sym = ctf_dynhash_lookup (symfp->ctf_dynsyms, name)) == NULL)
            {
              ctf_dynhash_remove (symhash, name);
              continue;
            }

          /* We don't remove skippable symbols from the symhash because we don't
             want them to be migrated into variables. */
          if (ctf_symtab_skippable (sym))
            continue;

          if ((flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
              && sym->st_type != STT_FUNC)
            {
              ctf_err_warn (fp, 1, 0, _("symbol %s (%x) added to CTF as a "
                                        "function but is of type %x. "
                                        "The symbol type lookup tables "
                                        "are probably corrupted"),
                            sym->st_name, sym->st_symidx, sym->st_type);
              ctf_dynhash_remove (symhash, name);
              continue;
            }
          else if (!(flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
                   && sym->st_type != STT_OBJECT)
            {
              ctf_err_warn (fp, 1, 0, _("symbol %s (%x) added to CTF as a "
                                        "data object but is of type %x. "
                                        "The symbol type lookup tables "
                                        "are probably corrupted"),
                            sym->st_name, sym->st_symidx, sym->st_type);
              ctf_dynhash_remove (symhash, name);
              continue;
            }

          ctf_dynhash_remove (linker_known, name);

          if (*max < sym->st_symidx)
            *max = sym->st_symidx;
        }
      else
        (*max)++;

      *unpadsize += sizeof (uint32_t);
      (*count)++;
    }
  if (err != ECTF_NEXT_END)
    {
      ctf_err_warn (fp, 0, err, _("iterating over CTF symtypetab during "
                                  "serialization"));
      ctf_dynhash_destroy (linker_known);
      return (ctf_set_errno (fp, err));
    }

  if (!(flags & CTF_SYMTYPETAB_FORCE_INDEXED))
    {
      while ((err = ctf_dynhash_cnext (linker_known, &i, NULL, &ctf_sym)) == 0)
        {
          ctf_link_sym_t *sym = (ctf_link_sym_t *) ctf_sym;

          if (sym->st_symidx > *max)
            beyond_max++;
        }
      if (err != ECTF_NEXT_END)
        {
          ctf_err_warn (fp, 0, err, _("iterating over linker-known symbols "
                                      "during CTF serialization"));
          ctf_dynhash_destroy (linker_known);
          return (ctf_set_errno (fp, err));
        }
    }

  *idxsize = *count * sizeof (uint32_t);
  if (!(flags & CTF_SYMTYPETAB_FORCE_INDEXED))
    *padsize = (ctf_dynhash_elements (linker_known) - beyond_max) * sizeof (uint32_t);

  ctf_dynhash_destroy (linker_known);
  return 0;
}

/* Emit an objt or func symtypetab into DP in a particular order defined by an
   array of ctf_link_sym_t or symbol names passed in. The index has NIDX
   elements in it: unindexed output would terminate at symbol OUTMAX and is in
   any case no larger than SIZE bytes. Some index elements are expected to be
   skipped: see symtypetab_density. The linker-reported set of symbols (if any)
   is found in SYMFP. */
static int
emit_symtypetab (ctf_dict_t *fp, ctf_dict_t *symfp, uint32_t *dp,
                 ctf_link_sym_t **idx, const char **nameidx, uint32_t nidx,
                 uint32_t outmax, int size, int flags)
{
  uint32_t i;
  uint32_t *dpp = dp;
  ctf_dynhash_t *symhash;

  ctf_dprintf ("Emitting table of size %i, outmax %u, %u symtypetab entries, "
               "flags %i\n", size, outmax, nidx, flags);

  /* Empty table? Nothing to do. */
  if (size == 0)
    return 0;

  if (flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
    symhash = fp->ctf_funchash;
  else
    symhash = fp->ctf_objthash;

  for (i = 0; i < nidx; i++)
    {
      const char *sym_name;
      void *type;

      /* If we have a linker-reported set of symbols, we may be given that set
         to work from, or a set of symbol names. In both cases we want to look
         at the corresponding linker-reported symbol (if any). */
      if (!(flags & CTF_SYMTYPETAB_FORCE_INDEXED))
        {
          ctf_link_sym_t *this_link_sym;

          if (idx)
            this_link_sym = idx[i];
          else
            this_link_sym = ctf_dynhash_lookup (symfp->ctf_dynsyms, nameidx[i]);

          /* Unreported symbol number. No pad, no nothing. */
          if (!this_link_sym)
            continue;

          /* Symbol of the wrong type, or skippable? This symbol is not in this
             table. */
          if (((flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
               && this_link_sym->st_type != STT_FUNC)
              || (!(flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
                  && this_link_sym->st_type != STT_OBJECT))
            continue;

          if (ctf_symtab_skippable (this_link_sym))
            continue;

          sym_name = this_link_sym->st_name;

          /* Linker reports symbol of a different type to the symbol we actually
             added? Skip the symbol. No pad, since the symbol doesn't actually
             belong in this table at all. (Warned about in
             symtypetab_density.) */
          if ((this_link_sym->st_type == STT_FUNC)
              && (ctf_dynhash_lookup (fp->ctf_objthash, sym_name)))
            continue;

          if ((this_link_sym->st_type == STT_OBJECT)
              && (ctf_dynhash_lookup (fp->ctf_funchash, sym_name)))
            continue;
        }
      else
        sym_name = nameidx[i];

      /* Symbol in index but no type set? Silently skip and (optionally)
         pad. (In force-indexed mode, this is also where we track symbols of
         the wrong type for this round of insertion.) */
      if ((type = ctf_dynhash_lookup (symhash, sym_name)) == NULL)
        {
          if (flags & CTF_SYMTYPETAB_EMIT_PAD)
            *dpp++ = 0;
          continue;
        }

      if (!ctf_assert (fp, (((char *) dpp) - (char *) dp) < size))
        return -1; /* errno is set for us. */

      *dpp++ = (ctf_id_t) (uintptr_t) type;

      /* When emitting unindexed output, all later symbols are pads: stop
         early. */
      if ((flags & CTF_SYMTYPETAB_EMIT_PAD) && idx[i]->st_symidx == outmax)
        break;
    }

  return 0;
}

/* Emit an objt or func symtypetab index into DP in a particular order defined by
   an array of symbol names passed in. Stop at NIDX. The linker-reported set
   of symbols (if any) is found in SYMFP. */
static int
emit_symtypetab_index (ctf_dict_t *fp, ctf_dict_t *symfp, uint32_t *dp,
                       const char **idx, uint32_t nidx, int size, int flags)
{
  uint32_t i;
  uint32_t *dpp = dp;
  ctf_dynhash_t *symhash;

  ctf_dprintf ("Emitting index of size %i, %u entries reported by linker, "
               "flags %i\n", size, nidx, flags);

  /* Empty table? Nothing to do. */
  if (size == 0)
    return 0;

  if (flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
    symhash = fp->ctf_funchash;
  else
    symhash = fp->ctf_objthash;

  /* Indexes should always be unpadded. */
  if (!ctf_assert (fp, !(flags & CTF_SYMTYPETAB_EMIT_PAD)))
    return -1; /* errno is set for us. */

  for (i = 0; i < nidx; i++)
    {
      const char *sym_name;
      void *type;

      if (!(flags & CTF_SYMTYPETAB_FORCE_INDEXED))
        {
          ctf_link_sym_t *this_link_sym;

          this_link_sym = ctf_dynhash_lookup (symfp->ctf_dynsyms, idx[i]);

          /* This is an index: unreported symbols should never appear in it. */
          if (!ctf_assert (fp, this_link_sym != NULL))
            return -1; /* errno is set for us. */

          /* Symbol of the wrong type, or skippable? This symbol is not in this
             table. */
          if (((flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
               && this_link_sym->st_type != STT_FUNC)
              || (!(flags & CTF_SYMTYPETAB_EMIT_FUNCTION)
                  && this_link_sym->st_type != STT_OBJECT))
            continue;

          if (ctf_symtab_skippable (this_link_sym))
            continue;

          sym_name = this_link_sym->st_name;

          /* Linker reports symbol of a different type to the symbol we actually
             added? Skip the symbol. */
          if ((this_link_sym->st_type == STT_FUNC)
              && (ctf_dynhash_lookup (fp->ctf_objthash, sym_name)))
            continue;

          if ((this_link_sym->st_type == STT_OBJECT)
              && (ctf_dynhash_lookup (fp->ctf_funchash, sym_name)))
            continue;
        }
      else
        sym_name = idx[i];

      /* Symbol in index and reported by linker, but no type set? Silently skip
         and (optionally) pad. (In force-indexed mode, this is also where we
         track symbols of the wrong type for this round of insertion.) */
      if ((type = ctf_dynhash_lookup (symhash, sym_name)) == NULL)
        continue;

      ctf_str_add_ref (fp, sym_name, dpp++);

      if (!ctf_assert (fp, (((char *) dpp) - (char *) dp) <= size))
        return -1; /* errno is set for us. */
    }

  return 0;
}
include, libctf, ld: extend variable section to contain functions too
The CTF variable section is an optional (usually-not-present) section in
the CTF dict which contains name -> type mappings corresponding to data
symbols that are present in the linker input but not in the output
symbol table: the idea is that programs that use their own symbol-
resolution mechanisms can use this section to look up the types of
symbols they have found using their own mechanism.
Because these removed symbols (mostly static variables, functions, etc)
all have names that are unlikely to appear in the ELF symtab and because
very few programs have their own symbol-resolution mechanisms, a special
linker flag (--ctf-variables) is needed to emit this section.
Historically, we emitted only removed data symbols into the variable
section. This seemed to make sense at the time, but in hindsight it
really doesn't: functions are symbols too, and a C program can look them
up just like any other type. So extend the variable section so that it
contains all static function symbols too (if it is emitted at all), with
types of kind CTF_K_FUNCTION.
This is a little fiddly. We relied on compiler assistance for data
symbols: the compiler simply emits all data symbols twice, once into the
symtypetab as an indexed symbol and once into the variable section.
Rather than wait for a suitably adjusted compiler that does the same for
function symbols, we can pluck unreported function symbols out of the
symtab and add them to the variable section ourselves. While we're at
it, we do the same with data symbols: this is redundant right now
because the compiler does it, but it costs very little time and lets the
compiler drop this kludge and save a little space in .o files.
include/
* ctf.h: Mention the new things we can see in the variable
section.
ld/
* testsuite/ld-ctf/data-func-conflicted-vars.d: New test.
libctf/
* ctf-link.c (ctf_link_deduplicating_variables): Duplicate
symbols into the variable section too.
* ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename
to...
(symtypetab_delete_nonstatics): ... this. Check the funchash
when pruning redundant variables.
(ctf_symtypetab_sect_sizes): Adjust accordingly.
* NEWS: Describe this change.
2022-03-16 23:29:25 +08:00
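
The pruning step mentioned in the commit message above (dropping redundant variables, now checking the function hash as well as the object hash) can be illustrated with a simplified sketch. This is not the libctf implementation: toy_name_in and toy_prune_reported are hypothetical helpers, plain string arrays stand in for the dynhashes, and the real code's additional checks against the linker-reported symbol set are assumed away.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if NAME appears in the NSET-element array SET.
   (Hypothetical stand-in for a hash lookup.)  */
static int
toy_name_in (const char *name, const char **set, size_t nset)
{
  size_t i;

  for (i = 0; i < nset; i++)
    if (strcmp (name, set[i]) == 0)
      return 1;
  return 0;
}

/* Compact VARS in place, dropping every name that is already present as a
   data object (OBJTS) or as a function (FUNCS).  Returns the number of
   variables kept.  */
static size_t
toy_prune_reported (const char **vars, size_t nvars,
		    const char **objts, size_t nobjts,
		    const char **funcs, size_t nfuncs)
{
  size_t i, kept = 0;

  assert (vars != NULL || nvars == 0);

  for (i = 0; i < nvars; i++)
    {
      if (toy_name_in (vars[i], objts, nobjts)
	  || toy_name_in (vars[i], funcs, nfuncs))
	continue;		/* Redundant: already in a symtypetab.  */

      vars[kept++] = vars[i];
    }

  return kept;
}
```

Checking both sets matters because, per the commit message, the variable section may now hold function symbols as well as data symbols.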
|
|
|
/* Delete symbols that have been assigned names from the variable section. Must
|
|
|
|
be called from within ctf_serialize, because that is the only place you can
|
|
|
|
safely delete variables without messing up ctf_rollback. */
|
2021-03-18 20:37:52 +08:00
|
|
|
|
libctf: split up ctf_serialize
ctf_serialize and its various pieces may be split out into a separate
file now, but ctf_serialize is still far too long and disordered, mixing
header initialization, sizing of multiple CTF sections, sorting and
emission of multiple CTF sections, strtab construction and ctf_dict_t
copying into a single ugly organically-grown mess.
Fix the worst of this by migrating all section sizing and emission into
separate functions, two per section (or class of section in the case of
the symtypetabs). Only the variable section is now sized and emitted
directly in ctf_serialize (because it only takes about three lines to do
so).
The section sizes themselves are still maintained by ctf_serialize so
that it can work out the header offsets, but ctf_symtypetab_sect_sizes
and ctf_emit_symtypetab_sects share a lot of extra state: migrate that
into a shared structure, emit_symtypetab_state_t.
(Test results unchanged.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-serialize.c: General reshuffling, and...
(emit_symtypetab_state_t): New, migrated from
local variables in ctf_serialize.
(ctf_serialize): Split out most section sizing and
emission.
(ctf_symtypetab_sect_sizes): New (split out).
(ctf_emit_symtypetab_sects): Likewise.
(ctf_type_sect_size): Likewise.
(ctf_emit_type_sect): Likewise.
2021-03-18 20:37:52 +08:00
|
|
|
static int
|
include, libctf, ld: extend variable section to contain functions too
The CTF variable section is an optional (usually-not-present) section in
the CTF dict which contains name -> type mappings corresponding to data
symbols that are present in the linker input but not in the output
symbol table: the idea is that programs that use their own symbol-
resolution mechanisms can use this section to look up the types of
symbols they have found using their own mechanism.
Because these removed symbols (mostly static variables, functions, etc)
all have names that are unlikely to appear in the ELF symtab and because
very few programs have their own symbol-resolution mechanisms, a special
linker flag (--ctf-variables) is needed to emit this section.
Historically, we emitted only removed data symbols into the variable
section. This seemed to make sense at the time, but in hindsight it
really doesn't: functions are symbols too, and a C program can look them
up just like any other type. So extend the variable section so that it
contains all static function symbols too (if it is emitted at all), with
types of kind CTF_K_FUNCTION.
This is a little fiddly. We relied on compiler assistance for data
symbols: the compiler simply emits all data symbols twice, once into the
symtypetab as an indexed symbol and once into the variable section.
Rather than wait for a suitably adjusted compiler that does the same for
function symbols, we can pluck unreported function symbols out of the
symtab and add them to the variable section ourselves. While we're at
it, we do the same with data symbols: this is redundant right now
because the compiler does it, but it costs very little time and lets the
compiler drop this kludge and save a little space in .o files.
include/
* ctf.h: Mention the new things we can see in the variable
section.
ld/
* testsuite/ld-ctf/data-func-conflicted-vars.d: New test.
libctf/
* ctf-link.c (ctf_link_deduplicating_variables): Duplicate
symbols into the variable section too.
* ctf-serialize.c (symtypetab_delete_nonstatic_vars): Rename
to...
(symtypetab_delete_nonstatics): ... this. Check the funchash
when pruning redundant variables.
(ctf_symtypetab_sect_sizes): Adjust accordingly.
* NEWS: Describe this change.
2022-03-16 23:29:25 +08:00
symtypetab_delete_nonstatics (ctf_dict_t *fp, ctf_dict_t *symfp)
{
  ctf_dvdef_t *dvd, *nvd;
  ctf_id_t type;

  for (dvd = ctf_list_next (&fp->ctf_dvdefs); dvd != NULL; dvd = nvd)
    {
      nvd = ctf_list_next (dvd);

      if ((((type = (ctf_id_t) (uintptr_t)
             ctf_dynhash_lookup (fp->ctf_objthash, dvd->dvd_name)) > 0)
           || (type = (ctf_id_t) (uintptr_t)
               ctf_dynhash_lookup (fp->ctf_funchash, dvd->dvd_name)) > 0)
          && ctf_dynhash_lookup (symfp->ctf_dynsyms, dvd->dvd_name) != NULL
          && type == dvd->dvd_type)
        ctf_dvd_delete (fp, dvd);
    }

  return 0;
}

/* Figure out the sizes of the symtypetab sections, their indexed state,
   etc. */
static int
ctf_symtypetab_sect_sizes (ctf_dict_t *fp, emit_symtypetab_state_t *s,
                           ctf_header_t *hdr, size_t *objt_size,
                           size_t *func_size, size_t *objtidx_size,
                           size_t *funcidx_size)
{
  size_t nfuncs, nobjts;
  size_t objt_unpadsize, func_unpadsize, objt_padsize, func_padsize;

  /* If doing a writeout as part of linking, and the link flags request it,
     filter out reported symbols from the variable section, and filter out all
     other symbols from the symtypetab sections. (If we are not linking, the
     symbols are sorted; if we are linking, don't bother sorting if we are not
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
     filtering out reported symbols: this is almost certainly an ld -r and only
     the linker is likely to consume these symtypetabs again. The linker
     doesn't care what order the symtypetab entries are in, since it only
     iterates over symbols and does not use the ctf_lookup_by_symbol* API.) */

  s->sort_syms = 1;
  if (fp->ctf_flags & LCTF_LINKING)
    {
      s->filter_syms = !(fp->ctf_link_flags & CTF_LINK_NO_FILTER_REPORTED_SYMS);
      if (!s->filter_syms)
        s->sort_syms = 0;
    }

  /* Find the dict to which the linker has reported symbols, if any. */

  if (s->filter_syms)
    {
      if (!fp->ctf_dynsyms && fp->ctf_parent && fp->ctf_parent->ctf_dynsyms)
        s->symfp = fp->ctf_parent;
      else
        s->symfp = fp;
    }

  /* If not filtering, keep all potential symbols in an unsorted, indexed
     dict. */
  if (!s->filter_syms)
    s->symflags = CTF_SYMTYPETAB_FORCE_INDEXED;
  else
    hdr->cth_flags |= CTF_F_IDXSORTED;
|
2021-03-18 20:37:52 +08:00
|
|
|
|
  if (!ctf_assert (fp, (s->filter_syms && s->symfp)
                   || (!s->filter_syms && !s->symfp
                       && ((s->symflags & CTF_SYMTYPETAB_FORCE_INDEXED) != 0))))
    return -1;

  /* Work out the sizes of the object and function sections, and work out the
     number of pad (unassigned) symbols in each, and the overall size of the
     sections. */

  if (symtypetab_density (fp, s->symfp, fp->ctf_objthash, &nobjts, &s->maxobjt,
                          &objt_unpadsize, &objt_padsize, objtidx_size,
                          s->symflags) < 0)
    return -1; /* errno is set for us. */

  ctf_dprintf ("Object symtypetab: %i objects, max %i, unpadded size %i, "
               "%i bytes of pads, index size %i\n", (int) nobjts,
               (int) s->maxobjt, (int) objt_unpadsize, (int) objt_padsize,
               (int) *objtidx_size);

  if (symtypetab_density (fp, s->symfp, fp->ctf_funchash, &nfuncs, &s->maxfunc,
                          &func_unpadsize, &func_padsize, funcidx_size,
                          s->symflags | CTF_SYMTYPETAB_EMIT_FUNCTION) < 0)
    return -1; /* errno is set for us. */

  ctf_dprintf ("Function symtypetab: %i functions, max %i, unpadded size %i, "
               "%i bytes of pads, index size %i\n", (int) nfuncs,
               (int) s->maxfunc, (int) func_unpadsize, (int) func_padsize,
               (int) *funcidx_size);

  /* It is worth indexing each section if it would save space to do so, due to
     reducing the number of pads sufficiently. A pad is the same size as a
     single index entry: but index sections compress relatively poorly compared
     to constant pads, so it takes a lot of contiguous padding to equal one
     index section entry. It would be nice to be able to *verify* whether we
     would save space after compression rather than guessing, but this seems
     difficult, since it would require complete reserialization. Regardless, if
     the linker has not reported any symbols (e.g. if this is not a final link
     but just an ld -r), we must emit things in indexed fashion just as the
     compiler does. */

  *objt_size = objt_unpadsize;
  if (!(s->symflags & CTF_SYMTYPETAB_FORCE_INDEXED)
      && ((objt_padsize + objt_unpadsize) * CTF_INDEX_PAD_THRESHOLD
          > objt_padsize))
    {
      *objt_size += objt_padsize;
      *objtidx_size = 0;
    }

  *func_size = func_unpadsize;
  if (!(s->symflags & CTF_SYMTYPETAB_FORCE_INDEXED)
      && ((func_padsize + func_unpadsize) * CTF_INDEX_PAD_THRESHOLD
          > func_padsize))
    {
      *func_size += func_padsize;
      *funcidx_size = 0;
    }

  /* If we are filtering symbols out, those symbols that the linker has not
     reported have now been removed from the ctf_objthash and ctf_funchash.
     Delete entries from the variable section that duplicate newly-added
     symbols. There's no need to migrate new ones in: we do that (if necessary)
     in ctf_link_deduplicating_variables. */

  if (s->filter_syms && s->symfp->ctf_dynsyms &&
      symtypetab_delete_nonstatics (fp, s->symfp) < 0)
    return -1;

  return 0;
}

static int
ctf_emit_symtypetab_sects (ctf_dict_t *fp, emit_symtypetab_state_t *s,
                           unsigned char **tptr, size_t objt_size,
                           size_t func_size, size_t objtidx_size,
                           size_t funcidx_size)
{
  unsigned char *t = *tptr;
  size_t nsymtypes = 0;
  const char **sym_name_order = NULL;
  int err;

  /* Sort the linker's symbols into name order if need be. */

  if ((objtidx_size != 0) || (funcidx_size != 0))
    {
      ctf_next_t *i = NULL;
      void *symname;
      const char **walk;

      if (s->filter_syms)
        {
          if (s->symfp->ctf_dynsyms)
            nsymtypes = ctf_dynhash_elements (s->symfp->ctf_dynsyms);
          else
            nsymtypes = 0;
        }
      else
        nsymtypes = ctf_dynhash_elements (fp->ctf_objthash)
          + ctf_dynhash_elements (fp->ctf_funchash);

      if ((sym_name_order = calloc (nsymtypes, sizeof (const char *))) == NULL)
        goto oom;

      walk = sym_name_order;

if (s->filter_syms)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
libctf: split up ctf_serialize
ctf_serialize and its various pieces may be split out into a separate
file now, but ctf_serialize is still far too long and disordered, mixing
header initialization, sizing of multiple CTF sections, sorting and
emission of multiple CTF sections, strtab construction and ctf_dict_t
copying into a single ugly organically-grown mess.
Fix the worst of this by migrating all section sizing and emission into
separate functions, two per section (or class of section in the case of
the symtypetabs). Only the variable section is now sized and emitted
directly in ctf_serialize (because it only takes about three lines to do
so).
The section sizes themselves are still maintained by ctf_serialize so
that it can work out the header offsets, but ctf_symtypetab_sect_sizes
and ctf_emit_symtypetab_sects share a lot of extra state: migrate that
into a shared structure, emit_symtypetab_state_t.
(Test results unchanged.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-serialize.c: General reshuffling, and...
(emit_symtypetab_state_t): New, migrated from
local variables in ctf_serialize.
(ctf_serialize): Split out most section sizing and
emission.
(ctf_symtypetab_sect_sizes): New (split out).
(ctf_emit_symtypetab_sects): Likewise.
(ctf_type_sect_size): Likewise.
(ctf_emit_type_sect): Likewise.
2021-03-18 20:37:52 +08:00
|
|
|
if (s->symfp->ctf_dynsyms)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
libctf: split up ctf_serialize
ctf_serialize and its various pieces may be split out into a separate
file now, but ctf_serialize is still far too long and disordered, mixing
header initialization, sizing of multiple CTF sections, sorting and
emission of multiple CTF sections, strtab construction and ctf_dict_t
copying into a single ugly organically-grown mess.
Fix the worst of this by migrating all section sizing and emission into
separate functions, two per section (or class of section in the case of
the symtypetabs). Only the variable section is now sized and emitted
directly in ctf_serialize (because it only takes about three lines to do
so).
The section sizes themselves are still maintained by ctf_serialize so
that it can work out the header offsets, but ctf_symtypetab_sect_sizes
and ctf_emit_symtypetab_sects share a lot of extra state: migrate that
into a shared structure, emit_symtypetab_state_t.
(Test results unchanged.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-serialize.c: General reshuffling, and...
(emit_symtypetab_state_t): New, migrated from
local variables in ctf_serialize.
(ctf_serialize): Split out most section sizing and
emission.
(ctf_symtypetab_sect_sizes): New (split out).
(ctf_emit_symtypetab_sects): Likewise.
(ctf_type_sect_size): Likewise.
(ctf_emit_type_sect): Likewise.
2021-03-18 20:37:52 +08:00
|
|
|
while ((err = ctf_dynhash_next_sorted (s->symfp->ctf_dynsyms, &i,
|
2021-03-18 20:37:52 +08:00
|
|
|
&symname, NULL,
|
|
|
|
ctf_dynhash_sort_by_name,
|
|
|
|
NULL)) == 0)
|
|
|
|
*walk++ = (const char *) symname;
|
|
|
|
if (err != ECTF_NEXT_END)
|
|
|
|
goto symerr;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
else
|
|
|
|
{
|
|
|
|
ctf_hash_sort_f sort_fun = NULL;
|
|
|
|
|
|
|
|
/* Since we partition the set of symbols back into objt and func,
|
|
|
|
we can sort the two independently without harm. */
|
libctf: split up ctf_serialize
ctf_serialize and its various pieces may be split out into a separate
file now, but ctf_serialize is still far too long and disordered, mixing
header initialization, sizing of multiple CTF sections, sorting and
emission of multiple CTF sections, strtab construction and ctf_dict_t
copying into a single ugly organically-grown mess.
Fix the worst of this by migrating all section sizing and emission into
separate functions, two per section (or class of section in the case of
the symtypetabs). Only the variable section is now sized and emitted
directly in ctf_serialize (because it only takes about three lines to do
so).
The section sizes themselves are still maintained by ctf_serialize so
that it can work out the header offsets, but ctf_symtypetab_sect_sizes
and ctf_emit_symtypetab_sects share a lot of extra state: migrate that
into a shared structure, emit_symtypetab_state_t.
(Test results unchanged.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-serialize.c: General reshuffling, and...
(emit_symtypetab_state_t): New, migrated from
local variables in ctf_serialize.
(ctf_serialize): Split out most section sizing and
emission.
(ctf_symtypetab_sect_sizes): New (split out).
(ctf_emit_symtypetab_sects): Likewise.
(ctf_type_sect_size): Likewise.
(ctf_emit_type_sect): Likewise.
2021-03-18 20:37:52 +08:00
|
|
|
if (s->sort_syms)
|
2021-03-18 20:37:52 +08:00
|
|
|
sort_fun = ctf_dynhash_sort_by_name;
|
|
|
|
|
|
|
|
while ((err = ctf_dynhash_next_sorted (fp->ctf_objthash, &i, &symname,
|
|
|
|
NULL, sort_fun, NULL)) == 0)
|
|
|
|
*walk++ = (const char *) symname;
|
|
|
|
if (err != ECTF_NEXT_END)
|
|
|
|
goto symerr;
|
|
|
|
|
|
|
|
while ((err = ctf_dynhash_next_sorted (fp->ctf_funchash, &i, &symname,
|
|
|
|
NULL, sort_fun, NULL)) == 0)
|
|
|
|
*walk++ = (const char *) symname;
|
|
|
|
if (err != ECTF_NEXT_END)
|
|
|
|
goto symerr;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
  /* Emit the object and function sections, and if necessary their indexes.
     Emission is done in symtab order if there is no index, and in index
     (name) order otherwise. */

  if ((objtidx_size == 0) && s->symfp && s->symfp->ctf_dynsymidx)
    {
      ctf_dprintf ("Emitting unindexed objt symtypetab\n");
      if (emit_symtypetab (fp, s->symfp, (uint32_t *) t,
                           s->symfp->ctf_dynsymidx, NULL,
                           s->symfp->ctf_dynsymmax + 1, s->maxobjt,
                           objt_size, s->symflags | CTF_SYMTYPETAB_EMIT_PAD) < 0)
        goto err; /* errno is set for us. */
    }
  else
    {
      ctf_dprintf ("Emitting indexed objt symtypetab\n");
      if (emit_symtypetab (fp, s->symfp, (uint32_t *) t, NULL,
                           sym_name_order, nsymtypes, s->maxobjt,
                           objt_size, s->symflags) < 0)
        goto err; /* errno is set for us. */
    }

  t += objt_size;

  if ((funcidx_size == 0) && s->symfp && s->symfp->ctf_dynsymidx)
    {
      ctf_dprintf ("Emitting unindexed func symtypetab\n");
      if (emit_symtypetab (fp, s->symfp, (uint32_t *) t,
                           s->symfp->ctf_dynsymidx, NULL,
                           s->symfp->ctf_dynsymmax + 1, s->maxfunc,
                           func_size, s->symflags | CTF_SYMTYPETAB_EMIT_FUNCTION
                           | CTF_SYMTYPETAB_EMIT_PAD) < 0)
        goto err; /* errno is set for us. */
    }
  else
    {
      ctf_dprintf ("Emitting indexed func symtypetab\n");
      if (emit_symtypetab (fp, s->symfp, (uint32_t *) t, NULL, sym_name_order,
                           nsymtypes, s->maxfunc, func_size,
                           s->symflags | CTF_SYMTYPETAB_EMIT_FUNCTION) < 0)
        goto err; /* errno is set for us. */
    }

  t += func_size;

  if (objtidx_size > 0)
    if (emit_symtypetab_index (fp, s->symfp, (uint32_t *) t, sym_name_order,
                               nsymtypes, objtidx_size, s->symflags) < 0)
      goto err;

  t += objtidx_size;

  if (funcidx_size > 0)
    if (emit_symtypetab_index (fp, s->symfp, (uint32_t *) t, sym_name_order,
                               nsymtypes, funcidx_size,
                               s->symflags | CTF_SYMTYPETAB_EMIT_FUNCTION) < 0)
      goto err;

  t += funcidx_size;
  free (sym_name_order);
  *tptr = t;

  return 0;

 oom:
  ctf_set_errno (fp, EAGAIN);
  goto err;
 symerr:
  ctf_err_warn (fp, 0, err, _("error serializing symtypetabs"));
 err:
  free (sym_name_order);
  return -1;
}
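The name-ordering step of the function above can be sketched in isolation. This is a minimal model only: it assumes plain arrays of strings stand in for the ctf_objthash and ctf_funchash tables, and `build_name_order` and `cmp_names` are hypothetical names that do not exist in libctf. It shows why the two groups can be sorted independently: both land in one array, but each keeps its own contiguous region.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator over an array of `const char *' elements.  */
static int
cmp_names (const void *a, const void *b)
{
  const char *const *sa = a;
  const char *const *sb = b;
  return strcmp (*sa, *sb);
}

/* Hypothetical helper (not a libctf function): concatenate the object
   names and the function names into one freshly allocated array,
   optionally sorting each group on its own.  Returns NULL on allocation
   failure.  The caller frees the array, but not the strings.  */
static const char **
build_name_order (const char **objt, size_t nobjt,
                  const char **func, size_t nfunc, int sort)
{
  size_t n = nobjt + nfunc;
  const char **order, **walk;

  /* calloc with a zero count may legitimately return NULL, so this
     sketch always asks for at least one slot.  */
  if ((order = calloc (n ? n : 1, sizeof (const char *))) == NULL)
    return NULL;

  /* Walk a cursor through the array, one group after the other.  */
  walk = order;
  for (size_t i = 0; i < nobjt; i++)
    *walk++ = objt[i];
  for (size_t i = 0; i < nfunc; i++)
    *walk++ = func[i];
  assert ((size_t) (walk - order) == n);

  /* The groups never interleave, so sorting them separately is safe.  */
  if (sort)
    {
      qsort (order, nobjt, sizeof (const char *), cmp_names);
      qsort (order + nobjt, nfunc, sizeof (const char *), cmp_names);
    }
  return order;
}
```

Given object names { "zeta", "alpha" } and function names { "main", "exit" }, a sorted call yields alpha, zeta, exit, main: sorted within each group, with objects still ahead of functions.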

/* Type section. */

libctf: make ctf_serialize() actually serialize

ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.

It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.

Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.

... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.

(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)

This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.

libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
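The design described in the commit message above (serialize into a fresh buffer and hand it back, rather than swapping the dict's guts) can be sketched with a toy model. Everything here is an assumption made for illustration: `toy_dict_t`, `toy_header_t`, `TOY_MAGIC` and `toy_serialize` are hypothetical and far simpler than ctf_dict_t and the real CTF header. It also shows the size-then-emit pattern, with a cursor advanced by each section's precomputed size.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAGIC 0x746f79u	/* Hypothetical magic number, not CTF's.  */

/* Hypothetical toy dict: nothing but a list of 32-bit type records.  */
typedef struct toy_dict
{
  const uint32_t *types;
  size_t ntypes;
} toy_dict_t;

/* Hypothetical toy header: a magic number and a record count.  */
typedef struct toy_header
{
  uint32_t magic;
  uint32_t ntypes;
} toy_header_t;

/* Serialize D into a newly allocated buffer and return it, storing its
   size in *BUFSIZ.  D itself is only read, never modified.  Returns NULL
   on allocation failure.  The caller frees the buffer.  */
static unsigned char *
toy_serialize (const toy_dict_t *d, size_t *bufsiz)
{
  toy_header_t hdr = { TOY_MAGIC, (uint32_t) d->ntypes };
  size_t type_size = d->ntypes * sizeof (uint32_t);
  size_t size = sizeof (hdr) + type_size;
  unsigned char *buf, *t;

  /* Sizing pass is done: allocate once, then emit section by section.  */
  if ((buf = malloc (size)) == NULL)
    return NULL;

  t = buf;
  memcpy (t, &hdr, sizeof (hdr));
  t += sizeof (hdr);

  if (type_size > 0)
    memcpy (t, d->types, type_size);
  t += type_size;

  /* The cursor must end exactly where the sizing pass said it would.  */
  assert ((size_t) (t - buf) == size);

  *bufsiz = size;
  return buf;
}
```

A writer in this model would call toy_serialize and write the returned buffer out directly, so a second call on the same dict works just as well as the first.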
/* Iterate through the static types and the dynamic type definition list and
   compute the size of the CTF type section. */

static size_t
ctf_type_sect_size (ctf_dict_t *fp)
{
  ctf_dtdef_t *dtd;
  size_t type_size;

  for (type_size = 0, dtd = ctf_list_next (&fp->ctf_dtdefs);
       dtd != NULL; dtd = ctf_list_next (dtd))
    {
      uint32_t kind = LCTF_INFO_KIND (fp, dtd->dtd_data.ctt_info);
      uint32_t vlen = LCTF_INFO_VLEN (fp, dtd->dtd_data.ctt_info);
libctf: eliminate dtd_u, part 5: structs / unions

Eliminate the dynamic member storage for structs and unions as we have
for other dynamic types. This is much like the previous enum
elimination, except that structs and unions are the only types for which
a full-sized ctf_type_t might be needed. Up to now, this decision has
been made in the individual ctf_add_{struct,union}_sized functions and
duplicated in ctf_add_member_offset. The vlen machinery lets us
simplify this, always allocating a ctf_lmember_t and setting the
dtd_data's ctt_size to CTF_LSIZE_SENT: we figure out whether this is
really justified and (almost always) repack things down into a
ctf_stype_t at ctf_serialize time.

This allows us to eliminate the dynamic member paths from the iterators and
query functions in ctf-types.c in favour of always using the large-structure
vlen stuff for dynamic types (the diff is ugly but that's just because of the
volume of reindentation this calls for). This also means the large-structure
vlen stuff gets more heavily tested, which is nice because it was an almost
totally unused code path before now (it only kicked in for structures of size
>4GiB, and how often do you see those?)

The only extra complexity here is ctf_add_type. Back in the days of the
nondeduplicating linker this was called a ridiculous number of times for
countless identical copies of structures: eschewing the repeated lookups of the
dtd in ctf_add_member_offset and adding the members directly saved an amazing
amount of time. Now the nondeduplicating linker is gone, this is extreme
overoptimization: we can rip out the direct addition and use ctf_member_next and
ctf_add_member_offset, just like ctf_dedup_emit does.

We augment a ctf_add_type test to try adding a self-referential struct, the only
thing the ctf_add_type part of this change really perturbs.

This completes the elimination of dtd_u.

libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dtdef_t) <dtu_members>: Remove.
<dtd_u>: Likewise.
(ctf_dmdef_t): Remove.
(struct ctf_next) <u.ctn_dmd>: Remove.
* ctf-create.c (INITIAL_VLEN): New, more-or-less arbitrary initial
vlen size.
(ctf_add_enum): Use it.
(ctf_dtd_delete): Do not free the (removed) dmd; remove string
refs from the vlen on struct deletion.
(ctf_add_struct_sized): Populate the vlen: do it by hand if
promoting forwards. Always populate the full-size
lsizehi/lsizelo members.
(ctf_add_union_sized): Likewise.
(ctf_add_member_offset): Set up the vlen rather than the dmd.
Expand it as needed, repointing string refs via
ctf_str_move_pending. Add the member names as pending strings.
Always populate the full-size lsizehi/lsizelo members.
(membadd): Remove, folding back into...
(ctf_add_type_internal): ... here, adding via an ordinary
ctf_add_struct_sized and _next iteration rather than doing
everything by hand.
* ctf-serialize.c (ctf_copy_smembers): Remove this...
(ctf_copy_lmembers): ... and this...
(ctf_emit_type_sect): ... folding into here. Figure out if a
ctf_stype_t is needed here, not in ctf_add_*_sized.
(ctf_type_sect_size): Figure out the ctf_stype_t stuff the same
way here.
* ctf-types.c (ctf_member_next): Remove the dmd path and always
use the vlen. Force large-structure usage for dynamic types.
(ctf_type_align): Likewise.
(ctf_member_info): Likewise.
(ctf_type_rvisit): Likewise.
* testsuite/libctf-regression/type-add-unnamed-struct-ctf.c: Add a
self-referential type to this test.
* testsuite/libctf-regression/type-add-unnamed-struct.c: Adjusted
accordingly.
* testsuite/libctf-regression/type-add-unnamed-struct.lk: Likewise.
2021-03-18 20:37:52 +08:00
      size_t type_ctt_size = dtd->dtd_data.ctt_size;

      /* Shrink ctf_type_t-using types from a ctf_type_t to a ctf_stype_t
         if possible. */

      if (kind == CTF_K_STRUCT || kind == CTF_K_UNION)
        {
          size_t lsize = CTF_TYPE_LSIZE (&dtd->dtd_data);

          if (lsize <= CTF_MAX_SIZE)
            type_ctt_size = lsize;
        }

      if (type_ctt_size != CTF_LSIZE_SENT)
        type_size += sizeof (ctf_stype_t);
      else
        type_size += sizeof (ctf_type_t);

      switch (kind)
        {
        case CTF_K_INTEGER:
        case CTF_K_FLOAT:
          type_size += sizeof (uint32_t);
          break;
        case CTF_K_ARRAY:
          type_size += sizeof (ctf_array_t);
          break;
        case CTF_K_SLICE:
          type_size += sizeof (ctf_slice_t);
          break;
        case CTF_K_FUNCTION:
          type_size += sizeof (uint32_t) * (vlen + (vlen & 1));
          break;
        case CTF_K_STRUCT:
        case CTF_K_UNION:
          if (type_ctt_size < CTF_LSTRUCT_THRESH)
            type_size += sizeof (ctf_member_t) * vlen;
          else
            type_size += sizeof (ctf_lmember_t) * vlen;
          break;
        case CTF_K_ENUM:
          type_size += sizeof (ctf_enum_t) * vlen;
          break;
        }
    }

libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
  return type_size + fp->ctf_header->cth_stroff - fp->ctf_header->cth_typeoff;
}
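
The header-size decision made in ctf_type_sect_size can be tried out in isolation. What follows is a minimal sketch, not the libctf code: every sketch_* and SKETCH_* name is hypothetical, and the struct layouts and constant values are stand-ins assumed for illustration rather than taken from the CTF headers. It shows the two rules the function relies on: a struct or union whose size fits the short size field gets the short header and anything larger is marked with the sentinel and gets the long one, and a function's argument list is padded to an even number of 32-bit words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: assumed values, not the libctf definitions.  */
#define SKETCH_MAX_SIZE   0xfffffffeu  /* largest size the short field holds */
#define SKETCH_LSIZE_SENT 0xffffffffu  /* sentinel: size is in the long header */

/* Short header: name, info, and a 32-bit size.  */
typedef struct { uint32_t name, info, size; } sketch_stype_t;

/* Long header: the short header plus a 64-bit size split in two words.  */
typedef struct { uint32_t name, info, size, lsizehi, lsizelo; } sketch_type_t;

/* Value to store in the 32-bit size field for a type of size LSIZE.  */
static uint32_t
sketch_ctt_size (uint64_t lsize)
{
  return lsize <= SKETCH_MAX_SIZE ? (uint32_t) lsize : SKETCH_LSIZE_SENT;
}

/* Bytes of header needed: short unless the sentinel is in use.  */
static size_t
sketch_header_len (uint64_t lsize)
{
  if (sketch_ctt_size (lsize) != SKETCH_LSIZE_SENT)
    return sizeof (sketch_stype_t);
  return sizeof (sketch_type_t);
}

/* Bytes of argument data for a function type with VLEN arguments:
   an odd count is rounded up by one 32-bit word.  */
static size_t
sketch_function_vlen_bytes (uint32_t vlen)
{
  return sizeof (uint32_t) * (vlen + (vlen & 1));
}

/* Usage example: call this to check the rules above.  */
void
sketch_self_check (void)
{
  assert (sketch_header_len (16) == sizeof (sketch_stype_t));
  assert (sketch_header_len (UINT64_C (0x100000000)) == sizeof (sketch_type_t));
  assert (sketch_function_vlen_bytes (3) == sketch_function_vlen_bytes (4));
}
```

Keeping the decision in one small helper mirrors the constraint in the real code: the sizing pass and the emission pass must reach the same answer for every type, or the emitted records will not fit the buffer that was sized for them.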

/* Take a final lap through the dynamic type definition list and copy the
   appropriate type records to the output buffer, noting down the strings as
   we go. */

static void
ctf_emit_type_sect (ctf_dict_t *fp, unsigned char **tptr)
{
  unsigned char *t = *tptr;
  ctf_dtdef_t *dtd;

  for (dtd = ctf_list_next (&fp->ctf_dtdefs);
       dtd != NULL; dtd = ctf_list_next (dtd))
    {
      uint32_t kind = LCTF_INFO_KIND (fp, dtd->dtd_data.ctt_info);
      uint32_t vlen = LCTF_INFO_VLEN (fp, dtd->dtd_data.ctt_info);
      size_t type_ctt_size = dtd->dtd_data.ctt_size;
      size_t len;
      ctf_stype_t *copied;
      const char *name;
      size_t i;

      /* Shrink ctf_type_t-using types from a ctf_type_t to a ctf_stype_t
         if possible. */

      if (kind == CTF_K_STRUCT || kind == CTF_K_UNION)
        {
          size_t lsize = CTF_TYPE_LSIZE (&dtd->dtd_data);

          if (lsize <= CTF_MAX_SIZE)
            type_ctt_size = lsize;
        }

      if (type_ctt_size != CTF_LSIZE_SENT)
        len = sizeof (ctf_stype_t);
      else
        len = sizeof (ctf_type_t);

      memcpy (t, &dtd->dtd_data, len);
      copied = (ctf_stype_t *) t;  /* name is at the start: constant offset. */
      if (copied->ctt_name
          && (name = ctf_strraw (fp, copied->ctt_name)) != NULL)
2024-01-30 21:40:56 +08:00
        ctf_str_add_ref (fp, name, &copied->ctt_name);
      copied->ctt_size = type_ctt_size;
2021-03-18 20:37:52 +08:00
      t += len;

      switch (kind)
        {
        case CTF_K_INTEGER:
        case CTF_K_FLOAT:
libctf: eliminate dtd_u, part 1: int/float/slice
This series eliminates a lot of special-case code to handle dynamic
types (types added to writable dicts and not yet serialized).
Historically, when such types have variable-length data in their final
CTF representations, libctf has always worked by adding such types to a
special union (ctf_dtdef_t.dtd_u) in the dynamic type definition
structure, then picking the members out of this structure at
serialization time and packing them into their final form.
This has the advantage that the ctf_add_* code doesn't need to know
anything about the final CTF representation, but the significant
disadvantage that all code that looks up types in any way needs two code
paths, one for dynamic types, one for all others. Historically libctf
"handled" this by not supporting most type lookups on dynamic types at
all until ctf_update was called to do a complete reserialization of the
entire dict (it didn't emit an error, it just emitted wrong results).
Since commit 676c3ecbad6e9c4, which eliminated ctf_update in favour of
the internal-only ctf_serialize function, all the type-lookup paths
grew an extra branch to handle dynamic types.
We can eliminate this branch again by dropping the dtd_u stuff and
simply writing out the vlen in (close to) its final form at ctf_add_*
time: type lookup for types using this approach is then identical for
types in writable dicts and types that are in read-only ones, and
serialization is also simplified (we just need to write out the vlen
we already created).
The only complexity lies in type kinds for which multiple
vlen representations are valid depending on properties of the type,
e.g. structures. But we can start simple, adjusting ints, floats,
and slices to work this way, and leaving everything else as is.
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dtdef_t) <dtd_u.dtu_enc>: Remove.
<dtd_u.dtu_slice>: Likewise.
<dtd_vlen>: New.
* ctf-create.c (ctf_add_generic): Perhaps allocate it. All
callers adjusted.
(ctf_dtd_delete): Free it.
(ctf_add_slice): Use the dtd_vlen, not dtu_enc.
(ctf_add_encoded): Likewise. Assert that this must be an int or
float.
* ctf-serialize.c (ctf_emit_type_sect): Just copy the dtd_vlen.
* ctf-dedup.c (ctf_dedup_rhash_type): Use the dtd_vlen, not
dtu_slice.
* ctf-types.c (ctf_type_reference): Likewise.
(ctf_type_encoding): Remove most dynamic-type-specific code: just
get the vlen from the right place. Report failure to look up the
underlying type's encoding.
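
The approach described above (store the variable-length data in its final form when the type is added, then just copy it at serialization time) can be sketched in miniature as follows. This is only an illustration under stated assumptions: `demo_dtd_t`, `demo_add_encoded`, `demo_read_encoding` and `demo_emit` are hypothetical names invented here, not libctf's real structures or functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a dynamic type definition; this is NOT
   libctf's ctf_dtdef_t.  The variable-length data is held in the same
   byte layout it will have once serialized.  */
typedef struct demo_dtd
{
  unsigned char *vlen;		/* Final-form bytes, or NULL if none.  */
  size_t vlen_size;		/* Number of bytes in vlen.  */
} demo_dtd_t;

/* "Add" time: pack the encoding word straight into the vlen buffer.
   Returns 0 on success, -1 on allocation failure.  */
static int
demo_add_encoded (demo_dtd_t *dtd, uint32_t encoding)
{
  assert (dtd->vlen == NULL);

  if ((dtd->vlen = malloc (sizeof (uint32_t))) == NULL)
    return -1;

  memcpy (dtd->vlen, &encoding, sizeof (uint32_t));
  dtd->vlen_size = sizeof (uint32_t);
  return 0;
}

/* Lookup: the same read works on the dynamic buffer and on the
   serialized output, so no second code path is needed.  */
static uint32_t
demo_read_encoding (const unsigned char *vlen)
{
  uint32_t encoding;

  memcpy (&encoding, vlen, sizeof (uint32_t));
  return encoding;
}

/* Serialization: copy the bytes and return the advanced output pointer.  */
static unsigned char *
demo_emit (const demo_dtd_t *dtd, unsigned char *t)
{
  if (dtd->vlen != NULL)
    memcpy (t, dtd->vlen, dtd->vlen_size);
  return t + dtd->vlen_size;
}
```

The point of the sketch is that `demo_read_encoding` does not need to know whether it is looking at a not-yet-serialized type or at emitted output.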
2021-03-18 20:37:52 +08:00
          memcpy (t, dtd->dtd_vlen, sizeof (uint32_t));
          t += sizeof (uint32_t);
2021-03-18 20:37:52 +08:00
          break;

        case CTF_K_SLICE:
2021-03-18 20:37:52 +08:00
          memcpy (t, dtd->dtd_vlen, sizeof (struct ctf_slice));
2021-03-18 20:37:52 +08:00
          t += sizeof (struct ctf_slice);
          break;

        case CTF_K_ARRAY:
2021-03-18 20:37:52 +08:00
          memcpy (t, dtd->dtd_vlen, sizeof (struct ctf_array));
          t += sizeof (struct ctf_array);
2021-03-18 20:37:52 +08:00
          break;

        case CTF_K_FUNCTION:
2021-03-26 00:32:46 +08:00
          /* Functions with no args also have no vlen.  */
          if (dtd->dtd_vlen)
            memcpy (t, dtd->dtd_vlen, sizeof (uint32_t) * (vlen + (vlen & 1)));
2021-03-18 20:37:52 +08:00
          t += sizeof (uint32_t) * (vlen + (vlen & 1));
          break;
2021-03-18 20:37:52 +08:00
libctf: eliminate dtd_u, part 5: structs / unions
Eliminate the dynamic member storage for structs and unions as we have
for other dynamic types. This is much like the previous enum
elimination, except that structs and unions are the only types for which
a full-sized ctf_type_t might be needed. Up to now, this decision has
been made in the individual ctf_add_{struct,union}_sized functions and
duplicated in ctf_add_member_offset. The vlen machinery lets us
simplify this, always allocating a ctf_lmember_t and setting the
dtd_data's ctt_size to CTF_LSIZE_SENT: we figure out whether this is
really justified and (almost always) repack things down into a
ctf_stype_t at ctf_serialize time.
This allows us to eliminate the dynamic member paths from the iterators and
query functions in ctf-types.c in favour of always using the large-structure
vlen stuff for dynamic types (the diff is ugly but that's just because of the
volume of reindentation this calls for). This also means the large-structure
vlen stuff gets more heavily tested, which is nice because it was an almost
totally unused code path before now (it only kicked in for structures of size
>4GiB, and how often do you see those?)
The only extra complexity here is ctf_add_type. Back in the days of the
nondeduplicating linker this was called a ridiculous number of times for
countless identical copies of structures: eschewing the repeated lookups of the
dtd in ctf_add_member_offset and adding the members directly saved an amazing
amount of time. Now the nondeduplicating linker is gone, this is extreme
overoptimization: we can rip out the direct addition and use ctf_member_next and
ctf_add_member_offset, just like ctf_dedup_emit does.
We augment a ctf_add_type test to try adding a self-referential struct, the only
thing the ctf_add_type part of this change really perturbs.
This completes the elimination of dtd_u.
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-impl.h (ctf_dtdef_t) <dtu_members>: Remove.
<dtd_u>: Likewise.
(ctf_dmdef_t): Remove.
(struct ctf_next) <u.ctn_dmd>: Remove.
* ctf-create.c (INITIAL_VLEN): New, more-or-less arbitrary initial
vlen size.
(ctf_add_enum): Use it.
(ctf_dtd_delete): Do not free the (removed) dmd; remove string
refs from the vlen on struct deletion.
(ctf_add_struct_sized): Populate the vlen: do it by hand if
promoting forwards. Always populate the full-size
lsizehi/lsizelo members.
(ctf_add_union_sized): Likewise.
(ctf_add_member_offset): Set up the vlen rather than the dmd.
Expand it as needed, repointing string refs via
ctf_str_move_pending. Add the member names as pending strings.
Always populate the full-size lsizehi/lsizelo members.
(membadd): Remove, folding back into...
(ctf_add_type_internal): ... here, adding via an ordinary
ctf_add_struct_sized and _next iteration rather than doing
everything by hand.
* ctf-serialize.c (ctf_copy_smembers): Remove this...
(ctf_copy_lmembers): ... and this...
(ctf_emit_type_sect): ... folding into here. Figure out if a
ctf_stype_t is needed here, not in ctf_add_*_sized.
(ctf_type_sect_size): Figure out the ctf_stype_t stuff the same
way here.
* ctf-types.c (ctf_member_next): Remove the dmd path and always
use the vlen. Force large-structure usage for dynamic types.
(ctf_type_align): Likewise.
(ctf_member_info): Likewise.
(ctf_type_rvisit): Likewise.
* testsuite/libctf-regression/type-add-unnamed-struct-ctf.c: Add a
self-referential type to this test.
* testsuite/libctf-regression/type-add-unnamed-struct.c: Adjusted
accordingly.
* testsuite/libctf-regression/type-add-unnamed-struct.lk: Likewise.
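
The repacking step described above (always hold members in the large form while the type is dynamic, then narrow them at serialization time when the structure is small enough) might look roughly like this. It is a sketch under assumptions: the `demo_*` types, the helper names and the value of `DEMO_LSTRUCT_THRESH` are all invented for illustration and are not taken from libctf's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative threshold only: an assumption, not libctf's definition.  */
#define DEMO_LSTRUCT_THRESH 536870912u

/* Hypothetical large member: the offset is split into two 32-bit halves.  */
typedef struct demo_lmember
{
  uint32_t name;
  uint32_t offsethi;
  uint32_t type;
  uint32_t offsetlo;
} demo_lmember_t;

/* Hypothetical small member: a single 32-bit offset.  */
typedef struct demo_member
{
  uint32_t name;
  uint32_t offset;
  uint32_t type;
} demo_member_t;

/* Recombine the two halves of a large member's offset.  */
static uint64_t
demo_lmem_offset (const demo_lmember_t *m)
{
  return ((uint64_t) m->offsethi << 32) | m->offsetlo;
}

/* Emit N members into OUT, narrowing them if STRUCT_SIZE is below the
   threshold.  Returns the number of bytes written.  OUT must be large
   enough for N large members and suitably aligned for uint32_t.  */
static size_t
demo_emit_members (const demo_lmember_t *in, size_t n,
		   uint64_t struct_size, void *out)
{
  size_t i;

  if (struct_size < DEMO_LSTRUCT_THRESH)
    {
      demo_member_t *small = out;

      for (i = 0; i < n; i++)
	{
	  small[i].name = in[i].name;
	  small[i].type = in[i].type;
	  small[i].offset = (uint32_t) demo_lmem_offset (&in[i]);
	}
      return n * sizeof (demo_member_t);
    }
  else
    {
      demo_lmember_t *large = out;

      for (i = 0; i < n; i++)
	large[i] = in[i];
      return n * sizeof (demo_lmember_t);
    }
}
```

Deferring the choice to emission time means the code that adds members never has to decide which representation to use.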
2021-03-18 20:37:52 +08:00
          /* These need to be copied across element by element, depending on
             their ctt_size.  */
2021-03-18 20:37:52 +08:00
        case CTF_K_STRUCT:
        case CTF_K_UNION:
2021-03-18 20:37:52 +08:00
          {
            ctf_lmember_t *dtd_vlen = (ctf_lmember_t *) dtd->dtd_vlen;
            ctf_lmember_t *t_lvlen = (ctf_lmember_t *) t;
            ctf_member_t *t_vlen = (ctf_member_t *) t;

            for (i = 0; i < vlen; i++)
              {
                const char *name = ctf_strraw (fp, dtd_vlen[i].ctlm_name);

                ctf_str_add_ref (fp, name, &dtd_vlen[i].ctlm_name);

                if (type_ctt_size < CTF_LSTRUCT_THRESH)
                  {
                    t_vlen[i].ctm_name = dtd_vlen[i].ctlm_name;
                    t_vlen[i].ctm_type = dtd_vlen[i].ctlm_type;
                    t_vlen[i].ctm_offset = CTF_LMEM_OFFSET (&dtd_vlen[i]);
                    ctf_str_add_ref (fp, name, &t_vlen[i].ctm_name);
                  }
                else
                  {
                    t_lvlen[i] = dtd_vlen[i];
                    ctf_str_add_ref (fp, name, &t_lvlen[i].ctlm_name);
                  }
              }
          }

          if (type_ctt_size < CTF_LSTRUCT_THRESH)
            t += sizeof (ctf_member_t) * vlen;
2021-03-18 20:37:52 +08:00
          else
2021-03-18 20:37:52 +08:00
            t += sizeof (ctf_lmember_t) * vlen;
2021-03-18 20:37:52 +08:00
          break;

        case CTF_K_ENUM:
2021-03-18 20:37:52 +08:00
          {
            ctf_enum_t *dtd_vlen = (struct ctf_enum *) dtd->dtd_vlen;
            ctf_enum_t *t_vlen = (struct ctf_enum *) t;

            memcpy (t, dtd->dtd_vlen, sizeof (struct ctf_enum) * vlen);
            for (i = 0; i < vlen; i++)
              {
                const char *name = ctf_strraw (fp, dtd_vlen[i].cte_name);

                ctf_str_add_ref (fp, name, &t_vlen[i].cte_name);
                ctf_str_add_ref (fp, name, &dtd_vlen[i].cte_name);
              }
            t += sizeof (struct ctf_enum) * vlen;

            break;
          }
2021-03-18 20:37:52 +08:00
        }
    }
2021-03-18 20:37:52 +08:00
  *tptr = t;
}

/* Variable section.  */

/* Sort a newly-constructed static variable array.  */

typedef struct ctf_sort_var_arg_cb
{
  ctf_dict_t *fp;
  ctf_strs_t *strtab;
} ctf_sort_var_arg_cb_t;

static int
ctf_sort_var (const void *one_, const void *two_, void *arg_)
{
  const ctf_varent_t *one = one_;
  const ctf_varent_t *two = two_;
  ctf_sort_var_arg_cb_t *arg = arg_;

  return (strcmp (ctf_strraw_explicit (arg->fp, one->ctv_name, arg->strtab),
                  ctf_strraw_explicit (arg->fp, two->ctv_name, arg->strtab)));
}

/* Overall serialization.  */
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
/* Emit a new CTF dict which is a serialized copy of this one: also reify
   the string table and update all offsets in the current dict suitably.
   (This simplifies ctf-string.c a little, at the cost of storing a second
   copy of the strtab if this dict was originally read in via ctf_open.)
libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of CTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
2023-12-20 00:58:19 +08:00
|
|
|
|
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
   Other aspects of the existing dict are unchanged, although some
   static entries may be duplicated in the dynamic state (which should
   have no effect on visible operation). */

static unsigned char *
ctf_serialize (ctf_dict_t *fp, size_t *bufsiz)
2021-03-18 20:37:52 +08:00
{
  ctf_header_t hdr, *hdrp;
  ctf_dvdef_t *dvd;
  ctf_varent_t *dvarents;
libctf: rethink strtab writeout
This commit finally adjusts strtab writeout so that repeated writeouts, or
writeouts of a dict that was read in earlier, sort only the portion of the
strtab that was newly added.
There are three intertwined changes here:
- pull the contents of strtabs from newly ctf_bufopened dicts into the
atoms table, so that future additions will reuse the existing offset etc
rather than adding new identical strings
- allow the internal ctf_bufopen done by serialization to contribute its
existing atoms table, so that existing atoms can be used for the
remainder of the open process (like name table construction): this atoms
table currently gets thrown away in the mass reassignment done later in
ctf_serialize in any case, but it needs to be there during the open.
- rewrite ctf_str_write_strtab so that a) it uses iterators rather than
ctf_*_iter, reducing pointless structures which serve no other purpose
than to implement ordinary variable scope, but more clunkily, and b)
retains the existing strtab on the front of the new one, with its sort
retained, rather than resorting, so all existing already-written strtab
offsets remain valid across the call.
This latter change finally permits repeated serializations, and
reserializations of ctf_open()ed dicts, to work, but for now we keep the
code that prevents that because serialization is about to change again in a
way that will make it more obvious that doing such things is safe, and we
can take it out then.
(There are also some smaller changes like moving the purge of the refs table
into ctf_str_write_strtab(), since that's where the changes happen that
invalidate it, rather than doing it in ctf_serialize(). We also prohibit
something that has never worked, opening a dict and then reporting symbols
to it via ctf_link_add_strtab() et al: you must do that to newly-created
dicts which have had stuff ctf_link()ed into them. This is very unlikely
ever to be a problem in practice: linkers just don't do that sort of thing.)
libctf/
* ctf-create.c (ctf_create): Add (temporary) atoms arg.
* ctf-impl.h (struct ctf_dict.ctf_dynstrtab): New.
(ctf_str_create_atoms): Adjust.
(ctf_str_write_strtab): Likewise.
(ctf_simple_open_internal): Likewise.
* ctf-open.c (ctf_simple_open_internal): Add atoms arg.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Initialize just enough of an
atoms table: pre-init from the atoms arg if supplied.
(ctf_simple_open): Adjust.
* ctf-serialize.c (ctf_serialize): Constify the strtab.
Move ref list purging into ctf_str_write_strtab.
Initialize the new dict with the old dict's atoms table.
Accept the new strtab from ctf_str_write_strtab.
Adjust for addition of ctf_dynstrtab.
* ctf-string.c (ctf_strraw_explicit): Improve comments.
(ctf_str_create_atoms): Prepopulate from an existing atoms table,
or alternatively pull in all strings from the strtab and turn
them into atoms.
(ctf_str_free_atoms): Free the dynstrtab and its strtab.
(struct ctf_strtab_write_state): Remove.
(ctf_str_count_strtab): Fold this...
(ctf_str_populate_sorttab): ... and this...
(ctf_str_write_strtab): ... into this. Prepend existing strings
to the strtab rather than resorting them (and wrecking their
offsets). Keep the dynstrtab updated. Update refs for all
atoms with refs, whether or not they are strings newly added
to the strtab.
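The "prepend existing strings, sort only the new ones" idea described above can be illustrated with a small standalone sketch. This is not libctf code: toy_strtab_t, toy_cmp_str and toy_strtab_append are hypothetical names invented for illustration, and the sketch omits the atoms table, deduplication, refs updating and the leading null string of a real CTF strtab. It only shows why offsets handed out by an earlier writeout stay valid when the old bytes are kept at the front.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy string table: a flat run of NUL-terminated strings.  */
typedef struct toy_strtab
{
  char *buf;   /* Concatenated strings.  */
  size_t len;  /* Bytes in use in BUF.  */
} toy_strtab_t;

/* qsort comparator over an array of C string pointers.  */
static int
toy_cmp_str (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Append the N strings in STRS to TAB.  Only the new strings are sorted
   (STRS is reordered in place); the existing bytes of TAB are not moved,
   so offsets returned by earlier calls remain valid.  OFFSETS[i] receives
   the offset of the i-th string in sorted order.  Returns 0 on success,
   -1 on allocation failure.  */
static int
toy_strtab_append (toy_strtab_t *tab, const char **strs, size_t n,
                   size_t *offsets)
{
  size_t extra = 0, i;
  char *newbuf;

  assert (tab != NULL);
  if (n == 0)
    return 0;

  for (i = 0; i < n; i++)
    extra += strlen (strs[i]) + 1;

  if ((newbuf = realloc (tab->buf, tab->len + extra)) == NULL)
    return -1;
  tab->buf = newbuf;

  /* Sort only the newly added strings.  */
  qsort (strs, n, sizeof (const char *), toy_cmp_str);

  for (i = 0; i < n; i++)
    {
      size_t slen = strlen (strs[i]) + 1;

      memcpy (tab->buf + tab->len, strs[i], slen);
      offsets[i] = tab->len;
      tab->len += slen;
    }
  return 0;
}
```

Calling toy_strtab_append twice on the same table models two successive writeouts: the second call appends after the first call's strings instead of re-sorting the whole table, which is the property that lets already-written offsets survive.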
2024-03-26 03:07:43 +08:00
  const ctf_strs_writable_t *strtab;
libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of CTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
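The "read-only range of types" rule described in this commit message can be modelled with a short standalone sketch. Nothing here is libctf API: toy_dict_t, toy_open, toy_add_struct, toy_add_member and the TOY_ERR_* codes are hypothetical, with the stypes field playing the role the message gives to ctf_stypes and TOY_ERR_RDONLY standing in for ECTF_RDONLY. The sketch assumes type IDs start at 1 and that IDs up to and including stypes are the static ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ERR_RDONLY 1        /* Stand-in for ECTF_RDONLY.  */
#define TOY_ERR_BADID 2         /* No such type.  */
#define TOY_ERR_FULL 3          /* Toy table exhausted.  */
#define TOY_MAX_TYPES 16

/* Hypothetical toy dict: types are just IDs 1..ntypes with a member count.  */
typedef struct toy_dict
{
  size_t stypes;                    /* Types present when "opened".  */
  size_t ntypes;                    /* Static plus dynamically added types.  */
  int nmembers[TOY_MAX_TYPES + 1];  /* Member count, indexed by type ID.  */
  int err;                          /* Last error code.  */
} toy_dict_t;

/* Model opening a dict that already contains EXISTING types.  */
static void
toy_open (toy_dict_t *d, size_t existing)
{
  assert (existing <= TOY_MAX_TYPES);
  memset (d, 0, sizeof (*d));
  d->stypes = d->ntypes = existing;
}

/* Add a new (dynamic) struct type: always allowed while there is room.
   Returns the new type ID, or -1 on error.  */
static long
toy_add_struct (toy_dict_t *d)
{
  if (d->ntypes >= TOY_MAX_TYPES)
    {
      d->err = TOY_ERR_FULL;
      return -1;
    }
  return (long) ++d->ntypes;
}

/* Add a member to type ID.  Types in the static range are read-only;
   types added after opening may be modified.  Returns 0 or -1.  */
static int
toy_add_member (toy_dict_t *d, size_t id)
{
  if (id == 0 || id > d->ntypes)
    {
      d->err = TOY_ERR_BADID;
      return -1;
    }
  if (id <= d->stypes)
    {
      d->err = TOY_ERR_RDONLY;
      return -1;
    }
  d->nmembers[id]++;
  return 0;
}
```

The point of the design is that the read-only check depends on which type is being modified rather than on a per-dict flag, so adding new types to an opened dict succeeds while changes to pre-existing types still fail.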
2023-12-20 00:58:19 +08:00
  int sym_functions = 0;
2021-03-18 20:37:52 +08:00
  unsigned char *t;
  unsigned long i;
  size_t buf_size, type_size, objt_size, func_size;
  size_t funcidx_size, objtidx_size;
  size_t nvars;
  unsigned char *buf = NULL, *newbuf;

  emit_symtypetab_state_t symstate;
  memset (&symstate, 0, sizeof (emit_symtypetab_state_t));

  /* Fill in an initial CTF header. We will leave the label, object,
     and function sections empty and only output a header, type section,
     and string table. The type section begins at a 4-byte aligned
     boundary past the CTF header itself (at relative offset zero). The flag
     indicating a new-style function info section (an array of CTF_K_FUNCTION
     type IDs in the types section) is flipped on. */

  memset (&hdr, 0, sizeof (hdr));
  hdr.cth_magic = CTF_MAGIC;
  hdr.cth_version = CTF_VERSION;

  /* This is a new-format func info section, and the symtab and strtab come out
     of the dynsym and dynstr these days. */
  hdr.cth_flags = (CTF_F_NEWFUNCINFO | CTF_F_DYNSTR);
2023-12-20 00:58:19 +08:00
  /* Propagate all symbols in the symtypetabs into the dynamic state, so that
     we can put them back in the right order. Symbols already in the dynamic
2024-03-26 21:04:20 +08:00
     state, likely due to repeated serialization, are left unchanged. */
libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of LCTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
2023-12-20 00:58:19 +08:00
  do
    {
      ctf_next_t *it = NULL;
      const char *sym_name;
      ctf_id_t sym;

      while ((sym = ctf_symbol_next_static (fp, &it, &sym_name,
                                            sym_functions)) != CTF_ERR)
        if ((ctf_add_funcobjt_sym_forced (fp, sym_functions, sym_name, sym)) < 0)
          if (ctf_errno (fp) != ECTF_DUPLICATE)
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
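The shape of that change can be sketched with a toy model. This is an illustration under stated assumptions, not libctf's implementation: `struct toy_wdict` and `toy_serialize` are hypothetical names, and the byte arrays merely stand in for the static and dynamic portions of a types table. The point shown is that the serializer takes a const dict and hands back a new buffer instead of rewriting the dict in place.

```c
#include <stdlib.h>
#include <string.h>

/* Toy model only: hypothetical names, not libctf's structures.  */
struct toy_wdict
{
  const unsigned char *static_types;  /* Read in when the dict was opened.  */
  size_t static_len;
  const unsigned char *dynamic_types; /* Added since.  */
  size_t dynamic_len;
};

/* Return a freshly allocated buffer holding the static types followed by
   the dynamic ones.  The dict is const: nothing in it is modified.
   Returns NULL on allocation failure; the caller frees the result.  */
static unsigned char *
toy_serialize (const struct toy_wdict *d, size_t *size)
{
  size_t len = d->static_len + d->dynamic_len;
  unsigned char *buf = malloc (len ? len : 1);

  if (buf == NULL)
    return NULL;
  if (d->static_len)
    memcpy (buf, d->static_types, d->static_len);
  if (d->dynamic_len)
    memcpy (buf + d->static_len, d->dynamic_types, d->dynamic_len);
  *size = len;
  return buf;
}
```

Because the input is never mutated, a caller in this sketch can serialize the same dict repeatedly, or keep adding to it afterwards, without the serializer needing to know about any other field.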
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
            return NULL; /* errno is set for us. */
2023-12-20 00:58:19 +08:00
      if (ctf_errno (fp) != ECTF_NEXT_END)
2024-03-26 21:04:20 +08:00
        return NULL; /* errno is set for us. */
2023-12-20 00:58:19 +08:00
    } while (sym_functions++ < 1);

  /* Figure out how big the symtypetabs are now. */
libctf: split up ctf_serialize
ctf_serialize and its various pieces may be split out into a separate
file now, but ctf_serialize is still far too long and disordered, mixing
header initialization, sizing of multiple CTF sections, sorting and
emission of multiple CTF sections, strtab construction and ctf_dict_t
copying into a single ugly organically-grown mess.
Fix the worst of this by migrating all section sizing and emission into
separate functions, two per section (or class of section in the case of
the symtypetabs). Only the variable section is now sized and emitted
directly in ctf_serialize (because it only takes about three lines to do
so).
The section sizes themselves are still maintained by ctf_serialize so
that it can work out the header offsets, but ctf_symtypetab_sect_sizes
and ctf_emit_symtypetab_sects share a lot of extra state: migrate that
into a shared structure, emit_symtypetab_state_t.
(Test results unchanged.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-serialize.c: General reshuffling, and...
(emit_symtypetab_state_t): New, migrated from
local variables in ctf_serialize.
(ctf_serialize): Split out most section sizing and
emission.
(ctf_symtypetab_sect_sizes): New (split out).
(ctf_emit_symtypetab_sects): Likewise.
(ctf_type_sect_size): Likewise.
(ctf_emit_type_sect): Likewise.
2021-03-18 20:37:52 +08:00
  if (ctf_symtypetab_sect_sizes (fp, &symstate, &hdr, &objt_size, &func_size,
                                 &objtidx_size, &funcidx_size) < 0)
2024-03-26 21:04:20 +08:00
    return NULL; /* errno is set for us. */
2021-03-18 20:37:52 +08:00
libctf: support addition of types to dicts read via ctf_open()
libctf has long declared deserialized dictionaries (out of files or ELF
sections or memory buffers or whatever) to be read-only: back in the
furthest prehistory this was not the case, in that you could add a
few sorts of type to such dicts, but attempting to do so often caused
horrible memory corruption, so I banned the lot.
But it turns out real consumers want it (notably DTrace, which
synthesises pointers to types that don't have them and adds them to the
ctf_open()ed dicts if it needs them). Let's bring it back again, but
without the memory corruption and without the massive code duplication
required in days of yore to distinguish between static and dynamic
types: the representation of both types has been identical for a few
years, with the only difference being that types as a whole are stored in
a big buffer for types read in via ctf_open and per-type hashtables for
newly-added types.
So we discard the internally-visible concept of "readonly dictionaries"
in favour of declaring the *range of types* that were already present
when the dict was read in to be read-only: you can't modify them (say,
by adding members to them if they're structs, or calling ctf_set_array
on them), but you can add more types and point to them. (The API
remains the same, with calls sometimes returning ECTF_RDONLY, but now
they do so less often.)
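The "read-only range" rule above can be pictured with a small model.
This is a sketch under stated assumptions, not the libctf
implementation: toy_dict_t, toy_open, toy_add_type and toy_add_member
are hypothetical, TOY_ERR_RDONLY merely stands in for ECTF_RDONLY, and
the model assumes 1-based type IDs in which every ID up to the static
count is read-only.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_TYPES 16
#define TOY_ERR_RDONLY (-1)	/* Stand-in for ECTF_RDONLY.  */
#define TOY_ERR_FULL (-2)
#define TOY_ERR_BADID (-3)

/* Hypothetical dict model: type IDs are 1-based.  */
typedef struct toy_dict
{
  size_t stypes;			/* Types present when opened.  */
  size_t ntypes;			/* Static plus dynamic types.  */
  int nmembers[TOY_MAX_TYPES + 1];	/* Member count per type ID.  */
} toy_dict_t;

/* Simulate opening a dict that already contains NSTATIC types.  */
static void
toy_open (toy_dict_t *d, size_t nstatic)
{
  size_t i;

  assert (nstatic <= TOY_MAX_TYPES);
  d->stypes = d->ntypes = nstatic;
  for (i = 0; i <= TOY_MAX_TYPES; i++)
    d->nmembers[i] = 0;
}

/* Adding a new type is always allowed: returns its ID, or < 0.  */
static long
toy_add_type (toy_dict_t *d)
{
  if (d->ntypes >= TOY_MAX_TYPES)
    return TOY_ERR_FULL;
  return (long) ++d->ntypes;
}

/* Modifying a type is refused if it lies in the static range.  */
static int
toy_add_member (toy_dict_t *d, size_t id)
{
  if (id == 0 || id > d->ntypes)
    return TOY_ERR_BADID;
  if (id <= d->stypes)
    return TOY_ERR_RDONLY;
  d->nmembers[id]++;
  return 0;
}
```

In this model the read-only check is a single comparison against the
static count, so no per-dict "writable" flag is needed.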
This is a fairly invasive change, mostly because code written since the
ban was introduced didn't take the possibility of a static/dynamic split
into account. Some of these irregularities were hard to define as
anything but bugs.
Notably:
- The symbol handling was assuming that symbols only needed to be
looked for in dynamic hashtabs or static linker-laid-out indexed/
nonindexed layouts, but now we want to check both in case people
added more symbols to a dict they opened.
- The code that handles type additions wasn't checking to see if types
with the same name existed *at all* (so you could do
ctf_add_typedef (fp, "foo", bar) repeatedly without error). This
seems reasonable for types you just added, but we probably *do* want
to ban addition of types with names that override names we already
used in the ctf_open()ed portion, since that would probably corrupt
existing type relationships. (Doing things this way also avoids
causing new errors for any existing code that was doing this sort of
thing.)
- ctf_lookup_variable entirely failed to work for variables just added
by ctf_add_variable: you had to write the dict out and read it back
in again before they appeared.
- The symbol handling remembered what symbols you looked up but didn't
remember their types, so you could look up an object symbol and then
find it popping up when you asked for function symbols, which seems
less than ideal. Since we had to rejig things enough to be able to
distinguish function and object symbols internally anyway (in order
to give suitable errors if you try to add a symbol with a name that
already existed in the ctf_open()ed dict), this bug suddenly became
more visible and was easily fixed.
We do not (yet) support writing out dicts that have been previously read
in via ctf_open() or other deserializer (you can look things up in them,
but not write them out a second time). This never worked, so there is
no incompatibility; if it is needed at a later date, the serializer is a
little bit closer to having it work now (the only table we don't deal
with is the types table, and that's because the upcoming CTFv4 changes
are likely to make major changes to the way that table is represented
internally, so adding more code that depends on its current form seems
like a bad idea).
There is a new testcase that tests much of this, in particular that
modification of existing types is still banned and that you can add new
ones and chase them without error.
libctf/
* ctf-impl.h (struct ctf_dict.ctf_symhash): Split into...
(ctf_dict.ctf_symhash_func): ... this and...
(ctf_dict.ctf_symhash_objt): ... this.
(ctf_dict.ctf_stypes): New, counts static types.
(LCTF_INDEX_TO_TYPEPTR): Use it instead of LCTF_RDWR.
(LCTF_RDWR): Deleted.
(LCTF_DIRTY): Renumbered.
(LCTF_LINKING): Likewise.
(ctf_lookup_variable_here): New.
(ctf_lookup_by_sym_or_name): Likewise.
(ctf_symbol_next_static): Likewise.
(ctf_add_variable_forced): Likewise.
(ctf_add_funcobjt_sym_forced): Likewise.
(ctf_simple_open_internal): Adjust.
(ctf_bufopen_internal): Likewise.
* ctf-create.c (ctf_grow_ptrtab): Adjust a lot to start with.
(ctf_create): Migrate a bunch of initializations into bufopen.
Force recreation of name tables. Do not forcibly override the
model, let ctf_bufopen do it.
(ctf_static_type): New.
(ctf_update): Drop LCTF_RDWR check.
(ctf_dynamic_type): Likewise.
(ctf_add_function): Likewise.
(ctf_add_type_internal): Likewise.
(ctf_rollback): Check ctf_stypes, not LCTF_RDWR.
(ctf_set_array): Likewise.
(ctf_add_struct_sized): Likewise.
(ctf_add_union_sized): Likewise.
(ctf_add_enum): Likewise.
(ctf_add_enumerator): Likewise (only on the target dict).
(ctf_add_member_offset): Likewise.
(ctf_add_generic): Drop LCTF_RDWR check. Ban addition of types
with colliding names.
(ctf_add_forward): Note safety under the new rules.
(ctf_add_variable): Split all but the existence check into...
(ctf_add_variable_forced): ... this new function.
(ctf_add_funcobjt_sym): Likewise...
(ctf_add_funcobjt_sym_forced): ... for this new function.
* ctf-link.c (ctf_link_add_linker_symbol): Ban calling on dicts
with any stypes.
(ctf_link_add_strtab): Likewise.
(ctf_link_shuffle_syms): Likewise.
(ctf_link_intern_extern_string): Note pre-existing prohibition.
* ctf-lookup.c (ctf_lookup_by_id): Drop LCTF_RDWR check.
(ctf_lookup_variable): Split out looking in a dict but not
its parent into...
(ctf_lookup_variable_here): ... this new function.
(ctf_lookup_symbol_idx): Track whether looking up a function or
object: cache them separately.
(ctf_symbol_next): Split out looking in non-dynamic symtypetab
entries to...
(ctf_symbol_next_static): ... this new function. Don't get confused
by the simultaneous presence of static and dynamic symtypetab entries.
(ctf_try_lookup_indexed): Don't waste time looking up symbols by
index before there can be any idea how symbols are numbered.
(ctf_lookup_by_sym_or_name): Distinguish between function and
data object lookups. Drop LCTF_RDWR.
(ctf_lookup_by_symbol): Adjust.
(ctf_lookup_by_symbol_name): Likewise.
* ctf-open.c (init_types): Rename to...
(init_static_types): ... this. Drop LCTF_RDWR. Populate ctf_stypes.
(ctf_simple_open): Drop writable arg.
(ctf_simple_open_internal): Likewise.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Populate fields only used for writable dicts.
Drop LCTF_RDWR.
(ctf_dict_close): Cater for symhash cache split.
* ctf-serialize.c (ctf_serialize): Use ctf_stypes, not LCTF_RDWR.
* ctf-types.c (ctf_variable_next): Drop LCTF_RDWR.
* testsuite/libctf-lookup/add-to-opened*: New test.
2023-12-20 00:58:19 +08:00
  /* Propagate all vars into the dynamic state, so we can put them back later.
     Variables already in the dynamic state, likely due to repeated
     serialization, are left unchanged.  */

  for (i = 0; i < fp->ctf_nvars; i++)
    {
      const char *name = ctf_strptr (fp, fp->ctf_vars[i].ctv_name);

      if (name != NULL && !ctf_dvd_lookup (fp, name))
	if (ctf_add_variable_forced (fp, name, fp->ctf_vars[i].ctv_type) < 0)
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
	  return NULL;				/* errno is set for us.  */
    }

  for (nvars = 0, dvd = ctf_list_next (&fp->ctf_dvdefs);
       dvd != NULL; dvd = ctf_list_next (dvd), nvars++);

  type_size = ctf_type_sect_size (fp);

  /* Compute the size of the CTF buffer we need, sans only the string table,
     then allocate a new buffer and memcpy the finished header to the start of
     the buffer.  (We will adjust this later with strtab length info.)  */

  hdr.cth_lbloff = hdr.cth_objtoff = 0;
  hdr.cth_funcoff = hdr.cth_objtoff + objt_size;
  hdr.cth_objtidxoff = hdr.cth_funcoff + func_size;
  hdr.cth_funcidxoff = hdr.cth_objtidxoff + objtidx_size;
  hdr.cth_varoff = hdr.cth_funcidxoff + funcidx_size;
  hdr.cth_typeoff = hdr.cth_varoff + (nvars * sizeof (ctf_varent_t));
  hdr.cth_stroff = hdr.cth_typeoff + type_size;
  hdr.cth_strlen = 0;

  buf_size = sizeof (ctf_header_t) + hdr.cth_stroff + hdr.cth_strlen;

  if ((buf = malloc (buf_size)) == NULL)
    {
      ctf_set_errno (fp, EAGAIN);
      return NULL;
    }

  memcpy (buf, &hdr, sizeof (ctf_header_t));
  t = (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_objtoff;

  hdrp = (ctf_header_t *) buf;
  if ((fp->ctf_flags & LCTF_CHILD) && (fp->ctf_parname != NULL))
    ctf_str_add_ref (fp, fp->ctf_parname, &hdrp->cth_parname);
  if (fp->ctf_cuname != NULL)
    ctf_str_add_ref (fp, fp->ctf_cuname, &hdrp->cth_cuname);

  if (ctf_emit_symtypetab_sects (fp, &symstate, &t, objt_size, func_size,
				 objtidx_size, funcidx_size) < 0)
    goto err;

  assert (t == (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_varoff);

  /* Work over the variable list, translating everything into ctf_varent_t's and
     prepping the string table.  */

  dvarents = (ctf_varent_t *) t;
  for (i = 0, dvd = ctf_list_next (&fp->ctf_dvdefs); dvd != NULL;
       dvd = ctf_list_next (dvd), i++)
    {
      ctf_varent_t *var = &dvarents[i];

      ctf_str_add_ref (fp, dvd->dvd_name, &var->ctv_name);
      var->ctv_type = (uint32_t) dvd->dvd_type;
    }
  assert (i == nvars);

  t += sizeof (ctf_varent_t) * nvars;

  assert (t == (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_typeoff);

  /* Copy in existing static types, then emit new dynamic types.  */

  memcpy (t, fp->ctf_buf + fp->ctf_header->cth_typeoff,
	  fp->ctf_header->cth_stroff - fp->ctf_header->cth_typeoff);
  t += fp->ctf_header->cth_stroff - fp->ctf_header->cth_typeoff;
libctf: split up ctf_serialize
ctf_serialize and its various pieces may be split out into a separate
file now, but ctf_serialize is still far too long and disordered, mixing
header initialization, sizing of multiple CTF sections, sorting and
emission of multiple CTF sections, strtab construction and ctf_dict_t
copying into a single ugly organically-grown mess.
Fix the worst of this by migrating all section sizing and emission into
separate functions, two per section (or class of section in the case of
the symtypetabs). Only the variable section is now sized and emitted
directly in ctf_serialize (because it only takes about three lines to do
so).
The section sizes themselves are still maintained by ctf_serialize so
that it can work out the header offsets, but ctf_symtypetab_sect_sizes
and ctf_emit_symtypetab_sects share a lot of extra state: migrate that
into a shared structure, emit_symtypetab_state_t.
(Test results unchanged.)
libctf/ChangeLog
2021-03-18 Nick Alcock <nick.alcock@oracle.com>
* ctf-serialize.c: General reshuffling, and...
(emit_symtypetab_state_t): New, migrated from
local variables in ctf_serialize.
(ctf_serialize): Split out most section sizing and
emission.
(ctf_symtypetab_sect_sizes): New (split out).
(ctf_emit_symtypetab_sects): Likewise.
(ctf_type_sect_size): Likewise.
(ctf_emit_type_sect): Likewise.
2021-03-18 20:37:52 +08:00
  ctf_emit_type_sect (fp, &t);

2021-03-18 20:37:52 +08:00
  assert (t == (unsigned char *) buf + sizeof (ctf_header_t) + hdr.cth_stroff);

  /* Construct the final string table and fill out all the string refs with the
libctf: rethink strtab writeout
This commit finally adjusts strtab writeout so that repeated writeouts, or
writeouts of a dict that was read in earlier, only sort the portion of the
strtab that was newly added.
There are three intertwined changes here:
- pull the contents of strtabs from newly ctf_bufopened dicts into the
atoms table, so that future additions will reuse the existing offset etc
rather than adding new identical strings
- allow the internal ctf_bufopen done by serialization to contribute its
existing atoms table, so that existing atoms can be used for the
remainder of the open process (like name table construction): this atoms
table currently gets thrown away in the mass reassignment done later in
ctf_serialize in any case, but it needs to be there during the open.
- rewrite ctf_str_write_strtab so that a) it uses iterators rather than
ctf_*_iter, reducing pointless structures which serve no other purpose
than to implement ordinary variable scope, but more clunkily, and b)
retains the existing strtab on the front of the new one, with its sort
retained, rather than resorting, so all existing already-written strtab
offsets remain valid across the call.
This latter change finally permits repeated serializations, and
reserializations of ctf_open()ed dicts, to work, but for now we keep the
code that prevents that because serialization is about to change again in a
way that will make it more obvious that doing such things is safe, and we
can take it out then.
(There are also some smaller changes like moving the purge of the refs table
into ctf_str_write_strtab(), since that's where the changes happen that
invalidate it, rather than doing it in ctf_serialize(). We also prohibit
something that has never worked, opening a dict and then reporting symbols
to it via ctf_link_add_strtab() et al: you must do that to newly-created
dicts which have had stuff ctf_link()ed into them. This is very unlikely
ever to be a problem in practice: linkers just don't do that sort of thing.)
libctf/
* ctf-create.c (ctf_create): Add (temporary) atoms arg.
* ctf-impl.h (struct ctf_dict.ctf_dynstrtab): New.
(ctf_str_create_atoms): Adjust.
(ctf_str_write_strtab): Likewise.
(ctf_simple_open_internal): Likewise.
* ctf-open.c (ctf_simple_open_internal): Add atoms arg.
(ctf_bufopen): Likewise.
(ctf_bufopen_internal): Initialize just enough of an
atoms table: pre-init from the atoms arg if supplied.
(ctf_simple_open): Adjust.
* ctf-serialize.c (ctf_serialize): Constify the strtab.
Move ref list purging into ctf_str_write_strtab.
Initialize the new dict with the old dict's atoms table.
Accept the new strtab from ctf_str_write_strtab.
Adjust for addition of ctf_dynstrtab.
* ctf-string.c (ctf_strraw_explicit): Improve comments.
(ctf_str_create_atoms): Prepopulate from an existing atoms table,
or alternatively pull in all strings from the strtab and turn
them into atoms.
(ctf_str_free_atoms): Free the dynstrtab and its strtab.
(struct ctf_strtab_write_state): Remove.
(ctf_str_count_strtab): Fold this...
(ctf_str_populate_sorttab): ... and this...
(ctf_str_write_strtab): ... into this. Prepend existing strings
to the strtab rather than resorting them (and wrecking their
offsets). Keep the dynstrtab updated. Update refs for all
atoms with refs, whether or not they are strings newly added
to the strtab.
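The "prepend existing strings, sort only the new ones" idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in ctf_str_write_strtab: append_sorted and cmp_str are hypothetical names, and the sketch assumes the old strtab is a non-empty run of NUL-terminated strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libctf API): qsort comparator over an array
   of C string pointers.  */
static int
cmp_str (const void *a, const void *b)
{
  return strcmp (*(const char *const *) a, *(const char *const *) b);
}

/* Hypothetical sketch: return a malloc()ed strtab made of OLD copied
   verbatim (so every offset into it stays valid), followed by the
   N_NEW strings in NEW_STRS in sorted order.  NEW_STRS is sorted in
   place.  Stores the total length in *OUT_LEN; returns NULL on OOM.  */
static char *
append_sorted (const char *old, size_t old_len, const char **new_strs,
               size_t n_new, size_t *out_len)
{
  size_t i, len = old_len;
  char *buf, *p;

  /* Assumption: an existing strtab is never empty.  */
  assert (old_len > 0);

  /* Only the newly added strings are sorted.  */
  qsort (new_strs, n_new, sizeof (const char *), cmp_str);

  for (i = 0; i < n_new; i++)
    len += strlen (new_strs[i]) + 1;

  if ((buf = malloc (len)) == NULL)
    return NULL;

  /* The old strtab goes on the front, untouched.  */
  memcpy (buf, old, old_len);
  p = buf + old_len;

  for (i = 0; i < n_new; i++)
    {
      size_t l = strlen (new_strs[i]) + 1;
      memcpy (p, new_strs[i], l);
      p += l;
    }

  *out_len = len;
  return buf;
}
```

Because the old bytes are copied first and never reordered, any offset handed out before the call still names the same string afterwards; only the tail needs fresh offsets.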
2024-03-26 03:07:43 +08:00
     final offsets. */

2021-03-18 20:37:52 +08:00
  strtab = ctf_str_write_strtab (fp);

2024-03-26 03:07:43 +08:00
  if (strtab == NULL)
2021-03-18 20:37:52 +08:00
    goto oom;

  /* Now the string table is constructed, we can sort the buffer of
     ctf_varent_t's. */
2024-03-26 03:07:43 +08:00
  ctf_sort_var_arg_cb_t sort_var_arg = { fp, (ctf_strs_t *) strtab };
2021-03-18 20:37:52 +08:00
  ctf_qsort_r (dvarents, nvars, sizeof (ctf_varent_t), ctf_sort_var,
               &sort_var_arg);

2024-03-26 03:07:43 +08:00
  if ((newbuf = realloc (buf, buf_size + strtab->cts_len)) == NULL)
libctf: replace 'pending refs' abstraction
A few years ago we introduced a 'pending refs' abstraction to fix one
problem: serializing a dict, then changing it would tend to corrupt the dict
because the strtab sort we do on strtab writeout (to improve compression
efficiency) would modify the offset of any strings that sorted
lexicographically earlier in the strtab: so we added a new restriction that
all strings are added only at serialization time, and maintained a set of
'pending' refs that were added earlier, whose offsets we could update (like
other refs) at writeout time.
This was in hindsight seriously problematic for maintenance (because
serialization has to traverse all strings in all datatypes in the entire
dict), and has become impossible to sustain now that we can read in existing
dicts, modify them, and reserialize them again. We really don't want to
have to dig through the entire dict we just read in just in order to dig out
all its strtab offsets, then *change* it, just for the sake of a sort that
adds a frankly trivial amount of compression efficiency.
Sorting *is* still worthwhile -- but it sacrifices very little to only sort
newly-added portions of the strtab, reusing older portions as necessary.
As a first stage in this, discard the whole "pending refs" abstraction and
replace it with "movable" refs, which are exactly like all other refs
(addresses containing the strtab offset of some string, which are updated
with the final strtab offset on serialization) except that we track them in
a reverse dict so that we can move the refs around (which we do whenever we
realloc() a buffer containing a bunch of structure members or something when
we add members to the structure).
libctf/
* ctf-create.c (ctf_add_enumerator): Call ctf_str_move_refs; add
a movable ref.
(ctf_add_member_offset): Likewise.
* ctf-util.c (ctf_realloc): Delete.
* ctf-serialize.c (ctf_serialize): No longer use it. Adjust to
new fields.
* ctf-string.c (ctf_str_purge_atom_refs): Purge movable refs.
(ctf_str_free_atom): Free freeable atoms' strings.
(ctf_str_create_atoms): Create the movable refs dynhash if needed.
(ctf_str_free_atoms): Destroy it.
(CTF_STR_MOVABLE): Switch (back) from ints to flags (see previous
reversion). Add new flag.
(aref_create): New, populate movable refs if need be.
(ctf_str_add_ref_internal): Switch back to flags, update refs
directly for nonprovisional strings (with already-known fixed offsets);
create refs via aref_create. Allocate strings only if not within an
mmapped strtab.
(ctf_str_add_movable_ref): New.
(ctf_str_add): Adjust to CTF_STR_* reintroduction.
(ctf_str_add_external): Likewise.
(ctf_str_move_refs): New, move refs via ctf_str_movable_refs
backpointer.
(ctf_str_purge_refs): Drop ctf_str_num_refs.
(ctf_str_update_refs): Fix indentation.
* ctf-impl.h (struct ctf_str_atom_movable): New.
(struct ctf_dict.ctf_str_num_refs): Drop.
(struct ctf_dict.ctf_str_movable_refs): New.
(ctf_str_add_movable_ref): Declare.
(ctf_str_move_refs): Likewise.
(ctf_realloc): Drop.
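To illustrate what "moving" a ref means, here is a rough sketch under simplifying assumptions. ref_list_t, move_refs and update_refs are hypothetical names invented for this illustration; the commit message says libctf tracks movable refs in a reverse dict, whereas this sketch simply scans a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: a ref is the address of a uint32_t that will receive
   a strtab offset at writeout time.  */
typedef struct ref_list
{
  uint32_t **refs;
  size_t n;
} ref_list_t;

/* Re-point every ref lying inside [SRC, SRC + LEN) at the same
   relative position inside DEST.  Call this after copying the
   contents, while SRC is still a valid allocation.  */
static void
move_refs (ref_list_t *rl, void *src, size_t len, void *dest)
{
  uintptr_t lo = (uintptr_t) src, hi = lo + len;
  size_t i;

  for (i = 0; i < rl->n; i++)
    {
      uintptr_t r = (uintptr_t) rl->refs[i];

      if (r >= lo && r < hi)
        rl->refs[i] = (uint32_t *) ((char *) dest + (r - lo));
    }
}

/* Write the final strtab OFFSET through every ref, as would happen
   at serialization time.  */
static void
update_refs (ref_list_t *rl, uint32_t offset)
{
  size_t i;

  for (i = 0; i < rl->n; i++)
    *rl->refs[i] = offset;
}
```

The point of the sketch is only the invariant: once a buffer of members has been reallocated, every ref that pointed into the old buffer must be rewritten before the final offsets are stored through it.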
2024-03-26 00:39:02 +08:00
    goto oom;

2021-03-18 20:37:52 +08:00
  buf = newbuf;
2024-03-26 03:07:43 +08:00
  memcpy (buf + buf_size, strtab->cts_strs, strtab->cts_len);
2021-03-18 20:37:52 +08:00
  hdrp = (ctf_header_t *) buf;
2024-03-26 03:07:43 +08:00
  hdrp->cth_strlen = strtab->cts_len;
2021-03-18 20:37:52 +08:00
  buf_size += hdrp->cth_strlen;
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
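The shape of "return a serialized representation" can be sketched with the final step of serialization, where the string table is appended to the buffer and the header is patched. This is a simplified stand-in, not ctf_serialize itself: mini_header_t and append_strtab are hypothetical, and the real ctf_header_t has many more fields.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the header at the start of the buffer:
   only the one field this sketch needs.  */
typedef struct mini_header
{
  unsigned int str_len;
} mini_header_t;

/* Append STR_LEN bytes of string table to BUF, which holds *BUF_SIZE
   bytes and starts with a mini_header_t.  Returns the possibly-moved
   buffer, with *BUF_SIZE and the header updated; on allocation
   failure frees BUF and returns NULL, so the caller owns nothing.  */
static unsigned char *
append_strtab (unsigned char *buf, size_t *buf_size, const char *strs,
               size_t str_len)
{
  unsigned char *newbuf;

  assert (*buf_size >= sizeof (mini_header_t));

  if ((newbuf = realloc (buf, *buf_size + str_len)) == NULL)
    {
      free (buf);       /* realloc failure leaves BUF allocated.  */
      return NULL;
    }

  memcpy (newbuf + *buf_size, strs, str_len);

  /* Re-derive the header pointer: the buffer may have moved.  */
  ((mini_header_t *) newbuf)->str_len = (unsigned int) str_len;
  *buf_size += str_len;
  return newbuf;
}
```

A caller gets back one buffer and one size and can write them out or compress them without the serializer touching any other state, which is the decoupling the commit message argues for.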
2024-03-26 21:04:20 +08:00
  *bufsiz = buf_size;
2021-03-18 20:37:52 +08:00

2024-03-26 21:04:20 +08:00
  return buf;
2021-03-18 20:37:52 +08:00

oom:
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
  ctf_set_errno (fp, EAGAIN);
2021-03-18 20:37:52 +08:00
err:
  free (buf);
2024-03-26 21:04:20 +08:00
  return NULL;  /* errno is set for us.  */
2021-03-18 20:37:52 +08:00
}
2021-03-18 20:37:52 +08:00
/* File writing.  */
2021-03-18 20:37:52 +08:00
libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option
libctf has always handled endianness differences by detecting
foreign-endian CTF dicts on the input and endian-flipping them: dicts
are always written in native endianness. This makes endian-awareness
very low overhead, but it means that the foreign-endian code paths
almost never get routinely tested, since "make check" usually reads in
dicts ld has just written out: only a few corrupted-CTF tests are
actually in fixed endianness, and even they only test the foreign-
endian code paths when you run make check on a big-endian machine.
(And the fix is surely not to add more .s-based tests like that, because
they are a nightmare to maintain compared to the C-code-based ones.)
To improve on this, add a new environment variable,
LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally
endian-flip at ctf_write time, so the output is always in the wrong
endianness. This then tests the foreign-endian read paths properly
at open time.
Make this easier by restructuring the writeout code in ctf-serialize.c,
which duplicates the maybe-gzip-and-write-out code three times (once
for ctf_write_mem, with thresholding, and once each for
ctf_compress_write and ctf_write just so those can avoid thresholding
and/or compression). Instead, have the latter two call the former
with thresholds of 0 or (size_t) -1, respectively.
The endian-flipping code itself gains a bit of complexity, because
one single endian-flipper (flip_types) was assuming the input to be
in foreign-endian form and assuming it could pull things out of the
input once they had been flipped and make sense of them. At the
cost of a few lines of duplicated initializations, teach it to
read before flipping if we're flipping to foreign-endianness instead
of away from it.
libctf/
* ctf-impl.h (ctf_flip_header): No longer static.
(ctf_flip): Likewise.
* ctf-open.c (flip_header): Rename to...
(ctf_flip_header): ... this, now it is not private to one file.
(flip_ctf): Rename...
(ctf_flip): ... this too. Add FOREIGN_ENDIAN arg.
(flip_types): Likewise. Use it.
(ctf_bufopen_internal): Adjust calls.
* ctf-serialize.c (ctf_write_mem): Add flip_endian path via
a newly-allocated bounce buffer.
(ctf_compress_write): Move below ctf_write_mem and reimplement
in terms of it.
(ctf_write): Likewise.
(ctf_gzwrite): Note that this obscure writeout function does not
support endian-flipping.
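The threshold convention described in the commit message above can be sketched as follows. This is a minimal illustration rather than libctf code: every `sketch_` name is hypothetical, and the `size >= threshold` comparison is an assumption about how the threshold is applied, not something the message states.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libctf: report whether a serialized
   dict of SIZE bytes would be compressed under THRESHOLD, assuming the
   rule "compress when the size reaches the threshold".  */
static int
sketch_should_compress (size_t size, size_t threshold)
{
  return size >= threshold;
}

/* Threshold a ctf_compress_write-style caller would pass: every size
   is >= 0, so the data is always compressed.  */
static const size_t SKETCH_ALWAYS_COMPRESS = 0;

/* Threshold a ctf_write-style caller would pass: no realistic size
   reaches (size_t) -1, so the data is never compressed.  */
static const size_t SKETCH_NEVER_COMPRESS = (size_t) -1;
```

With this rule, one writeout routine can serve all three entry points: only the threshold argument differs between them.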
2022-03-18 21:20:29 +08:00
/* Write the compressed CTF data stream to the specified gzFile descriptor.  The
   whole stream is compressed, and cannot be read by CTF opening functions in
   this library until it is decompressed.  (The functions below this one leave
   the header uncompressed, and the CTF opening functions work on them without
   manual decompression.)

   No support for (testing-only) endian-flipping.  */
2021-03-18 20:37:52 +08:00
int
ctf_gzwrite (ctf_dict_t *fp, gzFile fd)
{
2024-03-26 21:04:20 +08:00
  unsigned char *buf;
  unsigned char *p;
  size_t bufsiz;
  size_t len, written = 0;
2021-03-18 20:37:52 +08:00
2024-03-26 21:04:20 +08:00
  if ((buf = ctf_serialize (fp, &bufsiz)) == NULL)
    return -1;  /* errno is set for us.  */
2021-03-18 20:37:52 +08:00
2024-03-26 21:04:20 +08:00
  p = buf;
  while (written < bufsiz)
2021-03-18 20:37:52 +08:00
    {
2024-03-26 21:04:20 +08:00
      if ((len = gzwrite (fd, p, bufsiz - written)) <= 0)
        {
          free (buf);
          return (ctf_set_errno (fp, errno));
        }
      written += len;
      p += len;
2021-03-18 20:37:52 +08:00
    }
2024-03-26 21:04:20 +08:00
  free (buf);
2021-03-18 20:37:52 +08:00
  return 0;
}
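The loop in ctf_gzwrite keeps calling the writer until the whole buffer has been consumed, because a single call may accept fewer bytes than requested. The sketch below shows the same pattern without zlib. It is an illustration under stated assumptions: the `sketch_` names, the callback type, and the toy sink are all hypothetical and not part of libctf or zlib.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical writer callback: returns the number of bytes it accepted,
   or 0 on error.  */
typedef size_t (*sketch_write_fn) (void *sink, const unsigned char *p,
                                   size_t n);

/* Write all BUFSIZ bytes of BUF, retrying after short writes.
   Returns 0 on success, -1 if the writer reports an error.  */
static int
sketch_write_all (void *sink, sketch_write_fn write_fn,
                  const unsigned char *buf, size_t bufsiz)
{
  const unsigned char *p = buf;
  size_t written = 0;

  while (written < bufsiz)
    {
      size_t len = write_fn (sink, p, bufsiz - written);

      if (len == 0)
        return -1;
      written += len;
      p += len;
    }
  return 0;
}

/* Toy sink that accepts at most 3 bytes per call, to force short writes,
   and fails once its fixed-size storage would overflow.  */
struct sketch_sink
{
  unsigned char data[64];
  size_t used;
};

static size_t
sketch_sink_write (void *sink, const unsigned char *p, size_t n)
{
  struct sketch_sink *s = sink;
  size_t chunk = n < 3 ? n : 3;

  if (s->used + chunk > sizeof (s->data))
    return 0;
  memcpy (s->data + s->used, p, chunk);
  s->used += chunk;
  return chunk;
}
```

Advancing both the byte count and the pointer after each call is what lets the next call resume exactly where the previous one stopped.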

/* Optionally compress the specified CTF data stream and return it as a new
2022-03-18 21:20:29 +08:00
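A minimal sketch of the threshold trick the message above describes, assuming the compress-or-not decision is a plain "size >= threshold" comparison (as in the ctf_write_mem code further down). The names should_compress, compress_write_would_compress and plain_write_would_compress are hypothetical stand-ins, not libctf functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the decision made inside one shared writer:
   buffers at or above THRESHOLD are compressed, smaller ones are not.  */
int
should_compress (size_t bufsiz, size_t threshold)
{
  return bufsiz >= threshold;
}

/* Wrapper that always compresses: every size is >= 0.  */
int
compress_write_would_compress (size_t bufsiz)
{
  return should_compress (bufsiz, 0);
}

/* Wrapper that never compresses: no realistic buffer size reaches
   (size_t) -1, the largest value a size_t can hold.  */
int
plain_write_would_compress (size_t bufsiz)
{
  return should_compress (bufsiz, (size_t) -1);
}

/* Usage: both wrappers share one code path and differ only in the
   threshold they pass.  */
void
threshold_demo (void)
{
  assert (compress_write_would_compress (16));
  assert (!plain_write_would_compress (16));
}
```

The point of the design is that the two special cases need no code of their own: they are just the extreme values of a parameter the general function already takes.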
|
|
|
dynamically-allocated string. Possibly write it with reversed
|
|
|
|
endianness. */
|
2021-03-18 20:37:52 +08:00
|
|
|
unsigned char *
|
|
|
|
ctf_write_mem (ctf_dict_t *fp, size_t *size, size_t threshold)
|
|
|
|
{
|
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
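The shape of the change described above, a serializer that returns a new buffer and leaves the dict alone, can be sketched roughly as follows. This is an illustration only: struct toy_dict and toy_serialize are hypothetical and far simpler than ctf_dict_t and ctf_serialize.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy dictionary: hypothetical, standing in for a real dict.  */
struct toy_dict
{
  unsigned char types[8];
  size_t ntypes;
};

/* Return a freshly allocated serialized copy of DICT and store its
   length in *SIZE.  DICT is const: serializing does not modify it, so
   it can be changed and serialized again later.  Returns NULL on
   allocation failure.  The caller frees the result.  */
unsigned char *
toy_serialize (const struct toy_dict *dict, size_t *size)
{
  unsigned char *buf;

  assert (dict->ntypes <= sizeof (dict->types));

  if ((buf = malloc (dict->ntypes + 1)) == NULL)
    return NULL;

  buf[0] = (unsigned char) dict->ntypes;	/* One-byte "header".  */
  memcpy (buf + 1, dict->types, dict->ntypes);
  *size = dict->ntypes + 1;
  return buf;
}
```

Because the input is never touched, a caller can serialize, add another entry, and serialize again without any reopen step in between.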
|
|
|
unsigned char *rawbuf;
|
|
|
|
unsigned char *buf = NULL;
|
2021-03-18 20:37:52 +08:00
|
|
|
unsigned char *bp;
|
2024-03-26 21:04:20 +08:00
|
|
|
ctf_header_t *rawhp, *hp;
|
|
|
|
unsigned char *src;
|
|
|
|
size_t rawbufsiz;
|
|
|
|
size_t alloc_len = 0;
|
|
|
|
int uncompressed = 0;
|
libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option
libctf has always handled endianness differences by detecting
foreign-endian CTF dicts on the input and endian-flipping them: dicts
are always written in native endianness. This makes endian-awareness
very low overhead, but it means that the foreign-endian code paths
almost never get routinely tested, since "make check" usually reads in
dicts ld has just written out: only a few corrupted-CTF tests are
actually in fixed endianness, and even they only test the foreign-
endian code paths when you run make check on a big-endian machine.
(And the fix is surely not to add more .s-based tests like that, because
they are a nightmare to maintain compared to the C-code-based ones.)
To improve on this, add a new environment variable,
LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally
endian-flip at ctf_write time, so the output is always in the wrong
endianness. This then tests the foreign-endian read paths properly
at open time.
Make this easier by restructuring the writeout code in ctf-serialize.c,
which duplicates the maybe-gzip-and-write-out code three times (once
for ctf_write_mem, with thresholding, and once each for
ctf_compress_write and ctf_write just so those can avoid thresholding
and/or compression). Instead, have the latter two call the former
with thresholds of 0 or (size_t) -1, respectively.
The endian-flipping code itself gains a bit of complexity, because
one single endian-flipper (flip_types) was assuming the input to be
in foreign-endian form and assuming it could pull things out of the
input once they had been flipped and make sense of them. At the
cost of a few lines of duplicated initializations, teach it to
read before flipping if we're flipping to foreign-endianness instead
of away from it.
libctf/
* ctf-impl.h (ctf_flip_header): No longer static.
(ctf_flip): Likewise.
* ctf-open.c (flip_header): Rename to...
(ctf_flip_header): ... this, now it is not private to one file.
(flip_ctf): Rename...
(ctf_flip): ... this too. Add FOREIGN_ENDIAN arg.
(flip_types): Likewise. Use it.
(ctf_bufopen_internal): Adjust calls.
* ctf-serialize.c (ctf_write_mem): Add flip_endian path via
a newly-allocated bounce buffer.
(ctf_compress_write): Move below ctf_write_mem and reimplement
in terms of it.
(ctf_write): Likewise.
(ctf_gzwrite): Note that this obscure writeout function does not
support endian-flipping.
2022-03-18 21:20:29 +08:00
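The "read before flipping" point in the message above can be illustrated with a small sketch. It assumes a made-up record layout (a 32-bit count followed by that many 32-bit items); swap32 and flip_record are hypothetical helpers, not the real flip_types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value.  */
uint32_t
swap32 (uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
	 | ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Flip one record in place and return its length in 32-bit words.
   The count field is only meaningful while it is in native byte order,
   so when flipping TO foreign endianness it must be read before it is
   flipped, and when flipping FROM foreign endianness it must be read
   after.  */
size_t
flip_record (uint32_t *rec, int to_foreign)
{
  uint32_t count;

  assert (rec != NULL);

  if (to_foreign)
    {
      count = rec[0];			/* Still native: read first.  */
      rec[0] = swap32 (rec[0]);
    }
  else
    {
      rec[0] = swap32 (rec[0]);		/* Now native: safe to read.  */
      count = rec[0];
    }

  for (uint32_t i = 0; i < count; i++)
    rec[1 + i] = swap32 (rec[1 + i]);

  return (size_t) count + 1;
}
```

Flipping a record to foreign form and then back again should restore the original contents, which is the round trip a forced write-time flip followed by a normal open exercises.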
|
|
|
int flip_endian;
|
2021-03-18 20:37:52 +08:00
|
|
|
int rc;
|
|
|
|
|
2022-03-18 21:20:29 +08:00
|
|
|
flip_endian = getenv ("LIBCTF_WRITE_FOREIGN_ENDIAN") != NULL;
|
|
|
|
|
2024-03-26 21:04:20 +08:00
|
|
|
if ((rawbuf = ctf_serialize (fp, &rawbufsiz)) == NULL)
|
2021-03-18 20:37:52 +08:00
|
|
|
return NULL; /* errno is set for us. */
|
|
|
|
|
2024-03-26 21:04:20 +08:00
|
|
|
if (!ctf_assert (fp, rawbufsiz >= sizeof (ctf_header_t)))
|
|
|
|
goto err;
|
|
|
|
|
|
|
|
if (rawbufsiz >= threshold)
|
|
|
|
alloc_len = compressBound (rawbufsiz - sizeof (ctf_header_t))
|
|
|
|
+ sizeof (ctf_header_t);
|
|
|
|
|
2024-07-16 02:55:40 +08:00
|
|
|
/* Trivial operation if the buffer is too small to bother compressing, and
|
|
|
|
we're not doing a forced write-time flip. */
|
2024-03-26 21:04:20 +08:00
|
|
|
|
2024-07-16 02:55:40 +08:00
|
|
|
if (rawbufsiz < threshold)
|
2024-03-26 21:04:20 +08:00
|
|
|
{
|
|
|
|
alloc_len = rawbufsiz;
|
|
|
|
uncompressed = 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!flip_endian && uncompressed)
|
|
|
|
{
|
|
|
|
*size = rawbufsiz;
|
|
|
|
return rawbuf;
|
|
|
|
}
|
|
|
|
|
|
|
|
if ((buf = malloc (alloc_len)) == NULL)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
|
|
|
ctf_set_errno (fp, ENOMEM);
|
|
|
|
ctf_err_warn (fp, 0, 0, _("ctf_write_mem: cannot allocate %lu bytes"),
|
2024-03-26 21:04:20 +08:00
|
|
|
(unsigned long) (alloc_len));
|
|
|
|
goto err;
|
2021-03-18 20:37:52 +08:00
|
|
|
}
|
|
|
|
|
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
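A minimal sketch of the interface shape this commit describes, assuming a toy dict that is nothing like the real ctf_dict_t: serialization returns a freshly allocated buffer (static types first, then dynamic ones) and only reads the dict. The names toy_dict and toy_serialize are hypothetical and are not part of libctf.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy dict: NOT the real ctf_dict_t.  Static types came from
   an opened buffer; dynamic types were added afterwards.  */
struct toy_dict
{
  const unsigned char *static_types;
  size_t static_len;
  const unsigned char *dynamic_types;
  size_t dynamic_len;
};

/* Return a newly allocated serialized image and store its length in *SIZE.
   The dict is only read, never modified.  Returns NULL on allocation
   failure.  The caller frees the result.  */
static unsigned char *
toy_serialize (const struct toy_dict *d, size_t *size)
{
  size_t total = d->static_len + d->dynamic_len;
  unsigned char *buf = malloc (total > 0 ? total : 1);

  if (buf == NULL)
    return NULL;

  /* Static types are copied onto the start of the new types table...  */
  if (d->static_len > 0)
    memcpy (buf, d->static_types, d->static_len);
  /* ... and the dynamic types follow them.  */
  if (d->dynamic_len > 0)
    memcpy (buf + d->static_len, d->dynamic_types, d->dynamic_len);

  *size = total;
  return buf;
}
```

Because the caller gets a buffer back instead of a mutated dict, a writer can serialize, write, and free without touching the dict's own state.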
2024-03-26 21:04:20 +08:00
|
|
|
rawhp = (ctf_header_t *) rawbuf;
|
2021-03-18 20:37:52 +08:00
|
|
|
hp = (ctf_header_t *) buf;
|
2024-03-26 21:04:20 +08:00
|
|
|
memcpy (hp, rawbuf, sizeof (ctf_header_t));
|
|
|
|
bp = buf + sizeof (ctf_header_t);
|
|
|
|
*size = sizeof (ctf_header_t);
|
2021-03-18 20:37:52 +08:00
|
|
|
|
2024-03-26 21:04:20 +08:00
|
|
|
if (!uncompressed)
|
libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option
libctf has always handled endianness differences by detecting
foreign-endian CTF dicts on the input and endian-flipping them: dicts
are always written in native endianness. This makes endian-awareness
very low overhead, but it means that the foreign-endian code paths
almost never get routinely tested, since "make check" usually reads in
dicts ld has just written out: only a few corrupted-CTF tests are
actually in fixed endianness, and even they only test the foreign-
endian code paths when you run make check on a big-endian machine.
(And the fix is surely not to add more .s-based tests like that, because
they are a nightmare to maintain compared to the C-code-based ones.)
To improve on this, add a new environment variable,
LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally
endian-flip at ctf_write time, so the output is always in the wrong
endianness. This then tests the foreign-endian read paths properly
at open time.
Make this easier by restructuring the writeout code in ctf-serialize.c,
which duplicates the maybe-gzip-and-write-out code three times (once
for ctf_write_mem, with thresholding, and once each for
ctf_compress_write and ctf_write just so those can avoid thresholding
and/or compression). Instead, have the latter two call the former
with thresholds of 0 or (size_t) -1, respectively.
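The threshold trick in the paragraph above can be sketched as follows. This is an illustration only: the toy_* names are hypothetical, and the choice of >= rather than > for the comparison is an assumption, not something this message states.

```c
#include <stddef.h>

/* Hypothetical decision helper: compress once the serialized size reaches
   THRESHOLD.  The use of >= here is an assumption.  */
static int
toy_should_compress (size_t size, size_t threshold)
{
  return size >= threshold;
}

/* Always compress: a threshold of 0 is reached by every size.  */
static int
toy_compress_write_compresses (size_t size)
{
  return toy_should_compress (size, 0);
}

/* Never compress: no realistic size reaches (size_t) -1.  */
static int
toy_write_compresses (size_t size)
{
  return toy_should_compress (size, (size_t) -1);
}
```

One shared code path with two extreme thresholds replaces three near-identical copies of the maybe-compress-and-write logic.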
The endian-flipping code itself gains a bit of complexity, because
one single endian-flipper (flip_types) was assuming the input to be
in foreign-endian form and assuming it could pull things out of the
input once they had been flipped and make sense of them. At the
cost of a few lines of duplicated initializations, teach it to
read before flipping if we're flipping to foreign-endianness instead
of away from it.
libctf/
* ctf-impl.h (ctf_flip_header): No longer static.
(ctf_flip): Likewise.
* ctf-open.c (flip_header): Rename to...
(ctf_flip_header): ... this, now it is not private to one file.
(flip_ctf): Rename...
(ctf_flip): ... this too. Add FOREIGN_ENDIAN arg.
(flip_types): Likewise. Use it.
(ctf_bufopen_internal): Adjust calls.
* ctf-serialize.c (ctf_write_mem): Add flip_endian path via
a newly-allocated bounce buffer.
(ctf_compress_write): Move below ctf_write_mem and reimplement
in terms of it.
(ctf_write): Likewise.
(ctf_gzwrite): Note that this obscure writeout function does not
support endian-flipping.
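The "read before flipping" point above is easiest to see on a toy layout. The sketch below assumes a buffer whose first 32-bit word is a count N followed by N words; this layout and the names toy_swap32 and toy_flip are hypothetical and are not the real CTF type section or the real ctf_flip.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-swap one 32-bit word.  */
static uint32_t
toy_swap32 (uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
         | ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Flip a toy buffer in place.  BUF[0] is a count N, followed by N words.
   If TO_FOREIGN is nonzero the input is native, so the count must be read
   BEFORE it is flipped; otherwise the input is foreign, so the count only
   makes sense AFTER it is flipped.  */
static void
toy_flip (uint32_t *buf, int to_foreign)
{
  uint32_t n, i;

  if (to_foreign)
    {
      n = buf[0];                       /* Read while still native.  */
      buf[0] = toy_swap32 (buf[0]);
    }
  else
    {
      buf[0] = toy_swap32 (buf[0]);     /* Make it native first.  */
      n = buf[0];
    }

  for (i = 0; i < n; i++)
    buf[1 + i] = toy_swap32 (buf[1 + i]);
}
```

Flipping to foreign and then back again should round-trip the buffer, which is the property the LIBCTF_WRITE_FOREIGN_ENDIAN test mode relies on at open time.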
2022-03-18 21:20:29 +08:00
|
|
|
hp->cth_flags |= CTF_F_COMPRESS;
|
|
|
|
|
2024-03-26 21:04:20 +08:00
|
|
|
src = rawbuf + sizeof (ctf_header_t);
|
2022-03-18 21:20:29 +08:00
|
|
|
|
|
|
|
if (flip_endian)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
2022-03-18 21:20:29 +08:00
|
|
|
ctf_flip_header (hp);
|
2024-03-26 21:04:20 +08:00
|
|
|
if (ctf_flip (fp, rawhp, src, 1) < 0)
|
|
|
|
goto err; /* errno is set for us. */
|
2022-03-18 21:20:29 +08:00
|
|
|
}
|
|
|
|
|
2024-03-26 21:04:20 +08:00
|
|
|
if (!uncompressed)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
libctf: make ctf_serialize() actually serialize
ctf_serialize() evolved from the old ctf_update(), which mutated the
in-memory CTF dict to make all the dynamic in-memory types into static,
unchanging written-to-the-dict types (by deserializing and reserializing
it): back in the days when you could only do type lookups on static types,
this meant you could see all the types you added recently, at the small,
small cost of making it impossible to change those older types ever again
and inducing an amortized O(n^2) cost if you actually wanted to add
references to types you added at arbitrary times to later types.
It also reset things so that ctf_discard() would throw away only types you
added after the most recent ctf_update() call.
Some time ago this was all changed so that you could look up dynamic types
just as easily as static types: ctf_update() changed so that only its
visible side-effect of affecting ctf_discard() remained: the old
ctf_update() was renamed to ctf_serialize(), made internal to libctf, and
called from the various functions that wrote files out.
... but it was still working by serializing and deserializing the entire
dict, swapping out its guts with the newly-serialized copy in an invasive
and horrible fashion that coupled ctf_serialize() to almost every field in
the ctf_dict_t. This is totally useless, and fixing it is easy: just rip
all that code out and have ctf_serialize return a serialized representation,
and let everything use that directly. This simplifies most of its callers
significantly.
(It also points up another bug: ctf_gzwrite() failed to call ctf_serialize()
at all, so it would only ever work for a dict you just ctf_write_mem()ed
yourself, just for its invisible side-effect of serializing the dict!)
This lets us simplify away a bunch of internal-only open-side functionality
for overriding the syn_ext_strtab and some just-added functionality for
forcing in an existing atoms table, without loss of functionality, and lets
us lift the restriction on reserializing a dict that was ctf_open()ed rather
than being ctf_create()d: it's now perfectly OK to open a dict, modify it
(except for adding members to existing structs, unions, or enums, which
fails with -ECTF_RDONLY), and write it out again, just as one would expect.
libctf/
* ctf-serialize.c (ctf_symtypetab_sect_sizes): Fix typos.
(ctf_type_sect_size): Add static type sizes too.
(ctf_serialize): Return the new dict rather than updating the
existing dict. No longer fail for dicts with static types;
copy them onto the start of the new types table.
(ctf_gzwrite): Actually serialize before gzwriting.
(ctf_write_mem): Improve forced (test-mode) endian-flipping:
flip dicts even if they are too small to be compressed.
Improve confusing variable naming.
* ctf-archive.c (arc_write_one_ctf): Don't bother to call
ctf_serialize: both the functions we call do so.
* ctf-string.c (ctf_str_create_atoms): Drop serializing case
(atoms arg).
* ctf-open.c (ctf_simple_open): Call ctf_bufopen directly.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Delete/rename to ctf_bufopen: no
longer bother with syn_ext_strtab or forced atoms table,
serialization no longer needs them.
* ctf-create.c (ctf_create): Call ctf_bufopen directly.
* ctf-impl.h (ctf_str_create_atoms): Drop atoms arg.
(ctf_simple_open_internal): Delete.
(ctf_bufopen_internal): Likewise.
(ctf_serialize): Adjust.
* testsuite/libctf-lookup/add-to-opened.c: Adjust now that
this is supposed to work.
2024-03-26 21:04:20 +08:00
|
|
|
size_t compress_len = alloc_len - sizeof (ctf_header_t);
|
|
|
|
|
2021-03-18 20:37:52 +08:00
|
|
|
if ((rc = compress (bp, (uLongf *) &compress_len,
|
2024-03-26 21:04:20 +08:00
|
|
|
src, rawbufsiz - sizeof (ctf_header_t))) != Z_OK)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
|
|
|
ctf_set_errno (fp, ECTF_COMPRESS);
|
|
|
|
ctf_err_warn (fp, 0, 0, _("zlib deflate err: %s"), zError (rc));
|
2024-03-26 21:04:20 +08:00
|
|
|
goto err;
|
2021-03-18 20:37:52 +08:00
|
|
|
}
|
|
|
|
*size += compress_len;
|
|
|
|
}
|
2024-03-26 21:04:20 +08:00
|
|
|
else
|
|
|
|
{
|
|
|
|
memcpy (bp, src, rawbufsiz - sizeof (ctf_header_t));
|
|
|
|
*size += rawbufsiz - sizeof (ctf_header_t);
|
|
|
|
}
|
libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option
libctf has always handled endianness differences by detecting
foreign-endian CTF dicts on the input and endian-flipping them: dicts
are always written in native endianness. This makes endian-awareness
very low overhead, but it means that the foreign-endian code paths
almost never get routinely tested, since "make check" usually reads in
dicts ld has just written out: only a few corrupted-CTF tests are
actually in fixed endianness, and even they only test the foreign-
endian code paths when you run make check on a big-endian machine.
(And the fix is surely not to add more .s-based tests like that, because
they are a nightmare to maintain compared to the C-code-based ones.)
To improve on this, add a new environment variable,
LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally
endian-flip at ctf_write time, so the output is always in the wrong
endianness. This then tests the foreign-endian read paths properly
at open time.
Make this easier by restructuring the writeout code in ctf-serialize.c,
which duplicates the maybe-gzip-and-write-out code three times (once
for ctf_write_mem, with thresholding, and once each for
ctf_compress_write and ctf_write just so those can avoid thresholding
and/or compression). Instead, have the latter two call the former
with thresholds of 0 or (size_t) -1, respectively.
The endian-flipping code itself gains a bit of complexity, because
one single endian-flipper (flip_types) was assuming the input to be
in foreign-endian form and assuming it could pull things out of the
input once they had been flipped and make sense of them. At the
cost of a few lines of duplicated initializations, teach it to
read before flipping if we're flipping to foreign-endianness instead
of away from it.
libctf/
* ctf-impl.h (ctf_flip_header): No longer static.
(ctf_flip): Likewise.
* ctf-open.c (flip_header): Rename to...
(ctf_flip_header): ... this, now it is not private to one file.
(flip_ctf): Rename...
(ctf_flip): ... this too. Add FOREIGN_ENDIAN arg.
(flip_types): Likewise. Use it.
(ctf_bufopen_internal): Adjust calls.
* ctf-serialize.c (ctf_write_mem): Add flip_endian path via
a newly-allocated bounce buffer.
(ctf_compress_write): Move below ctf_write_mem and reimplement
in terms of it.
(ctf_write): Likewise.
(ctf_gzwrite): Note that this obscure writeout function does not
support endian-flipping.
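The threshold convention and the forced flip described in the commit message above can be sketched roughly as follows. This is only an illustration, not the libctf implementation: every `sketch_*` name is hypothetical, the `>=` comparison against the threshold is an assumption, and treating any value of LIBCTF_WRITE_FOREIGN_ENDIAN (even an empty one) as "set" is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not libctf API): nonzero if a body of LEN bytes
   should be compressed.  A THRESHOLD of 0 always compresses; a threshold
   of (size_t) -1 effectively never does.  The >= comparison is an
   assumption made for this sketch.  */
static int
sketch_should_compress (size_t len, size_t threshold)
{
  return len >= threshold;
}

/* Hypothetical helper: nonzero if the debugging variable named in the
   commit message is present in the environment.  Treating any value as
   "set" is an assumption.  */
static int
sketch_flip_requested (void)
{
  return getenv ("LIBCTF_WRITE_FOREIGN_ENDIAN") != NULL;
}

/* Reverse the byte order of a 32-bit value: the basic operation an
   endian-flipper applies to each 32-bit field.  */
static uint32_t
sketch_swap32 (uint32_t v)
{
  return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8)
         | ((v & 0x00ff0000u) >> 8) | ((v & 0xff000000u) >> 24);
}

/* Small self-check exercising the helpers above.  */
static void
sketch_self_check (void)
{
  assert (sketch_should_compress (100, 0));
  assert (!sketch_should_compress (100, (size_t) -1));
  assert (sketch_swap32 (0x11223344u) == 0x44332211u);
  assert (sketch_swap32 (sketch_swap32 (0xdeadbeefu)) == 0xdeadbeefu);
}
```

Under these assumptions, one thresholded writer can serve both the always-compress caller (threshold 0) and the never-compress caller (threshold (size_t) -1), which is the deduplication the commit message describes. Swapping twice returns the original value, which is why the same flipper can in principle run in either direction, provided it reads any field it needs before that field has been flipped.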
2022-03-18 21:20:29 +08:00
|
|
|
|
2024-03-26 21:04:20 +08:00
|
|
|
free (rawbuf);
|
2021-03-18 20:37:52 +08:00
|
|
|
return buf;
|
2024-03-26 21:04:20 +08:00
|
|
|
err:
|
|
|
|
free (buf);
|
|
|
|
free (rawbuf);
|
|
|
|
return NULL;
|
2021-03-18 20:37:52 +08:00
|
|
|
}
|
|
|
|
|
2024-07-16 02:55:40 +08:00
|
|
|
/* Write the CTF data stream to the specified file descriptor,
|
|
|
|
possibly compressed. Internal only (for now). */
|
2021-03-18 20:37:52 +08:00
|
|
|
int
|
2024-07-16 02:55:40 +08:00
|
|
|
ctf_write_thresholded (ctf_dict_t *fp, int fd, size_t threshold)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
2022-03-18 21:20:29 +08:00
|
|
|
unsigned char *buf;
|
|
|
|
unsigned char *bp;
|
|
|
|
size_t tmp;
|
|
|
|
ssize_t buf_len;
|
2021-03-18 20:37:52 +08:00
|
|
|
ssize_t len;
|
2022-03-18 21:20:29 +08:00
|
|
|
int err = 0;
|
2021-03-18 20:37:52 +08:00
|
|
|
|
2024-07-16 02:55:40 +08:00
|
|
|
if ((buf = ctf_write_mem (fp, &tmp, threshold)) == NULL)
|
2021-03-18 20:37:52 +08:00
|
|
|
return -1; /* errno is set for us. */
|
|
|
|
|
2022-03-18 21:20:29 +08:00
|
|
|
buf_len = tmp;
|
|
|
|
bp = buf;
|
|
|
|
|
|
|
|
while (buf_len > 0)
|
2021-03-18 20:37:52 +08:00
|
|
|
{
|
libctf: add LIBCTF_WRITE_FOREIGN_ENDIAN debugging option
libctf has always handled endianness differences by detecting
foreign-endian CTF dicts on the input and endian-flipping them: dicts
are always written in native endianness. This makes endian-awareness
very low overhead, but it means that the foreign-endian code paths
almost never get routinely tested, since "make check" usually reads in
dicts ld has just written out: only a few corrupted-CTF tests are
actually in fixed endianness, and even they only test the foreign-
endian code paths when you run make check on a big-endian machine.
(And the fix is surely not to add more .s-based tests like that, because
they are a nightmare to maintain compared to the C-code-based ones.)
To improve on this, add a new environment variable,
LIBCTF_WRITE_FOREIGN_ENDIAN, which causes libctf to unconditionally
endian-flip at ctf_write time, so the output is always in the wrong
endianness. This then tests the foreign-endian read paths properly
at open time.
Make this easier by restructuring the writeout code in ctf-serialize.c,
which duplicates the maybe-gzip-and-write-out code three times (once
for ctf_write_mem, with thresholding, and once each for
ctf_compress_write and ctf_write just so those can avoid thresholding
and/or compression). Instead, have the latter two call the former
with thresholds of 0 or (size_t) -1, respectively.
The endian-flipping code itself gains a bit of complexity, because
one single endian-flipper (flip_types) was assuming the input to be
in foreign-endian form and assuming it could pull things out of the
input once they had been flipped and make sense of them. At the
cost of a few lines of duplicated initializations, teach it to
read before flipping if we're flipping to foreign-endianness instead
of away from it.
libctf/
* ctf-impl.h (ctf_flip_header): No longer static.
(ctf_flip): Likewise.
* ctf-open.c (flip_header): Rename to...
(ctf_flip_header): ... this, now it is not private to one file.
(flip_ctf): Rename...
(ctf_flip): ... this too. Add FOREIGN_ENDIAN arg.
(flip_types): Likewise. Use it.
(ctf_bufopen_internal): Adjust calls.
* ctf-serialize.c (ctf_write_mem): Add flip_endian path via
a newly-allocated bounce buffer.
(ctf_compress_write): Move below ctf_write_mem and reimplement
in terms of it.
(ctf_write): Likewise.
(ctf_gzwrite): Note that this obscure writeout function does not
support endian-flipping.
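
The "read before flipping" point in the message above can be illustrated with a toy record. This is only a sketch under stated assumptions: the record layout (one count word followed by that many 32-bit words) and the names `bswap32` and `flip_record` are hypothetical and are not libctf's real type encoding or API. It shows why the direction matters: the count is only meaningful while it is in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the byte order of one 32-bit word.  */
static uint32_t
bswap32 (uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0xff00) | ((v << 8) & 0xff0000) | (v << 24);
}

/* Hypothetical toy record: rec[0] is a count, followed by that many words.
   If TO_FOREIGN is nonzero the input is native, so the count must be read
   before it is flipped; otherwise the input is foreign, so the count only
   makes sense after it has been flipped.  */
static void
flip_record (uint32_t *rec, int to_foreign)
{
  uint32_t count;
  uint32_t i;

  assert (rec != NULL);

  if (to_foreign)
    {
      count = rec[0];			/* Read first...  */
      rec[0] = bswap32 (rec[0]);	/* ... then flip.  */
    }
  else
    {
      rec[0] = bswap32 (rec[0]);	/* Flip first...  */
      count = rec[0];			/* ... then read.  */
    }

  for (i = 1; i <= count; i++)
    rec[i] = bswap32 (rec[i]);
}
```

Flipping a record to foreign order and back again should leave it unchanged, which makes a convenient round-trip check for code of this shape.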
2022-03-18 21:20:29 +08:00
      if ((len = write (fd, bp, buf_len)) < 0)
2021-03-18 20:37:52 +08:00
        {
2022-03-18 21:20:29 +08:00
          err = ctf_set_errno (fp, errno);
          ctf_err_warn (fp, 0, 0, _("ctf_compress_write: error writing"));
          goto ret;
2021-03-18 20:37:52 +08:00
        }
2022-03-18 21:20:29 +08:00
      buf_len -= len;
      bp += len;
2021-03-18 20:37:52 +08:00
    }

2022-03-18 21:20:29 +08:00
ret:
  free (buf);
  return err;
}

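The write loop that ends here handles short writes: write() may accept fewer bytes than requested, so the remaining length and the buffer pointer are both advanced by the amount actually written. A minimal standalone sketch of the same pattern follows. The helper name `write_all` is hypothetical and not part of libctf, and the EINTR retry is an addition made for illustration that the code above does not perform.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper, not part of libctf: write all LEN bytes of BUF to FD,
   resuming after short writes.  Returns 0 on success, or -1 with errno set.  */
static int
write_all (int fd, const void *buf, size_t len)
{
  const unsigned char *bp = buf;

  assert (buf != NULL || len == 0);

  while (len > 0)
    {
      ssize_t n = write (fd, bp, len);

      if (n < 0)
        {
          if (errno == EINTR)
            continue;		/* Interrupted before any byte was written.  */
          return -1;
        }

      /* A short write is not an error: skip what was written and go on.  */
      len -= (size_t) n;
      bp += n;
    }
  return 0;
}
```

Unlike the function above, this sketch leaves buffer ownership with the caller, so there is no cleanup label to jump to on failure.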
2024-07-16 02:55:40 +08:00
/* Compress the specified CTF data stream and write it to the specified file
   descriptor.  */
int
ctf_compress_write (ctf_dict_t *fp, int fd)
{
  return ctf_write_thresholded (fp, fd, 0);
}

2022-03-18 21:20:29 +08:00
/* Write the uncompressed CTF data stream to the specified file descriptor.  */
int
ctf_write (ctf_dict_t *fp, int fd)
{
2024-07-16 02:55:40 +08:00
  return ctf_write_thresholded (fp, fd, (size_t) -1);
2021-03-18 20:37:52 +08:00
}
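
The two wrappers above differ only in the threshold they pass: 0 for "always compress" and (size_t) -1 for "never compress". The sketch below shows how a single size comparison can give both behaviours. The function name `wants_compression` is hypothetical, and the exact comparison libctf uses internally is an assumption, since it is not shown in this part of the file.

```c
#include <stddef.h>

/* Hypothetical predicate, not libctf API: compress only when the data is at
   least THRESHOLD bytes long.  Assumes the size is compared against the
   threshold with >=; libctf's real test may differ in detail.  */
static int
wants_compression (size_t size, size_t threshold)
{
  return size >= threshold;
}

/* Threshold 0: every size, even 0, is >= 0, so always compress.  */
static int
always_compress (size_t size)
{
  return wants_compression (size, 0);
}

/* Threshold (size_t) -1 is the largest size_t value, so no realistic size
   reaches it: never compress.  */
static int
never_compress (size_t size)
{
  return wants_compression (size, (size_t) -1);
}
```

Using sentinel thresholds keeps one code path for writing and avoids a separate "mode" parameter, at the cost of the edge case where a size equals (size_t) -1, which no real buffer reaches in practice.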