binutils-gdb/gdb/build-id.c
Andrew Burgess 22836ca885 gdb: check for multiple matching build-id files
Within the debug-file-directory GDB looks for the existence of a
.build-id directory.

Within the .build-id directory GDB looks for files with the form:

  .build-id/ff/4b4142d62b399499844924d53e33d4028380db.debug

which contain the debug information for the objfile with the build-id
ff4b4142d62b399499844924d53e33d4028380db.
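
As a rough illustration of this layout, the sketch below builds the
link path from raw build-id bytes: the first byte becomes the
sub-directory name and the remaining bytes become the file name.  The
helper name build_id_link_path is hypothetical and not part of GDB; it
only mirrors the naming scheme described above.

```cpp
#include <cstdio>
#include <string>
#include <vector>

/* Hypothetical helper (not a GDB function): return the path of the
   build-id link below DEBUG_DIR for the given BUILD_ID bytes, with
   SUFFIX (for example ".debug") appended.  */
std::string
build_id_link_path (const std::string &debug_dir,
                    const std::vector<unsigned char> &build_id,
                    const std::string &suffix)
{
  std::string path = debug_dir + "/.build-id/";
  char buf[3];

  for (std::size_t i = 0; i < build_id.size (); ++i)
    {
      /* Two lowercase hex digits per byte.  */
      std::snprintf (buf, sizeof buf, "%02x", (unsigned) build_id[i]);
      path += buf;

      /* The first byte names the sub-directory.  */
      if (i == 0)
        path += '/';
    }

  return path + suffix;
}
```

With a debug directory of /usr/lib/debug and a ".debug" suffix, a
build-id starting with the bytes ff 4b 41 would map to a path starting
/usr/lib/debug/.build-id/ff/4b41.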

There appear to be two strategies for populating the .build-id
directory.  Ubuntu takes the approach of placing the actual debug
information in this directory, so
4b4142d62b399499844924d53e33d4028380db.debug is an actual file
containing the debug information.

Fedora, RHEL, and SUSE take a slightly different approach, placing the
debug information elsewhere, and then creating symlinks in the
.build-id directory back to the original debug information file.  The
actual debug information is arranged in a mirror of the filesystem
within the debug directory.  As an example, if the debug-file-directory
is /usr/lib/debug, then the debug information for /bin/foo can be
found in /usr/lib/debug/bin/foo.debug.

Where this gets interesting is that in some cases a package will
install a single binary with multiple names.  In this case a single
binary will be installed with either hard-links or symlinks providing
the alternative names.

The debug information for these multiple binaries will then be placed
into the /usr/lib/debug/ tree, and again, links are created so a
single file can provide debug information for each of the names that
binary presents as.  An example file system might look like this (the
[link] could be symlinks, but are more likely hard-links):

  /bin/
    foo
    bar -> foo	[ HARD LINK ]
    baz -> foo	[ HARD LINK ]
  /usr/
    lib/
      debug/
        bin/
	  foo.debug
	  bar.debug -> foo.debug	[ HARD LINK ]
	  baz.debug -> foo.debug	[ HARD LINK ]

In the .build-id tree though we have a problem.  Do we have a single
entry that links to one of the .debug files?  This would work; a user
debugging any of the binaries will find the debug information based on
the build-id, and will get the correct information, after all the
.debug files are identical (same file linked together).  But there is
one problem with this approach.

Sometimes, for *reasons*, it's possible that one or more of the linked
binaries might get removed, along with its associated debug
information.  I'm honestly not 100% certain under what circumstances
this can happen, but what I observe is that sometimes a single name for
a binary, and its corresponding .debug entry, can be missing.  If this
happens to be the entry that the .build-id link is pointing at, then
we have a problem.  The user can no longer find the debug information
based on the .build-id link.

The solution that Fedora, RHEL, & SUSE have adopted is to add multiple
entries in the .build-id tree, with each entry pointing to a different
name within the debug/ tree.  A sequence number is added to the
build-id to distinguish the multiple entries.  Thus, we might end up
with a layout like this:

  /bin/
    foo
    bar -> foo	[ HARD LINK ]
    baz -> foo	[ HARD LINK ]
  /usr/
    lib/
      debug/
        bin/
	  foo.debug
	  bar.debug -> foo.debug	[ HARD LINK ]
	  baz.debug -> foo.debug	[ HARD LINK ]
      .build-id/
        ff/
          4b4142d62b399499844924d53e33d4028380db.debug -> ../../debug/bin/foo.debug	[ SYMLINK ]
          4b4142d62b399499844924d53e33d4028380db.1.debug -> ../../debug/bin/bar.debug	[ SYMLINK ]
          4b4142d62b399499844924d53e33d4028380db.2.debug -> ../../debug/bin/baz.debug	[ SYMLINK ]

With current master GDB, debug information will only ever be looked up
via the 4b4142d62b399499844924d53e33d4028380db.debug link.  But if
'foo' and its corresponding 'foo.debug' are ever removed, then master
GDB will fail to find the debug information.

Ubuntu seems to have a much better approach for debug information
handling; they place the debug information directly into the .build-id
tree, so there only ever needs to be a single entry for any one
build-id.  I wonder if/how they handle the case where multiple names
might share a single .debug file: if one of those names is then
uninstalled, how do they know whether the .debug file should be
retained ... but I assume that problem either doesn't exist or has
been solved.

Anyway, for a while Fedora has carried a patch that handles the
build-id sequence number logic.  What's presented here is inspired by
the Fedora patch, but has some changes to fix some issues.

I'm aware that this is a patch that applies to only some (probably a
minority) of distros.  However, the logic is confined to a single
function in build-id.c, and isn't too complex, so I'm hoping that
there won't be too many objections.

For distros that don't have build-id sequence numbers there should be
no impact.  The sequence number approach still leaves the first file
without a sequence number, and this is the first file that GDB (after
this patch) checks for.  The new logic only kicks in if the
non-sequence numbered first file exists, but is a symlink to a
non-existent file; in this case GDB checks for the sequence numbered
files instead.
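
The lookup order just described can be sketched for the local
filesystem case as follows.  This is a simplified stand-alone
illustration, not the code from the patch: the function name
find_usable_link is hypothetical, and the build-id verification that
GDB performs on each candidate is left out.

```cpp
#include <string>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not a GDB function): BASE is the build-id link
   path without its suffix.  Return the first link name whose target
   exists, or an empty string if there is none.  */
std::string
find_usable_link (const std::string &base, const std::string &suffix)
{
  /* Arbitrary upper bound, as in the patch, so the loop always ends.  */
  for (unsigned seqno = 0; seqno < 10; ++seqno)
    {
      std::string link = base;

      /* The first candidate carries no sequence number.  */
      if (seqno > 0)
        link += "." + std::to_string (seqno);
      link += suffix;

      /* lstat does not follow symlinks, so a failure means there is no
         entry with this sequence number: stop searching.  */
      struct stat sb;
      if (lstat (link.c_str (), &sb) != 0)
        break;

      /* access follows symlinks, so success means the target exists.  */
      if (access (link.c_str (), F_OK) == 0)
        return link;

      /* Dangling symlink: fall through to the next sequence number.  */
    }

  return {};
}
```

On a system without sequence numbered links the loop body runs once,
which matches the claim above that such distros see no change in
behaviour.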

Tests are included.

There is a small fix needed for gdb.base/sysroot-debug-lookup.exp.
After this commit GDB treats a target: sysroot where the target
file system is local to GDB the same as if the sysroot had no target:
prefix.  The consequence of this is that GDB now resolves a symlink
back to the real filename in the sysroot-debug-lookup.exp test where
it didn't previously.  As this behaviour is in line with the case where
there is no target: prefix I think this is fine.
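
The prefix handling can be pictured with the small sketch below.  The
function strip_target_prefix and its boolean parameter are hypothetical
stand-ins for the checks GDB performs; the only detail taken from the
text above is the literal "target:" prefix.

```cpp
#include <string>

/* Hypothetical helper (not a GDB function): drop a leading "target:"
   from PATH when the target filesystem is local, so the path can be
   handled like any other local path.  */
std::string
strip_target_prefix (const std::string &path, bool target_fs_is_local)
{
  static const std::string prefix = "target:";

  if (target_fs_is_local
      && path.compare (0, prefix.size (), prefix) == 0)
    return path.substr (prefix.size ());

  /* Remote filesystem, or no prefix: leave PATH unchanged.  */
  return path;
}
```

Once the prefix is gone the path takes the local branch of the lookup,
which is why symlinks now get resolved to the real filename in the
test.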
2024-07-18 13:24:20 +01:00


/* build-id-related functions.

   Copyright (C) 1991-2024 Free Software Foundation, Inc.

   This file is part of GDB.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */

#include "bfd.h"
#include "gdb_bfd.h"
#include "build-id.h"
#include "gdbsupport/gdb_vecs.h"
#include "symfile.h"
#include "objfiles.h"
#include "filenames.h"
#include "gdbcore.h"
#include "cli/cli-style.h"

/* See build-id.h.  */

const struct bfd_build_id *
build_id_bfd_get (bfd *abfd)
{
  /* Dynamic objfiles such as ones created by JIT reader API
     have no underlying bfd structure (that is, objfile->obfd
     is NULL).  */
  if (abfd == nullptr)
    return nullptr;

  if (!bfd_check_format (abfd, bfd_object)
      && !bfd_check_format (abfd, bfd_core))
    return NULL;

  if (abfd->build_id != NULL)
    return abfd->build_id;

  /* No build-id */
  return NULL;
}

/* See build-id.h.  */

int
build_id_verify (bfd *abfd, size_t check_len, const bfd_byte *check)
{
  const struct bfd_build_id *found;
  int retval = 0;

  found = build_id_bfd_get (abfd);

  if (found == NULL)
    warning (_("File \"%ps\" has no build-id, file skipped"),
             styled_string (file_name_style.style (),
                            bfd_get_filename (abfd)));
  else if (!build_id_equal (found, check_len, check))
    warning (_("File \"%ps\" has a different build-id, file skipped"),
             styled_string (file_name_style.style (),
                            bfd_get_filename (abfd)));
  else
    retval = 1;

  return retval;
}

/* Helper for build_id_to_debug_bfd.  ORIGINAL_LINK with SUFFIX appended is
   a path to a potential build-id-based separate debug file, potentially a
   symlink to the real file.  If the file exists and matches BUILD_ID,
   return a BFD reference to it.  */

static gdb_bfd_ref_ptr
build_id_to_debug_bfd_1 (const std::string &original_link,
                         size_t build_id_len, const bfd_byte *build_id,
                         const char *suffix)
{
  tribool supports_target_stat = TRIBOOL_UNKNOWN;

  /* Drop the 'target:' prefix if the target filesystem is local.  */
  std::string_view original_link_view (original_link);
  if (is_target_filename (original_link) && target_filesystem_is_local ())
    original_link_view
      = original_link_view.substr (strlen (TARGET_SYSROOT_PREFIX));

  /* The upper bound of '10' here is completely arbitrary.  The loop should
     terminate via 'break' when either (a) a readable symlink is found, or
     (b) a non-existing entry is found.

     However, for remote targets, we rely on the remote returning sane
     error codes.  If a remote sends back the wrong error code then it
     might trick GDB into thinking that the symlink exists, but points to a
     missing file, in which case GDB will try the next seqno.  We don't
     want a broken remote to cause GDB to spin here forever, hence a fixed
     upper bound.  */

  for (unsigned seqno = 0; seqno < 10; seqno++)
    {
      std::string link (original_link_view);

      if (seqno > 0)
        string_appendf (link, ".%u", seqno);

      link += suffix;

      separate_debug_file_debug_printf ("Trying %s...", link.c_str ());

      gdb::unique_xmalloc_ptr<char> filename_holder;
      const char *filename = nullptr;
      if (is_target_filename (link))
        {
          gdb_assert (link.length () >= strlen (TARGET_SYSROOT_PREFIX));
          const char *link_on_target
            = link.c_str () + strlen (TARGET_SYSROOT_PREFIX);

          fileio_error target_errno;
          if (supports_target_stat != TRIBOOL_FALSE)
            {
              struct stat sb;
              int res = target_fileio_stat (nullptr, link_on_target, &sb,
                                            &target_errno);

              if (res != 0 && target_errno != FILEIO_ENOSYS)
                {
                  separate_debug_file_debug_printf ("path doesn't exist");
                  break;
                }
              else if (res != 0 && target_errno == FILEIO_ENOSYS)
                supports_target_stat = TRIBOOL_FALSE;
              else
                {
                  supports_target_stat = TRIBOOL_TRUE;
                  filename = link.c_str ();
                }
            }

          if (supports_target_stat == TRIBOOL_FALSE)
            {
              gdb_assert (filename == nullptr);

              /* Connecting to a target that doesn't support 'stat'.  Try
                 'readlink' as an alternative.  This isn't ideal, but is
                 maybe better than nothing.  Returns EINVAL if the path
                 isn't a symbolic link, which hints that the path is
                 available -- there are other errors e.g. ENOENT for when
                 the path doesn't exist, but we just assume that anything
                 other than EINVAL indicates the path doesn't exist.  */
              std::optional<std::string> link_target
                = target_fileio_readlink (nullptr, link_on_target,
                                          &target_errno);
              if (link_target.has_value ()
                  || target_errno == FILEIO_EINVAL)
                filename = link.c_str ();
              else
                {
                  separate_debug_file_debug_printf ("path doesn't exist");
                  break;
                }
            }
        }
      else
        {
          struct stat buf;

          /* The `access' call below automatically dereferences LINK, but
             we want to stop incrementing SEQNO once we find a symlink
             that doesn't exist.  */
          if (lstat (link.c_str (), &buf) != 0)
            {
              separate_debug_file_debug_printf ("path doesn't exist");
              break;
            }

          /* Can LINK be accessed, or if LINK is a symlink, can the file
             pointed to be accessed?  Do this as lrealpath() is
             expensive, even for the usually non-existent files.  */
          if (access (link.c_str (), F_OK) == 0)
            {
              filename_holder.reset (lrealpath (link.c_str ()));
              filename = filename_holder.get ();
            }
        }

      if (filename == nullptr)
        {
          separate_debug_file_debug_printf ("unable to compute real path");
          continue;
        }

      /* We expect to be silent on the non-existing files.  */
      gdb_bfd_ref_ptr debug_bfd = gdb_bfd_open (filename, gnutarget);

      if (debug_bfd == NULL)
        {
          separate_debug_file_debug_printf ("unable to open `%s`", filename);
          continue;
        }

      if (!build_id_verify (debug_bfd.get (), build_id_len, build_id))
        {
          separate_debug_file_debug_printf ("build-id does not match");
          continue;
        }

      separate_debug_file_debug_printf ("found a match");
      return debug_bfd;
    }

  separate_debug_file_debug_printf ("no suitable file found");
  return {};
}

/* Common code for finding BFDs of a given build-id.  This function
   works with both debuginfo files (SUFFIX == ".debug") and executable
   files (SUFFIX == "").  */

static gdb_bfd_ref_ptr
build_id_to_bfd_suffix (size_t build_id_len, const bfd_byte *build_id,
                        const char *suffix)
{
  SEPARATE_DEBUG_FILE_SCOPED_DEBUG_ENTER_EXIT;

  /* Keep backward compatibility so that DEBUG_FILE_DIRECTORY being "" will
     cause "/.build-id/..." lookups.  */

  std::vector<gdb::unique_xmalloc_ptr<char>> debugdir_vec
    = dirnames_to_char_ptr_vec (debug_file_directory.c_str ());

  for (const gdb::unique_xmalloc_ptr<char> &debugdir : debugdir_vec)
    {
      const gdb_byte *data = build_id;
      size_t size = build_id_len;

      /* Compute where the file named after the build-id would be.

         If debugdir is "/usr/lib/debug" and the build-id is abcdef, this will
         give "/usr/lib/debug/.build-id/ab/cdef.debug".  */
      std::string link = debugdir.get ();
      link += "/.build-id/";

      if (size > 0)
        {
          size--;
          string_appendf (link, "%02x/", (unsigned) *data++);
        }

      while (size-- > 0)
        string_appendf (link, "%02x", (unsigned) *data++);

      gdb_bfd_ref_ptr debug_bfd
        = build_id_to_debug_bfd_1 (link, build_id_len, build_id, suffix);
      if (debug_bfd != NULL)
        return debug_bfd;

      /* Try to look under the sysroot as well.  If the sysroot is
         "/the/sysroot", it will give
         "/the/sysroot/usr/lib/debug/.build-id/ab/cdef.debug".

         If the sysroot is 'target:' and the target filesystem is local to
         GDB then 'target:/path/to/check' becomes '/path/to/check' which
         we just checked above.  */

      if (!gdb_sysroot.empty ()
          && (gdb_sysroot != TARGET_SYSROOT_PREFIX
              || !target_filesystem_is_local ()))
        {
          link = gdb_sysroot + link;
          debug_bfd = build_id_to_debug_bfd_1 (link, build_id_len, build_id,
                                               suffix);
          if (debug_bfd != NULL)
            return debug_bfd;
        }
    }

  return {};
}

/* See build-id.h.  */

gdb_bfd_ref_ptr
build_id_to_debug_bfd (size_t build_id_len, const bfd_byte *build_id)
{
  return build_id_to_bfd_suffix (build_id_len, build_id, ".debug");
}

/* See build-id.h.  */

gdb_bfd_ref_ptr
build_id_to_exec_bfd (size_t build_id_len, const bfd_byte *build_id)
{
  return build_id_to_bfd_suffix (build_id_len, build_id, "");
}

/* See build-id.h.  */

std::string
find_separate_debug_file_by_buildid (struct objfile *objfile,
                                     deferred_warnings *warnings)
{
  const struct bfd_build_id *build_id;

  build_id = build_id_bfd_get (objfile->obfd.get ());
  if (build_id != NULL)
    {
      SEPARATE_DEBUG_FILE_SCOPED_DEBUG_START_END
        ("looking for separate debug info (build-id) for %s",
         objfile_name (objfile));

      gdb_bfd_ref_ptr abfd (build_id_to_debug_bfd (build_id->size,
                                                   build_id->data));

      /* Prevent looping on a stripped .debug file.  */
      if (abfd != NULL
          && filename_cmp (bfd_get_filename (abfd.get ()),
                           objfile_name (objfile)) == 0)
        {
          separate_debug_file_debug_printf
            ("\"%s\": separate debug info file has no debug info",
             bfd_get_filename (abfd.get ()));
          warnings->warn (_("\"%ps\": separate debug info file has no "
                            "debug info"),
                          styled_string (file_name_style.style (),
                                         bfd_get_filename (abfd.get ())));
        }
      else if (abfd != NULL)
        return std::string (bfd_get_filename (abfd.get ()));
    }

  return std::string ();
}