Powerpc is not reporting the
Catchpoint 1 (returned from syscall execve), ....
as expected. The issue appears to be with the kernel not returning the
expected result. This patch marks the test failure as an xfail.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28623
While working on a Python script, which was interacting with a remote
target, I noticed some weird slowness in GDB. In my program I had a
structure something like this:
struct foo_t
{
int array[5];
};
struct foo_t global_foo;
Then in the Python script I was fetching a complete copy of global
foo, like:
val = gdb.parse_and_eval('global_foo')
val.fetch_lazy()
Then I would work with items in foo_t.array, like:
print(val['array'][1])
I called the fetch_lazy method specifically because I knew I was going
to end up accessing almost all of the contents of val, and so I wanted
GDB to do a single remote protocol call to fetch all the contents in
one go, rather than trying to do lazy fetches for a couple of bytes at
a time.
What I observed was that, after the fetch_lazy call, GDB does,
correctly, fetch the entire contents of global_foo, including all of
the contents of array, however, when I access val.array[1], GDB still
goes and fetches the value of this element from the remote target.
What's going on is that in valarith.c, in value_subscript, for C like
languages, we always end up treating the array value as a pointer, and
then doing value_ptradd, and value_ind, the second of these calls
always returns a lazy value.
My guess is that this approach allows us to handle indexing off the
end of an array, when working with zero element arrays, or when
indexing a raw pointer as an array. And, I agree, that in these
cases, where, even when the original value is non-lazy, we still will
not have the content of the array loaded, we should be using the
value_ind approach.
However, for cases where we do have the array contents loaded, and we
do know the bounds of the array, I think we should be using
value_subscripted_rvalue, which is what we use for non C like
languages.
One problem I did run into, exposed by gdb.base/charset.exp, was that
value_subscripted_rvalue stripped typedefs from the element type of
the array, which means the value returned will not have the same type
as an element of the array, but would be the raw, non-typedefed,
type. In charset.exp we got back an 'int' instead of a
'wchar_t' (which is a typedef of 'int'), and this impacts how we print
the value. Removing typedefs from the resulting value just seems
wrong, so I got rid of that, and I don't see any test regressions.
With this change in place, my original Python script is now doing no
additional memory accesses, and its performance increases about 10x!
This commit updates uses of 'loc' and 'loc_kind' to 'm_loc' and
'm_loc_kind' respectively, in gdb-gdb.py.in, which is required after
this commit:
commit cd3f655cc7
Date: Thu Sep 30 22:38:29 2021 -0400
gdb: add accessors for field (and call site) location
I have also incorporated this change:
https://sourceware.org/pipermail/gdb-patches/2021-September/182171.html
Which means we print 'm_name' instead of 'name' when displaying the
'm_name' member variable.
Finally, I have also added support for the new TYPE_SPECIFIC_INT
fields, which were added with this commit:
commit 20a5fcbd5b
Date: Wed Sep 23 09:39:24 2020 -0600
Handle bit offset and bit size in base types
I updated the gdb.gdb/python-helper.exp test to cover all of these
changes.
Replace the direct assignments to current_thread with
switch_to_thread. Use scoped_restore_current_thread when appropriate.
There is one instance remaining in linux-low.cc's wait_for_sigstop.
This will be handled in a separate patch.
Regression-tested on X86-64 Linux using the native-gdbserver and
native-extended-gdbserver board files.
Introduce a class for restoring the current thread and a function to
switch to the given thread. This is a preparation for a refactoring
that aims to remove direct assignments to 'current_thread'.
While working on a later patch that required me to understand how GDB
starts up inferiors, I was confused by the
target_ops::post_startup_inferior method.
The post_startup_inferior target function is only called from
inf_ptrace_target::create_inferior.
Part of the target class hierarchy looks like this:
inf_child_target
|
'-- inf_ptrace_target
|
|-- linux_nat_target
|
|-- fbsd_nat_target
|
|-- nbsd_nat_target
|
|-- obsd_nat_target
|
'-- rs6000_nat_target
Every sub-class of inf_ptrace_target, except rs6000_nat_target,
implements ::post_startup_inferior. The rs6000_nat_target picks up
the implementation of ::post_startup_inferior not from
inf_ptrace_target, but from inf_child_target.
No descendent of inf_child_target, outside the inf_ptrace_target
sub-tree, implements ::post_startup_inferior, which isn't really
surprising, as they would never see the method called (remember, the
method is only called from inf_ptrace_target::create_inferior).
What I find confusing is the role inf_child_target plays in
implementing, what is really a helper function for just one of its
descendents.
In this commit I propose that we formally make ::post_startup_inferior
a helper function of inf_ptrace_target. To do this I will remove the
::post_startup_inferior from the target_ops API, and instead make this
a protected, pure virtual function on inf_ptrace_target.
I'll remove the empty implementation of ::post_startup_inferior from
the inf_child_target class, and add a new empty implementation to the
rs6000_nat_target class.
All the other descendents of inf_ptrace_target already provide an
implementation of this method and so don't need to change beyond
making the method protected within their class declarations.
To me, this makes much more sense now. The helper function, which is
only called from within the inf_ptrace_target class, is now a part of
the inf_ptrace_target class.
The only way in which this change is visible to a user is if the user
turns on 'set debug target 1'. With this debug flag on, prior to this
patch the user would see something like:
-> native->post_startup_inferior (...)
<- native->post_startup_inferior (2588939)
After this patch these lines are no longer present, as the
post_startup_inferior is no longer a top level target method. For me,
this is an acceptable change.
While working on another patch I had reason to look at
mips-netbsd-nat.c, and noticed that the class mips_nbsd_nat_target
inherits directly from inf_ptrace_target.
This is a little strange as alpha_bsd_nat_target,
arm_netbsd_nat_target, hppa_nbsd_nat_target, m68k_bsd_nat_target,
ppc_nbsd_nat_target, sh_nbsd_nat_target, and vax_bsd_nat_target all
inherit from nbsd_nat_target.
Originally, in this commit:
commit f6ac5f3d63
Date: Thu May 3 00:37:22 2018 +0100
Convert struct target_ops to C++
When the target tree was converted to C++, all of the above classes
inherited from inf_ptrace_target except for hppa_nbsd_nat_target,
which was the only class that originally inherited from
nbsd_nat_target.
Later on all the remaining targets (except mips) were converted to
inherit from nbsd_nat_target, these are the commits:
commit 4fed520be2
Date: Sat Mar 14 16:05:24 2020 +0100
Inherit alpha_netbsd_nat_target from nbsd_nat_target
commit 6018d381a0
Date: Sat Mar 14 14:50:51 2020 +0100
Inherit arm_netbsd_nat_target from nbsd_nat_target
commit 01a801176e
Date: Sat Mar 14 16:54:42 2020 +0100
Inherit m68k_bsd_nat_target from nbsd_nat_target
commit 9faa006d11
Date: Thu Mar 19 14:52:57 2020 +0100
Inherit ppc_nbsd_nat_target from nbsd_nat_target
commit 9809762324
Date: Tue Mar 17 15:07:39 2020 +0100
Inherit sh_nbsd_nat_target from nbsd_nat_target
commit d5be5fa420
Date: Sat Mar 14 13:21:58 2020 +0100
Inherit vax_bsd_nat_target from nbsd_nat_target
I could only find mailing list threads for ppc and sh in the archive ,
and unfortunately, none of the commits has any real detail that might
explain why mips was missed out, the only extra context I could find
was this message:
https://sourceware.org/pipermail/gdb-patches/2020-March/166853.html
Which says that "proper" OS support is going to be added to
nbsd_nat_target, hence the need to inherit from that class.
My guess is that leaving mips_nbsd_nat_target unchanged was an
oversight, so, in this commit, I propose changing mips_nbsd_nat_target
to inherit from nbsd_nat_target just like all the other nbsd targets.
My motivation for this patch relates to the post_startup_inferior
target method. In a future commit I want to change how this method is
handled. Currently the mips_nbsd_nat_target will pick up the empty
implementation of inf_child_target::post_startup_inferior rather than
the version in netbsd-nat.c. This feels like a bug to me, as surely,
enabling of proc events is something that would need to be done for
all netbsd targets, regardless of architecture.
In my future patch I have a choice then, either (a) add a new, empty
implementation of post_startup_inferior to mips_nbsd_nat_target,
or (b) this commit, have mips_nbsd_nat_target inherit from
nbsd_nat_target. Option (b) seems like the right way to go, hence,
this commit.
I've done absolutely no testing for this change, not even building it,
as that would require at least an environment in which I can x-build
mips-netbsd applications, which I have no idea how to set up.
While testing another patch I was trying to build different
configurations of GDB, and, during one test build I ran into a
problem, I configured with `--enable-targets=all
--host=i686-w64-mingw32` and saw this error while linking GDB:
.../i686-w64-mingw32/bin/ld: mips-tdep.o: in function `mips_gdbarch_init':
.../src/gdb/mips-tdep.c:8730: undefined reference to `disassembler_options_mips'
.../i686-w64-mingw32/bin/ld: riscv-tdep.o: in function `riscv_gdbarch_init':
.../src/gdb/riscv-tdep.c:3851: undefined reference to `disassembler_options_riscv'
So the `disassembler_options_mips` and `disassembler_options_riscv`
symbols are missing.
This turns out to be because mips-dis.c and riscv-dis.c, in which
these symbols are defined, are in the TARGET64_LIBOPCODES_CFILES list
in opcodes/Makefile.am, these files are only built when we are
building with a 64-bit bfd.
If we look further, into bfd/Makefile.am, we can see that all the
files matching elf*-riscv.lo are in the BFD64_BACKENDS list, as are
the elf*-mips.lo files, and (I know because I tried), the two
disassemblers do, not surprisingly, depend on features supplied from
libbfd.
So, though we can build most of GDB just fine for riscv and mips with
a 32-bit bfd, if I understand correctly, the final GDB
executable (assuming we could get it to link) would not understand
these architectures at the bfd level, nor would there be any
disassembler available. This sounds like a pretty useless GDB to me.
So, in this commit, I move the riscv and mips targets into GDB's list
of targets that require a 64-bit bfd. After this I can build GDB just
fine with the configure options given above.
This was discussed on the mailing list in a couple of threads:
https://sourceware.org/pipermail/gdb-patches/2021-December/184365.htmlhttps://sourceware.org/pipermail/binutils/2021-November/118498.html
and it is agreed, that it is unfortunate that the 32-bit riscv and
32-bit mips targets require a 64-bit bfd. If in the future this
situation ever changes then it would be expected that some (or all) of
this patch would be reverted. Until then though, this patch allows
GDB to build when configured with --enable-targets=all, and when using
a 32-bit libbfd.
I found some uses of xfree in the path substitution code in source.c.
C++-ifying struct substitute_path_rule both simplifies the code and
removes manual memory management.
Regression tested on x86-64 Fedora 34.
On Fedora 35,
$ readelf -d /usr/bin/npc
caused readelf to run out of stack since load_separate_debug_info
returned the input main file as the separate debug info:
(gdb) bt
#0 load_separate_debug_info (
main_filename=main_filename@entry=0x510f50 "/export/home/hjl/.cache/debuginfod_client/dcc33c51c49e7dafc178fdb5cf8bd8946f965295/debuginfo",
xlink=xlink@entry=0x4e5180 <debug_displays+4480>,
parse_func=parse_func@entry=0x431550 <parse_gnu_debuglink>,
check_func=check_func@entry=0x432ae0 <check_gnu_debuglink>,
func_data=func_data@entry=0x7fffffffdb60, file=file@entry=0x51d430)
at /export/gnu/import/git/sources/binutils-gdb/binutils/dwarf.c:11057
#1 0x000000000043328d in check_for_and_load_links (file=0x51d430,
filename=0x510f50 "/export/home/hjl/.cache/debuginfod_client/dcc33c51c49e7dafc178fdb5cf8bd8946f965295/debuginfo")
at /export/gnu/import/git/sources/binutils-gdb/binutils/dwarf.c:11381
#2 0x00000000004332ae in check_for_and_load_links (file=0x51b070,
filename=0x518dd0 "/export/home/hjl/.cache/debuginfod_client/dcc33c51c49e7dafc178fdb5cf8bd8946f965295/debuginfo")
Return NULL if the separate debug info is the same as the input main
file to avoid infinite recursion.
PR binutils/28679
* dwarf.c (load_separate_debug_info): Don't return the input
main file.
This reverts a 1995 fix to handle bogus object files. Presumably such
object files have long gone.
* elf.c (bfd_section_from_shdr): Remove old hack for Oracle
libraries.
The comment on top of gdb/testsuite/boards/remote-stdio-gdbserver.exp says
that user can specify path to gdbserver on remote system by setting
GDBSERVER variable. However, this variable was ignored and /usr/bin/gdbserver
was used unconditionally.
This commit updates the code to respect GDBSERVER if set and fall back to
/usr/bin/gdbserver if not.
This reverts commit fe72c32765.
It turns out it was wrong, libinproctrace.so does build its own
gdbsupport/tdesc.cc. This broke the build:
make[1]: Entering directory '/home/simark/build/binutils-gdb-one-target/gdbserver'
CXXLD libinproctrace.so
/usr/bin/ld: gdbsupport/tdesc-ipa.o: in function `print_xml_feature::visit_pre(target_desc const*)':
/home/simark/src/binutils-gdb/gdbserver/../gdbsupport/tdesc.cc:407: undefined reference to `tdesc_architecture_name(target_desc const*)'
/usr/bin/ld: /home/simark/src/binutils-gdb/gdbserver/../gdbsupport/tdesc.cc:408: undefined reference to `tdesc_architecture_name(target_desc const*)'
/usr/bin/ld: /home/simark/src/binutils-gdb/gdbserver/../gdbsupport/tdesc.cc:411: undefined reference to `tdesc_osabi_name(target_desc const*)'
/usr/bin/ld: /home/simark/src/binutils-gdb/gdbserver/../gdbsupport/tdesc.cc:416: undefined reference to `tdesc_compatible_info_list(target_desc const*)'
/usr/bin/ld: /home/simark/src/binutils-gdb/gdbserver/../gdbsupport/tdesc.cc:418: undefined reference to `tdesc_compatible_info_arch_name(std::unique_ptr<tdesc_compatible_info, std::default_delete<tdesc_compatible_info> > const&)'
Not returning an error indication here leaves the attribute
uninitialised, which then leads to intemperate behaviour.
PR 28674
* dwarf2.c (read_attribute_value): Return NULL on trying to read
past end of attributes.
binutils-all/strip-13 and binutils-all/strip-14 tests create
SHT_REL/SHT_RELA sections by hand. These don't have sh_link set to
the .symtab section as they should, leading to readelf warnings if you
happen to be looking at the object files.
* elf.c (assign_section_numbers): Formatting. Set sh_link for
reloc sections created as normal sections in relocatable
objects.
I suppose this code was copied from GDBserver and this ifndef was left
there. As far as I know, IN_PROCESS_AGENT will never be defined when
building this file, so we can remove this.
Change-Id: I84fc408e330b3a29106df830a09342861cadbaf6
Fix this, seen when building with clang 14:
CXX microblaze-tdep.o
/home/simark/src/binutils-gdb/gdb/microblaze-tdep.c:207:7: error: variable 'flags' set but not used [-Werror,-Wunused-but-set-variable]
int flags = 0;
^
Change-Id: I59f726ed33e924912748bc475b6fd9a9394fc0d0
Fix these, seen when building with clang 14:
CXX csky-tdep.o
/home/simark/src/binutils-gdb/gdb/csky-tdep.c:332:7: error: variable 'need_dummy_stack' set but not used [-Werror,-Wunused-but-set-variable]
int need_dummy_stack = 0;
^
/home/simark/src/binutils-gdb/gdb/csky-tdep.c:805:12: error: variable 'offset' set but not used [-Werror,-Wunused-but-set-variable]
int offset = 0;
^
Change-Id: I6703bcb50e83c50083f716f4084ef6aa30d659c4
The documented behavior of proc runto is to not emit a PASS when
succeeding to to run to the specified location, but emit a FAIL when
failing to do so. I suppose the intent is that it won't pollute the
results normally passing tests (although I don't see why we would care),
but make visible any problems.
However, it seems like the implementation makes it default to never
print anything. "no-message" is prependend to "args", so if "message"
is not passed, we will always take the path that sets print_fail to 0,
which will silence any failure.
This unfortunately means that tests relying on runto_main won't emit a
FAIL if failing to run to main. And since commit 4dfef5be68
("gdb/testsuite: make runto_main not pass no-message to runto"), tests
don't emit a FAIL themselves when failing to run to main. This means
that tests failing to run to main just fail silently, and that's bad.
This can be reproduced by hacking gdb.base/template.exp like so:
diff --git a/gdb/testsuite/gdb.base/template.c b/gdb/testsuite/gdb.base/template.c
index bcf39c377d92..052be5b79d73 100644
--- a/gdb/testsuite/gdb.base/template.c
+++ b/gdb/testsuite/gdb.base/template.c
@@ -15,6 +15,14 @@
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
+#include <stdlib.h>
+
+__attribute__((constructor))
+static void c (void)
+{
+ exit (1);
+}
+
int
main (void)
{
Running the modified gdb.base/template.exp shows that it exits without
printing any result.
Remove the line that prepends no-message to args, that should make
runto's behavior match its documentation.
This patch will appear to add many failures, but in reality they already
existed, they were just silenced.
Change-Id: I2a730d5bc72b6ef0698cd6aad962d9902aa7c3d6
On ELFv1, the _start symbol must point to the *function descriptor* (in
the .opd section), not to the function code (in the .text section) like
with ELFv2 and other architectures.
With test-case gdb.base/maint.exp and target board -readnow, I run into:
...
FAIL: gdb.base/maint.exp: maint info line-table w/o a file name
...
The problem is that this and other regexps anchored using '^':
...
-re "^$gdb_prompt $" {
...
don't trigger because other regexps don't consume the entire preceding line.
This is partly due to the addition of the IS-STMT column.
Fix this by making the regexps consume entire lines.
Tested on x86_64-linux with native and target board readnow, as well as
check-read1 and check-readmore.
With test-case gdb.base/include-main.exp and target board readnow, I run into:
...
FAIL: gdb.base/include-main.exp: maint info symtab
...
The corresponding check in gdb.base/include-main.exp:
...
gdb_test_no_output "maint info symtab"
...
checks that no CU was expanded, while -readnow ensures that all CUs are
expanded.
Fix this by skipping the check with -readnow.
Tested on x86_64-linux, with native and target board readnow.
* To be consistent with -march option, removed the "=" operator when
user want to reset the whole architecture string. So the formats are,
.option arch, +<extension><version>, ...
.option arch, -<extension>
.option arch, <ISA string>
* Don't allow to add or remove the base extensions in the .option arch
directive. Instead, users should reset the whole architecture string
while they want to change the base extension.
* The operator "+" won't update the version of extension, if the
extension is already in the subset list.
bfd/
* elfxx-riscv.c (riscv_add_subset): Don't update the version
if the extension is already in the subset list.
(riscv_update_subset): To be consistent with -march option,
removed the "=" operator when user want to reset the whole
architecture string. Besides, Don't allow to add or remove
the base extensions in the .option arch directive.
gas/
* testsuite/gas/riscv/option-arch-01.s: Updated since we cannot
add or remove the base extensions in the .option arch directive.
* testsuite/gas/riscv/option-arch-02.s: Likewise.
* testsuite/gas/riscv/option-arch-fail.l: Likewise.
* testsuite/gas/riscv/option-arch-fail.s: Likewise.
* testsuite/gas/riscv/option-arch-01a.d: Set -misa-spec=2.2.
* testsuite/gas/riscv/option-arch-01b.d: Likewise.
* testsuite/gas/riscv/option-arch-02.d: Updated since the .option
arch, + won't change the version of extension, if the extension is
already in the subset list.
* testsuite/gas/riscv/option-arch-03.s: Removed the "=" operator
when resetting the whole architecture string.
The ## marker tells automake to not include the comment in its
generated output, so use that in most places where the comment
only makes sense in the inputs.
While working with pending fork events, I wondered what would happen if
the user detached an inferior while a thread of that inferior had a
pending fork event. What happens with the fork child, which is
ptrace-attached by the GDB process (or by GDBserver), but not known to
the core? Sure enough, neither the core of GDB or the target detach the
child process, so GDB (or GDBserver) just stays ptrace-attached to the
process. The result is that the fork child process is stuck, while you
would expect it to be detached and run.
Make GDBserver detach of fork children it knows about. That is done in
the generic handle_detach function. Since a process_info already exists
for the child, we can simply call detach_inferior on it.
GDB-side, make the linux-nat and remote targets detach of fork children
known because of pending fork events. These pending fork events can be
stored in:
- thread_info::pending_waitstatus, if the core has consumed the event
but then saved it for later (for example, because it got the event
while stopping all threads, to present an all-stop stop on top of a
non-stop target)
- thread_info::pending_follow: if we ran to a "catch fork" and we
detach at that moment
Additionally, pending fork events can be in target-specific fields:
- For linux-nat, they can be in lwp_info::status and
lwp_info::waitstatus.
- For the remote target, they could be stored as pending stop replies,
saved in `remote_state::notif_state::pending_event`, if not
acknowledged yet, or in `remote_state::stop_reply_queue`, if
acknowledged. I followed the model of remove_new_fork_children for
this: call remote_notif_get_pending_events to process /
acknowledge any unacknowledged notification, then look through
stop_reply_queue.
Update the gdb.threads/pending-fork-event.exp test (and rename it to
gdb.threads/pending-fork-event-detach.exp) to try to detach the process
while it is stopped with a pending fork event. In order to verify that
the fork child process is correctly detached and resumes execution
outside of GDB's control, make that process create a file in the test
output directory, and make the test wait $timeout seconds for that file
to appear (it happens instantly if everything goes well).
This test catches a bug in linux-nat.c, also reported as PR 28512
("waitstatus.h:300: internal-error: gdb_signal target_waitstatus::sig()
const: Assertion `m_kind == TARGET_WAITKIND_STOPPED || m_kind ==
TARGET_WAITKIND_SIGNALLED' failed.). When detaching a thread with a
pending event, get_detach_signal unconditionally fetches the signal
stored in the waitstatus (`tp->pending_waitstatus ().sig ()`). However,
that is only valid if the pending event is of type
TARGET_WAITKIND_STOPPED, and this is now enforced using assertions (iit
would also be valid for TARGET_WAITKIND_SIGNALLED, but that would mean
the thread does not exist anymore, so we wouldn't be detaching it). Add
a condition in get_detach_signal to access the signal number only if the
wait status is of kind TARGET_WAITKIND_STOPPED, and use GDB_SIGNAL_0
instead (since the thread was not stopped with a signal to begin with).
Add another test, gdb.threads/pending-fork-event-ns.exp, specifically to
verify that we consider events in pending stop replies in the remote
target. This test has many threads constantly forking, and we detach
from the program while the program is executing. That gives us some
chance that we detach while a fork stop reply is stored in the remote
target. To verify that we correctly detach all fork children, we ask
the parent to exit by sending it a SIGUSR1 signal and have it write a
file to the filesystem before exiting. Because the parent's main thread
joins the forking threads, and the forking threads wait for their fork
children to exit, if some fork child is not detach by GDB, the parent
will not write the file, and the test will time out. If I remove the
new remote_detach_pid calls in remote.c, the test fails eventually if I
run it in a loop.
There is a known limitation: we don't remove breakpoints from the
children before detaching it. So the children, could hit a trap
instruction after being detached and crash. I know this is wrong, and
it should be fixed, but I would like to handle that later. The current
patch doesn't fix everything, but it's a step in the right direction.
Change-Id: I6d811a56f520e3cb92d5ea563ad38976f92e93dd
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28512
A following patch will change targets so that when they detach an
inferior, they also detach any pending fork children this inferior may
have. While doing this, I hit a case where we couldn't differentiate
two cases, where in one we should detach the fork detach but not in the
other.
Suppose we continue past a fork with "follow-fork-mode == child" &&
"detach-on-fork on". follow_fork_inferior calls target_detach to detach
the parent. In that case the target should not detach the fork
child, as we'll continue debugging the child. As of now, the
tp->pending_follow field of the thread who called fork still contains
the details about the fork.
Then, suppose we run to a fork catchpoint and the user types "detach".
In that case, the target should detach the fork child in addition to the
parent. In that case as well, the tp->pending_follow field contains
the details about the fork.
To allow targets to differentiate the two cases, clear
tp->pending_follow a bit earlier, when following a fork. Targets will
then see that tp->pending_follow contains TARGET_WAITKIND_SPURIOUS, and
won't detach the fork child.
As of this patch, no behavior changes are expected.
Change-Id: I537741859ed712cb531baaefc78bb934e2a28153
In preparation for a following patch, refactor a few things that I did
find a bit awkward, and to make them a bit more reusable.
- Pass an inferior to kill_new_fork_children instead of a pid. That
allows iterating on only this inferior's threads and avoid further
filtering on the thread's pid.
- Change thread_pending_fork_status to return a non-nullptr value only
if the thread does have a pending fork status.
- Remove is_pending_fork_parent_thread, as one can just use
thread_pending_fork_status and check for nullptr.
- Replace is_pending_fork_parent with is_fork_status, which just
returns if the given target_waitkind if a fork or a vfork. Push
filtering on the pid to the callers, when it is necessary.
Change-Id: I0764ccc684d40f054e39df6fa5458cc4c5d1cd7b
Move the stop_reply and a few functions up. Some code above them in the
file will need to use them in a following patch. No behavior changes
expected here.
Change-Id: I3ca57d0e3ec253f56e1ba401289d9d167de14ad2
The following patch will add some code paths that need to ptrace-detach
a given PID. Factor out the code that does this and put it in its own
function, so that it can be re-used.
Change-Id: Ie65ca0d89893b41aea0a23d9fc6ffbed042a9705
This patch aims at fixing a bug where an inferior is unexpectedly
created when a fork happens at the same time as another event, and that
other event is reported to GDB first (and the fork event stays pending
in GDBserver). This happens for example when we step a thread and
another thread forks at the same time. The bug looks like (if I
reproduce the included test by hand):
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) si
[New inferior 2]
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
[New Thread 965190.965190]
[Switching to Thread 965190.965190]
Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes): ... <long series of bytes>
The sequence of events leading to the problem is:
- We are using the all-stop user-visible mode as well as the
synchronous / all-stop variant of the remote protocol
- We have two threads, thread A that we single-step and thread B that
calls fork at the same time
- GDBserver's linux_process_target::wait pulls the "single step
complete SIGTRAP" and the "fork" events from the kernel. It
arbitrarily choses one event to report, it happens to be the
single-step SIGTRAP. The fork stays pending in the thread_info.
- GDBserver send that SIGTRAP as a stop reply to GDB
- While in stop_all_threads, GDB calls update_thread_list, which ends
up querying the remote thread list using qXfer:threads:read.
- In the reply, GDBserver includes the fork child created as a result
of thread B's fork.
- GDB-side, the remote target sees the new PID, calls
remote_notice_new_inferior, which ends up unexpectedly creating a new
inferior, and things go downhill from there.
The problem here is that as long as GDB did not process the fork event,
it should pretend the fork child does not exist. Ultimately, this event
will be reported, we'll go through follow_fork, and that process will be
detached.
The remote target (GDB-side), has some code to remove from the reported
thread list the threads that are the result of forks not processed by
GDB yet. But that only works for fork events that have made their way
to the remote target (GDB-side), but haven't been consumed by the core
yet, so are still lingering as pending stop replies in the remote target
(see remove_new_fork_children in remote.c). But in our case, the fork
event hasn't made its way to the GDB-side remote target. We need to
implement the same kind of logic GDBserver-side: if there exists a
thread / inferior that is the result of a fork event GDBserver hasn't
reported yet, it should exclude that thread / inferior from the reported
thread list.
This was actually discussed a while ago, but not implemented AFAIK:
https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#thttps://sourceware.org/pipermail/gdb-patches/2016-June/133906.html
Implementation details-wise, the fix for this is all in GDBserver. The
Linux layer of GDBserver already tracks unreported fork parent / child
relationships using the lwp_info::fork_relative, in order to avoid
wildcard actions resuming fork childs unknown to GDB. This information
needs to be made available to the handle_qxfer_threads_worker function,
so it can filter the reported threads. Add a new thread_pending_parent
target function that allows the Linux target to return the parent of an
eventual fork child.
Testing-wise, the test replicates pretty-much the sequence of events
shown above. The setup of the test makes it such that the main thread
is about to fork. We stepi the other thread, so that the step completes
very quickly, in a single event. Meanwhile, the main thread is resumed,
so very likely has time to call fork. This means that the bug may not
reproduce every time (if the main thread does not have time to call
fork), but it will reproduce more often than not. The test fails
without the fix applied on the native-gdbserver and
native-extended-gdbserver boards.
At some point I suspected that which thread called fork and which thread
did the step influenced the order in which the events were reported, and
therefore the reproducibility of the bug. So I made the test try both
combinations: main thread forks while other thread steps, and vice
versa. I'm not sure this is still necessary, but I left it there
anyway. It doesn't hurt to test a few more combinations.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288
Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990
There are some loops in gdb that use ARRAY_SIZE (or a wordier
equivalent) to loop over a static array. This patch changes some of
these to use foreach instead.
Regression tested on x86-64 Fedora 34.
In my earlier C++-ization patch for file_and_directory, I introduced
an error:
- if (strcmp (fnd.name, "<unknown>") != 0)
+ if (fnd.is_unknown ())
This change inverted the sense of the test, which causes failures with
.debug_names.
This patch fixes the bug. Regression tested on x86-64 Fedora 34. I
also tested it using the AdaCore internal test suite, with
.debug_names -- this was failing before, and now it works.
The documentation suggests that we implement gdb.Value.__init__,
however, this is not currently true, we really implement
gdb.Value.__new__. This will cause confusion if a user tries to
sub-class gdb.Value. They might write:
class MyVal (gdb.Value):
def __init__ (self, val):
gdb.Value.__init__(self, val)
obj = MyVal(123)
print ("Got: %s" % obj)
But, when they source this code they'll see:
(gdb) source ~/tmp/value-test.py
Traceback (most recent call last):
File "/home/andrew/tmp/value-test.py", line 7, in <module>
obj = MyVal(123)
File "/home/andrew/tmp/value-test.py", line 5, in __init__
gdb.Value.__init__(self, val)
TypeError: object.__init__() takes exactly one argument (the instance to initialize)
(gdb)
The reason for this is that, as we don't implement __init__ for
gdb.Value, Python ends up calling object.__init__ instead, which
doesn't expect any arguments.
The Python docs suggest that the reason why we might take this
approach is because we want gdb.Value to be immutable:
https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_new
But I don't see any reason why we should require gdb.Value to be
immutable when other types defined in GDB are not. This current
immutability can be seen in this code:
obj = gdb.Value(1234)
print("Got: %s" % obj)
obj.__init__ (5678)
print("Got: %s" % obj)
Which currently runs without error, but prints:
Got: 1234
Got: 1234
In this commit I propose that we switch to using __init__ to
initialize gdb.Value objects.
This does introduce some additional complexity, during the __init__
call a gdb.Value might already be associated with a gdb value object,
in which case we need to cleanly break that association before
installing the new gdb value object. However, the cost of doing this
is not great, and the benefit - being able to easily sub-class
gdb.Value seems worth it.
After this commit the first example above works without error, while
the second example now prints:
Got: 1234
Got: 5678
In order to make it easier to override the gdb.Value.__init__ method,
I have tweaked the definition of gdb.Value.__init__. The second,
optional argument to __init__ is a gdb.Type, if this argument is not
present then GDB figures out a suitable type.
However, if we want to override the __init__ method in a sub-class,
and still support the default argument, it is easier to write:
class MyVal (gdb.Value):
def __init__ (self, val, type=None):
gdb.Value.__init__(self, val, type)
Currently, passing None for the Type will result in an error:
TypeError: type argument must be a gdb.Type.
After this commit I now allow the type argument to be None, in which
case GDB figures out a suitable type just as if the type had not been
passed at all.
Unless a user is trying to reinitialize a value, or create sub-classes
of gdb.Value, there should be no user visible changes after this
commit.
While investigating some disassembler problems I ran into this case;
GDB compiled on a 32-bit arm target, with --enable-targets=all. Then
in GDB:
(gdb) set architecture i386
(gdb) disassemble 0x0,+4
unknown disassembler error (error = -1)
This is interesting because it shows a case where the libopcodes
disassembler is returning -1 without first calling the
memory_error_func callback. Indeed, the return from libopcodes
happens from this code snippet in i386-dis.c in the print_insn
function:
if (address_mode == mode_64bit && sizeof (bfd_vma) < 8)
{
(*info->fprintf_func) (info->stream,
_("64-bit address is disabled"));
return -1;
}
Notice how, prior to the return the disassembler tries to print a
helpful message out, but GDB doesn't print this message.
The reason this message goes missing is the call stack, it looks like
this:
gdb_pretty_print_disassembler::pretty_print_insn
gdb_disassembler::print_insn
gdbarch_print_insn
...
i386-dis.c:print_insn
When i386-dis.c:print_insn returns -1 this is handled in
gdb_disassembler::print_insn, where an exception is thrown. However,
the actual printing of the disassembler output is done in
gdb_pretty_print_disassembler::pretty_print_insn, and is only done if
an exception is not thrown.
In this commit I change this. The pretty_print_insn now uses
try/catch around the call to gdb_disassembler::print_insn, if we catch
an error then we first print any pending output in the instruction
buffer, before rethrowing the exception. As a result, even if an
exception is thrown we still print any pending disassembler output to
the screen; in the above case the helpful message will now be shown.
Before my patch we might expect to see this output:
(gdb) disassemble 0x0,+4
Dump of assembler code from 0x0 to 0x4:
0x0000000000000000: unknown disassembler error (error = -1)
(gdb)
But now we see this:
(gdb) disassemble 0x0,+4
Dump of assembler code from 0x0 to 0x4:
0x0000000000000000: 64-bit address is disabled
unknown disassembler error (error = -1)
If the disassembler returns -1 without printing a helpful message then
we would still expect a change in output, something like:
(gdb) disassemble 0x0,+4
Dump of assembler code from 0x0 to 0x4:
0x0000000000000000:
unknown disassembler error (error = -1)
Which I think is still acceptable, though at this point I think a
strong case can be made that this is a disassembler bug (not printing
anything, but still returning -1).
Notice however, that the error message is always printed on a new line
now. This is also true for the memory error case, where before we
might see this:
(gdb) disassemble 0x0,+4
Dump of assembler code from 0x0 to 0x4:
0x00000000: Cannot access memory at address 0x0
We now get this:
(gdb) disassemble 0x0,+4
Dump of assembler code from 0x0 to 0x4:
0x00000000:
Cannot access memory at address 0x0
For me, I'm happy to accept this change, having the error on a line by
itself, rather than just appended to the end of the previous line,
seems like an improvement, but I'm aware others might feel
differently, so I'd appreciate any feedback.
Permanent program breakpoints (ones inserted into the code) other than
the one GDB uses for POWER (0x7fe00008) did not result in stop but
caused GDB to loop infinitely.
This was because GDB did not recognize trap instructions other than
"trap". For example, "tw 12, 4, 4" was not be recognized, causing GDB
to loop forever.
This commit fixes this by providing POWER specific hook
(gdbarch_program_breakpoint_here_p) recognizing all tw, twi, td and tdi
instructions.
Tested on Linux on PowerPC e500 and on QEMU PPC64le.
Power ISA 3.0 B spec [1], sections 3.3.11 "Fixed-Point Trap Instructions"
and section C.6 "Trap Mnemonics" specify "tw, 31, 0, 0" (encoded as
0x7fe00008) as canonical unconditional trap instruction.
This commit changes the breakpoint instruction used by GDB from
"tw 12, r2, r2" to unconditional "trap".
[1]: https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
If a.so contains an SHT_RELR section, objcopy a.so will fail with:
a.so: unknown type [0x13] section `.relr.dyn'
This change allows objcopy to work.
bfd/
* elf.c (bfd_section_from_shdr): Support SHT_RELR.
struct linespec contains pointers to vectors, instead of containing
vectors directly. This is probably historical, when linespec_parser
(which contains a struct linespec field) was not C++-ified yet. But it
seems easy to change the pointers to vectors to just vectors today.
This simplifies the code, we don't need to manually allocate and delete
the vectors and there's no pointer that can be NULL.
As far as I understand, there was not meaningful distinction between a
NULL pointer to vector and an empty vector. So all NULL checks are
changed for !empty checks.
Change-Id: Ie759707da14d9d984169b93233343a86e2de9ee6