gdbserver: hide fork child threads from GDB
This patch aims at fixing a bug where an inferior is unexpectedly
created when a fork happens at the same time as another event, and that
other event is reported to GDB first (and the fork event stays pending
in GDBserver). This happens for example when we step a thread and
another thread forks at the same time. The bug looks like (if I
reproduce the included test by hand):
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) si
[New inferior 2]
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
[New Thread 965190.965190]
[Switching to Thread 965190.965190]
Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes): ... <long series of bytes>
The sequence of events leading to the problem is:
- We are using the all-stop user-visible mode as well as the
synchronous / all-stop variant of the remote protocol
- We have two threads, thread A that we single-step and thread B that
calls fork at the same time
- GDBserver's linux_process_target::wait pulls the "single step
complete SIGTRAP" and the "fork" events from the kernel. It
arbitrarily choses one event to report, it happens to be the
single-step SIGTRAP. The fork stays pending in the thread_info.
- GDBserver send that SIGTRAP as a stop reply to GDB
- While in stop_all_threads, GDB calls update_thread_list, which ends
up querying the remote thread list using qXfer:threads:read.
- In the reply, GDBserver includes the fork child created as a result
of thread B's fork.
- GDB-side, the remote target sees the new PID, calls
remote_notice_new_inferior, which ends up unexpectedly creating a new
inferior, and things go downhill from there.
The problem here is that as long as GDB did not process the fork event,
it should pretend the fork child does not exist. Ultimately, this event
will be reported, we'll go through follow_fork, and that process will be
detached.
The remote target (GDB-side), has some code to remove from the reported
thread list the threads that are the result of forks not processed by
GDB yet. But that only works for fork events that have made their way
to the remote target (GDB-side), but haven't been consumed by the core
yet, so are still lingering as pending stop replies in the remote target
(see remove_new_fork_children in remote.c). But in our case, the fork
event hasn't made its way to the GDB-side remote target. We need to
implement the same kind of logic GDBserver-side: if there exists a
thread / inferior that is the result of a fork event GDBserver hasn't
reported yet, it should exclude that thread / inferior from the reported
thread list.
This was actually discussed a while ago, but not implemented AFAIK:
https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#t
https://sourceware.org/pipermail/gdb-patches/2016-June/133906.html
Implementation details-wise, the fix for this is all in GDBserver. The
Linux layer of GDBserver already tracks unreported fork parent / child
relationships using the lwp_info::fork_relative, in order to avoid
wildcard actions resuming fork childs unknown to GDB. This information
needs to be made available to the handle_qxfer_threads_worker function,
so it can filter the reported threads. Add a new thread_pending_parent
target function that allows the Linux target to return the parent of an
eventual fork child.
Testing-wise, the test replicates pretty-much the sequence of events
shown above. The setup of the test makes it such that the main thread
is about to fork. We stepi the other thread, so that the step completes
very quickly, in a single event. Meanwhile, the main thread is resumed,
so very likely has time to call fork. This means that the bug may not
reproduce every time (if the main thread does not have time to call
fork), but it will reproduce more often than not. The test fails
without the fix applied on the native-gdbserver and
native-extended-gdbserver boards.
At some point I suspected that which thread called fork and which thread
did the step influenced the order in which the events were reported, and
therefore the reproducibility of the bug. So I made the test try both
combinations: main thread forks while other thread steps, and vice
versa. I'm not sure this is still necessary, but I left it there
anyway. It doesn't hurt to test a few more combinations.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288
Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990
2021-12-01 22:40:02 +08:00
|
|
|
/* This testcase is part of GDB, the GNU debugger.
|
|
|
|
|
2023-01-01 20:49:04 +08:00
|
|
|
Copyright 2021-2023 Free Software Foundation, Inc.
|
gdbserver: hide fork child threads from GDB
This patch aims at fixing a bug where an inferior is unexpectedly
created when a fork happens at the same time as another event, and that
other event is reported to GDB first (and the fork event stays pending
in GDBserver). This happens for example when we step a thread and
another thread forks at the same time. The bug looks like (if I
reproduce the included test by hand):
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) si
[New inferior 2]
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
[New Thread 965190.965190]
[Switching to Thread 965190.965190]
Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes): ... <long series of bytes>
The sequence of events leading to the problem is:
- We are using the all-stop user-visible mode as well as the
synchronous / all-stop variant of the remote protocol
- We have two threads, thread A that we single-step and thread B that
calls fork at the same time
- GDBserver's linux_process_target::wait pulls the "single step
complete SIGTRAP" and the "fork" events from the kernel. It
arbitrarily choses one event to report, it happens to be the
single-step SIGTRAP. The fork stays pending in the thread_info.
- GDBserver send that SIGTRAP as a stop reply to GDB
- While in stop_all_threads, GDB calls update_thread_list, which ends
up querying the remote thread list using qXfer:threads:read.
- In the reply, GDBserver includes the fork child created as a result
of thread B's fork.
- GDB-side, the remote target sees the new PID, calls
remote_notice_new_inferior, which ends up unexpectedly creating a new
inferior, and things go downhill from there.
The problem here is that as long as GDB did not process the fork event,
it should pretend the fork child does not exist. Ultimately, this event
will be reported, we'll go through follow_fork, and that process will be
detached.
The remote target (GDB-side), has some code to remove from the reported
thread list the threads that are the result of forks not processed by
GDB yet. But that only works for fork events that have made their way
to the remote target (GDB-side), but haven't been consumed by the core
yet, so are still lingering as pending stop replies in the remote target
(see remove_new_fork_children in remote.c). But in our case, the fork
event hasn't made its way to the GDB-side remote target. We need to
implement the same kind of logic GDBserver-side: if there exists a
thread / inferior that is the result of a fork event GDBserver hasn't
reported yet, it should exclude that thread / inferior from the reported
thread list.
This was actually discussed a while ago, but not implemented AFAIK:
https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#t
https://sourceware.org/pipermail/gdb-patches/2016-June/133906.html
Implementation details-wise, the fix for this is all in GDBserver. The
Linux layer of GDBserver already tracks unreported fork parent / child
relationships using the lwp_info::fork_relative, in order to avoid
wildcard actions resuming fork childs unknown to GDB. This information
needs to be made available to the handle_qxfer_threads_worker function,
so it can filter the reported threads. Add a new thread_pending_parent
target function that allows the Linux target to return the parent of an
eventual fork child.
Testing-wise, the test replicates pretty-much the sequence of events
shown above. The setup of the test makes it such that the main thread
is about to fork. We stepi the other thread, so that the step completes
very quickly, in a single event. Meanwhile, the main thread is resumed,
so very likely has time to call fork. This means that the bug may not
reproduce every time (if the main thread does not have time to call
fork), but it will reproduce more often than not. The test fails
without the fix applied on the native-gdbserver and
native-extended-gdbserver boards.
At some point I suspected that which thread called fork and which thread
did the step influenced the order in which the events were reported, and
therefore the reproducibility of the bug. So I made the test try both
combinations: main thread forks while other thread steps, and vice
versa. I'm not sure this is still necessary, but I left it there
anyway. It doesn't hurt to test a few more combinations.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288
Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990
2021-12-01 22:40:02 +08:00
|
|
|
|
|
|
|
This program is free software; you can redistribute it and/or modify
|
|
|
|
it under the terms of the GNU General Public License as published by
|
|
|
|
the Free Software Foundation; either version 3 of the License, or
|
|
|
|
(at your option) any later version.
|
|
|
|
|
|
|
|
This program is distributed in the hope that it will be useful,
|
|
|
|
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
|
|
|
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
|
|
|
GNU General Public License for more details.
|
|
|
|
|
|
|
|
You should have received a copy of the GNU General Public License
|
|
|
|
along with this program. If not, see <http://www.gnu.org/licenses/>. */
|
|
|
|
|
|
|
|
#include <pthread.h>
|
|
|
|
#include <unistd.h>
|
|
|
|
#include <assert.h>
|
|
|
|
|
|
|
|
static volatile int release_forking_thread = 0;
|
|
|
|
static int x;
|
|
|
|
static pthread_barrier_t barrier;
|
|
|
|
|
|
|
|
static void
|
|
|
|
break_here (void)
|
|
|
|
{
|
|
|
|
x++;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void
|
|
|
|
do_fork (void)
|
|
|
|
{
|
|
|
|
while (!release_forking_thread);
|
|
|
|
|
|
|
|
if (FORK_FUNCTION () == 0)
|
gdb, gdbserver: detach fork child when detaching from fork parent
While working with pending fork events, I wondered what would happen if
the user detached an inferior while a thread of that inferior had a
pending fork event. What happens with the fork child, which is
ptrace-attached by the GDB process (or by GDBserver), but not known to
the core? Sure enough, neither the core of GDB or the target detach the
child process, so GDB (or GDBserver) just stays ptrace-attached to the
process. The result is that the fork child process is stuck, while you
would expect it to be detached and run.
Make GDBserver detach of fork children it knows about. That is done in
the generic handle_detach function. Since a process_info already exists
for the child, we can simply call detach_inferior on it.
GDB-side, make the linux-nat and remote targets detach of fork children
known because of pending fork events. These pending fork events can be
stored in:
- thread_info::pending_waitstatus, if the core has consumed the event
but then saved it for later (for example, because it got the event
while stopping all threads, to present an all-stop stop on top of a
non-stop target)
- thread_info::pending_follow: if we ran to a "catch fork" and we
detach at that moment
Additionally, pending fork events can be in target-specific fields:
- For linux-nat, they can be in lwp_info::status and
lwp_info::waitstatus.
- For the remote target, they could be stored as pending stop replies,
saved in `remote_state::notif_state::pending_event`, if not
acknowledged yet, or in `remote_state::stop_reply_queue`, if
acknowledged. I followed the model of remove_new_fork_children for
this: call remote_notif_get_pending_events to process /
acknowledge any unacknowledged notification, then look through
stop_reply_queue.
Update the gdb.threads/pending-fork-event.exp test (and rename it to
gdb.threads/pending-fork-event-detach.exp) to try to detach the process
while it is stopped with a pending fork event. In order to verify that
the fork child process is correctly detached and resumes execution
outside of GDB's control, make that process create a file in the test
output directory, and make the test wait $timeout seconds for that file
to appear (it happens instantly if everything goes well).
This test catches a bug in linux-nat.c, also reported as PR 28512
("waitstatus.h:300: internal-error: gdb_signal target_waitstatus::sig()
const: Assertion `m_kind == TARGET_WAITKIND_STOPPED || m_kind ==
TARGET_WAITKIND_SIGNALLED' failed.). When detaching a thread with a
pending event, get_detach_signal unconditionally fetches the signal
stored in the waitstatus (`tp->pending_waitstatus ().sig ()`). However,
that is only valid if the pending event is of type
TARGET_WAITKIND_STOPPED, and this is now enforced using assertions (iit
would also be valid for TARGET_WAITKIND_SIGNALLED, but that would mean
the thread does not exist anymore, so we wouldn't be detaching it). Add
a condition in get_detach_signal to access the signal number only if the
wait status is of kind TARGET_WAITKIND_STOPPED, and use GDB_SIGNAL_0
instead (since the thread was not stopped with a signal to begin with).
Add another test, gdb.threads/pending-fork-event-ns.exp, specifically to
verify that we consider events in pending stop replies in the remote
target. This test has many threads constantly forking, and we detach
from the program while the program is executing. That gives us some
chance that we detach while a fork stop reply is stored in the remote
target. To verify that we correctly detach all fork children, we ask
the parent to exit by sending it a SIGUSR1 signal and have it write a
file to the filesystem before exiting. Because the parent's main thread
joins the forking threads, and the forking threads wait for their fork
children to exit, if some fork child is not detach by GDB, the parent
will not write the file, and the test will time out. If I remove the
new remote_detach_pid calls in remote.c, the test fails eventually if I
run it in a loop.
There is a known limitation: we don't remove breakpoints from the
children before detaching it. So the children, could hit a trap
instruction after being detached and crash. I know this is wrong, and
it should be fixed, but I would like to handle that later. The current
patch doesn't fix everything, but it's a step in the right direction.
Change-Id: I6d811a56f520e3cb92d5ea563ad38976f92e93dd
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28512
2021-12-01 22:40:03 +08:00
|
|
|
{
|
|
|
|
/* We create the file in a separate program that we exec: if FORK_FUNCTION
|
|
|
|
is vfork, we shouldn't do anything more than an exec. */
|
|
|
|
execl (TOUCH_FILE_BIN, TOUCH_FILE_BIN, NULL);
|
|
|
|
}
|
gdbserver: hide fork child threads from GDB
This patch aims at fixing a bug where an inferior is unexpectedly
created when a fork happens at the same time as another event, and that
other event is reported to GDB first (and the fork event stays pending
in GDBserver). This happens for example when we step a thread and
another thread forks at the same time. The bug looks like (if I
reproduce the included test by hand):
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) si
[New inferior 2]
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
[New Thread 965190.965190]
[Switching to Thread 965190.965190]
Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes): ... <long series of bytes>
The sequence of events leading to the problem is:
- We are using the all-stop user-visible mode as well as the
synchronous / all-stop variant of the remote protocol
- We have two threads, thread A that we single-step and thread B that
calls fork at the same time
- GDBserver's linux_process_target::wait pulls the "single step
complete SIGTRAP" and the "fork" events from the kernel. It
arbitrarily choses one event to report, it happens to be the
single-step SIGTRAP. The fork stays pending in the thread_info.
- GDBserver send that SIGTRAP as a stop reply to GDB
- While in stop_all_threads, GDB calls update_thread_list, which ends
up querying the remote thread list using qXfer:threads:read.
- In the reply, GDBserver includes the fork child created as a result
of thread B's fork.
- GDB-side, the remote target sees the new PID, calls
remote_notice_new_inferior, which ends up unexpectedly creating a new
inferior, and things go downhill from there.
The problem here is that as long as GDB did not process the fork event,
it should pretend the fork child does not exist. Ultimately, this event
will be reported, we'll go through follow_fork, and that process will be
detached.
The remote target (GDB-side), has some code to remove from the reported
thread list the threads that are the result of forks not processed by
GDB yet. But that only works for fork events that have made their way
to the remote target (GDB-side), but haven't been consumed by the core
yet, so are still lingering as pending stop replies in the remote target
(see remove_new_fork_children in remote.c). But in our case, the fork
event hasn't made its way to the GDB-side remote target. We need to
implement the same kind of logic GDBserver-side: if there exists a
thread / inferior that is the result of a fork event GDBserver hasn't
reported yet, it should exclude that thread / inferior from the reported
thread list.
This was actually discussed a while ago, but not implemented AFAIK:
https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#t
https://sourceware.org/pipermail/gdb-patches/2016-June/133906.html
Implementation details-wise, the fix for this is all in GDBserver. The
Linux layer of GDBserver already tracks unreported fork parent / child
relationships using the lwp_info::fork_relative, in order to avoid
wildcard actions resuming fork childs unknown to GDB. This information
needs to be made available to the handle_qxfer_threads_worker function,
so it can filter the reported threads. Add a new thread_pending_parent
target function that allows the Linux target to return the parent of an
eventual fork child.
Testing-wise, the test replicates pretty-much the sequence of events
shown above. The setup of the test makes it such that the main thread
is about to fork. We stepi the other thread, so that the step completes
very quickly, in a single event. Meanwhile, the main thread is resumed,
so very likely has time to call fork. This means that the bug may not
reproduce every time (if the main thread does not have time to call
fork), but it will reproduce more often than not. The test fails
without the fix applied on the native-gdbserver and
native-extended-gdbserver boards.
At some point I suspected that which thread called fork and which thread
did the step influenced the order in which the events were reported, and
therefore the reproducibility of the bug. So I made the test try both
combinations: main thread forks while other thread steps, and vice
versa. I'm not sure this is still necessary, but I left it there
anyway. It doesn't hurt to test a few more combinations.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288
Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990
2021-12-01 22:40:02 +08:00
|
|
|
|
|
|
|
}
|
|
|
|
|
|
|
|
static void *
|
|
|
|
thread_func (void *p)
|
|
|
|
{
|
gdb, gdbserver: detach fork child when detaching from fork parent
While working with pending fork events, I wondered what would happen if
the user detached an inferior while a thread of that inferior had a
pending fork event. What happens with the fork child, which is
ptrace-attached by the GDB process (or by GDBserver), but not known to
the core? Sure enough, neither the core of GDB or the target detach the
child process, so GDB (or GDBserver) just stays ptrace-attached to the
process. The result is that the fork child process is stuck, while you
would expect it to be detached and run.
Make GDBserver detach of fork children it knows about. That is done in
the generic handle_detach function. Since a process_info already exists
for the child, we can simply call detach_inferior on it.
GDB-side, make the linux-nat and remote targets detach of fork children
known because of pending fork events. These pending fork events can be
stored in:
- thread_info::pending_waitstatus, if the core has consumed the event
but then saved it for later (for example, because it got the event
while stopping all threads, to present an all-stop stop on top of a
non-stop target)
- thread_info::pending_follow: if we ran to a "catch fork" and we
detach at that moment
Additionally, pending fork events can be in target-specific fields:
- For linux-nat, they can be in lwp_info::status and
lwp_info::waitstatus.
- For the remote target, they could be stored as pending stop replies,
saved in `remote_state::notif_state::pending_event`, if not
acknowledged yet, or in `remote_state::stop_reply_queue`, if
acknowledged. I followed the model of remove_new_fork_children for
this: call remote_notif_get_pending_events to process /
acknowledge any unacknowledged notification, then look through
stop_reply_queue.
Update the gdb.threads/pending-fork-event.exp test (and rename it to
gdb.threads/pending-fork-event-detach.exp) to try to detach the process
while it is stopped with a pending fork event. In order to verify that
the fork child process is correctly detached and resumes execution
outside of GDB's control, make that process create a file in the test
output directory, and make the test wait $timeout seconds for that file
to appear (it happens instantly if everything goes well).
This test catches a bug in linux-nat.c, also reported as PR 28512
("waitstatus.h:300: internal-error: gdb_signal target_waitstatus::sig()
const: Assertion `m_kind == TARGET_WAITKIND_STOPPED || m_kind ==
TARGET_WAITKIND_SIGNALLED' failed.). When detaching a thread with a
pending event, get_detach_signal unconditionally fetches the signal
stored in the waitstatus (`tp->pending_waitstatus ().sig ()`). However,
that is only valid if the pending event is of type
TARGET_WAITKIND_STOPPED, and this is now enforced using assertions (iit
would also be valid for TARGET_WAITKIND_SIGNALLED, but that would mean
the thread does not exist anymore, so we wouldn't be detaching it). Add
a condition in get_detach_signal to access the signal number only if the
wait status is of kind TARGET_WAITKIND_STOPPED, and use GDB_SIGNAL_0
instead (since the thread was not stopped with a signal to begin with).
Add another test, gdb.threads/pending-fork-event-ns.exp, specifically to
verify that we consider events in pending stop replies in the remote
target. This test has many threads constantly forking, and we detach
from the program while the program is executing. That gives us some
chance that we detach while a fork stop reply is stored in the remote
target. To verify that we correctly detach all fork children, we ask
the parent to exit by sending it a SIGUSR1 signal and have it write a
file to the filesystem before exiting. Because the parent's main thread
joins the forking threads, and the forking threads wait for their fork
children to exit, if some fork child is not detach by GDB, the parent
will not write the file, and the test will time out. If I remove the
new remote_detach_pid calls in remote.c, the test fails eventually if I
run it in a loop.
There is a known limitation: we don't remove breakpoints from the
children before detaching it. So the children, could hit a trap
instruction after being detached and crash. I know this is wrong, and
it should be fixed, but I would like to handle that later. The current
patch doesn't fix everything, but it's a step in the right direction.
Change-Id: I6d811a56f520e3cb92d5ea563ad38976f92e93dd
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28512
2021-12-01 22:40:03 +08:00
|
|
|
pthread_barrier_wait (&barrier);
|
|
|
|
|
gdbserver: hide fork child threads from GDB
This patch aims at fixing a bug where an inferior is unexpectedly
created when a fork happens at the same time as another event, and that
other event is reported to GDB first (and the fork event stays pending
in GDBserver). This happens for example when we step a thread and
another thread forks at the same time. The bug looks like (if I
reproduce the included test by hand):
(gdb) show detach-on-fork
Whether gdb will detach the child of a fork is on.
(gdb) show follow-fork-mode
Debugger response to a program call of fork or vfork is "parent".
(gdb) si
[New inferior 2]
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading /home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread from remote target...
Reading symbols from target:/home/simark/build/binutils-gdb/gdb/testsuite/outputs/gdb.threads/step-while-fork-in-other-thread/step-while-fork-in-other-thread...
[New Thread 965190.965190]
[Switching to Thread 965190.965190]
Remote 'g' packet reply is too long (expected 560 bytes, got 816 bytes): ... <long series of bytes>
The sequence of events leading to the problem is:
- We are using the all-stop user-visible mode as well as the
synchronous / all-stop variant of the remote protocol
- We have two threads, thread A that we single-step and thread B that
calls fork at the same time
- GDBserver's linux_process_target::wait pulls the "single step
complete SIGTRAP" and the "fork" events from the kernel. It
arbitrarily choses one event to report, it happens to be the
single-step SIGTRAP. The fork stays pending in the thread_info.
- GDBserver send that SIGTRAP as a stop reply to GDB
- While in stop_all_threads, GDB calls update_thread_list, which ends
up querying the remote thread list using qXfer:threads:read.
- In the reply, GDBserver includes the fork child created as a result
of thread B's fork.
- GDB-side, the remote target sees the new PID, calls
remote_notice_new_inferior, which ends up unexpectedly creating a new
inferior, and things go downhill from there.
The problem here is that as long as GDB did not process the fork event,
it should pretend the fork child does not exist. Ultimately, this event
will be reported, we'll go through follow_fork, and that process will be
detached.
The remote target (GDB-side), has some code to remove from the reported
thread list the threads that are the result of forks not processed by
GDB yet. But that only works for fork events that have made their way
to the remote target (GDB-side), but haven't been consumed by the core
yet, so are still lingering as pending stop replies in the remote target
(see remove_new_fork_children in remote.c). But in our case, the fork
event hasn't made its way to the GDB-side remote target. We need to
implement the same kind of logic GDBserver-side: if there exists a
thread / inferior that is the result of a fork event GDBserver hasn't
reported yet, it should exclude that thread / inferior from the reported
thread list.
This was actually discussed a while ago, but not implemented AFAIK:
https://pi.simark.ca/gdb-patches/1ad9f5a8-d00e-9a26-b0c9-3f4066af5142@redhat.com/#t
https://sourceware.org/pipermail/gdb-patches/2016-June/133906.html
Implementation details-wise, the fix for this is all in GDBserver. The
Linux layer of GDBserver already tracks unreported fork parent / child
relationships using the lwp_info::fork_relative, in order to avoid
wildcard actions resuming fork childs unknown to GDB. This information
needs to be made available to the handle_qxfer_threads_worker function,
so it can filter the reported threads. Add a new thread_pending_parent
target function that allows the Linux target to return the parent of an
eventual fork child.
Testing-wise, the test replicates pretty-much the sequence of events
shown above. The setup of the test makes it such that the main thread
is about to fork. We stepi the other thread, so that the step completes
very quickly, in a single event. Meanwhile, the main thread is resumed,
so very likely has time to call fork. This means that the bug may not
reproduce every time (if the main thread does not have time to call
fork), but it will reproduce more often than not. The test fails
without the fix applied on the native-gdbserver and
native-extended-gdbserver boards.
At some point I suspected that which thread called fork and which thread
did the step influenced the order in which the events were reported, and
therefore the reproducibility of the bug. So I made the test try both
combinations: main thread forks while other thread steps, and vice
versa. I'm not sure this is still necessary, but I left it there
anyway. It doesn't hurt to test a few more combinations.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28288
Change-Id: I2158d5732fc7d7ca06b0eb01f88cf27bf527b990
2021-12-01 22:40:02 +08:00
|
|
|
#if defined(MAIN_THREAD_FORKS)
|
|
|
|
break_here ();
|
|
|
|
#elif defined(OTHER_THREAD_FORKS)
|
|
|
|
do_fork ();
|
|
|
|
#else
|
|
|
|
# error "MAIN_THREAD_FORKS or OTHER_THREAD_FORKS must be defined"
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
int
|
|
|
|
main (void)
|
|
|
|
{
|
|
|
|
pthread_t thread;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
pthread_barrier_init (&barrier, NULL, 2);
|
|
|
|
|
|
|
|
alarm (30);
|
|
|
|
ret = pthread_create (&thread, NULL, thread_func, NULL);
|
|
|
|
assert (ret == 0);
|
|
|
|
|
|
|
|
pthread_barrier_wait (&barrier);
|
|
|
|
|
|
|
|
#if defined(MAIN_THREAD_FORKS)
|
|
|
|
do_fork ();
|
|
|
|
#elif defined(OTHER_THREAD_FORKS)
|
|
|
|
break_here ();
|
|
|
|
#else
|
|
|
|
# error "MAIN_THREAD_FORKS or OTHER_THREAD_FORKS must be defined"
|
|
|
|
#endif
|
|
|
|
|
|
|
|
pthread_join (thread, NULL);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|