mirror of
https://sourceware.org/git/binutils-gdb.git
synced 2025-01-12 12:16:04 +08:00
d8bbae6ea0
There is a problem with how GDB handles a vfork happening in a
multi-threaded program. This problem was reported to me by somebody not
using vfork directly, but using system(3) in a multi-threaded program,
which may be implemented using vfork.

This patch only deals with the follow-fork-mode=parent,
detach-on-fork=on case, because it would be too much to chew at once to
fix the bugs in the other cases as well (I tried).

The problem
-----------

When a program vforks, the parent thread is suspended by the kernel
until the child process exits or execs. Specifically, in a
multi-threaded program, only the thread that called vfork is suspended;
other threads keep running freely. This is documented in the vfork(2)
man page ("Caveats" section).

Let's suppose GDB is handling a vfork and the user's desire is to detach
from the child. Before detaching the child, GDB must remove the software
breakpoints inserted in the shared parent/child address space, in case
there's a breakpoint in the path the child is going to take before
exec'ing or exit'ing (unlikely, but possible). Otherwise the child could
hit a breakpoint instruction while running outside the control of GDB,
which would make it crash. GDB must also avoid re-inserting breakpoints
in the parent until it has received the "vfork done" event (that is,
until the child has exited or execed): since the address space is shared
with the child, that would re-insert breakpoints in the child process
also.

So what GDB does is:

  1. Receive "vfork" event for the parent
  2. Remove breakpoints from the (shared) address space and set
     program_space::breakpoints_not_allowed to avoid re-inserting them
  3. Detach from the child thread
  4. Resume the parent
  5. Wait for and receive "vfork done" event for the parent
  6. Clear program_space::breakpoints_not_allowed and re-insert
     breakpoints
  7. Resume the parent

Resuming the parent at step 4 is necessary in order for the kernel to
report the "vfork done" event.
The kernel won't report a ptrace event for a thread that is
ptrace-stopped. But the theory behind this is that between steps 4 and
5, the parent won't actually make any progress even though it is
ptrace-resumed, because the kernel keeps it suspended, waiting for the
child to exec or exit. So it doesn't matter for that thread if
breakpoints are not inserted.

The problem is when the program is multi-threaded. In step 4, GDB
resumes all threads of the parent. The thread that did the vfork stays
suspended by the kernel, so that's fine. But other threads are running
freely while breakpoints are removed, which is a problem because they
could miss a breakpoint that they should have hit.

The problem is present with all-stop and non-stop targets. The only
difference is that with an all-stop target, the other threads are
stopped by the target when it reports the vfork event and are resumed by
the target when GDB resumes the parent. With a non-stop target, the
other threads are simply never stopped.

The fix
-------

There are many combinations of settings to consider (all-stop/non-stop,
target-non-stop on/off, follow-fork-mode parent/child, detach-on-fork
on/off, schedule-multiple on/off), but for this patch I restrict the
scope to follow-fork-mode=parent, detach-on-fork=on. That's the
"default" case, where we detach the child and keep debugging the parent.
I tried to fix them all, but it's just too much to do at once. The code
paths and behaviors for when we don't detach the child are completely
different.

The guiding principle for this patch is that all threads of the vforking
inferior should be stopped as long as breakpoints are removed. This is
similar to handling in-line step-overs, in a way.

For non-stop targets (the default on Linux native), this is what
happens:

 - In follow_fork, we call stop_all_threads to stop all threads of the
   inferior.
 - In follow_fork_inferior, we record the vfork parent thread in
   inferior::thread_waiting_for_vfork_done.
 - Back in handle_inferior_event, we call keep_going, which resumes only
   the event thread (this is already the case, with a non-stop target).
   This is the thread that will be waiting for vfork-done.
 - When we get the vfork-done event, we go in the (new)
   handle_vfork_done function to restart the previously stopped threads.

In the same scenario, but with an all-stop target:

 - In follow_fork, there is no need to stop all threads of the inferior:
   the target has stopped all threads of all its inferiors before
   returning the event.
 - In follow_fork_inferior, we record the vfork parent thread in
   inferior::thread_waiting_for_vfork_done.
 - Back in handle_inferior_event, we also call keep_going. However, we
   only want to resume the event thread here, not all inferior threads.
   In internal_resume_ptid (called by resume_1), we therefore now check
   whether one of the inferiors we are about to resume has
   thread_waiting_for_vfork_done set. If so, we only resume that thread.

   Note that when resuming multiple inferiors, one vforking and one
   non-vforking, we could resume the vforking thread from the vforking
   inferior plus all threads from the non-vforking inferior. However,
   this is not implemented; it would require more work.
 - When we get the vfork-done event, the existing call to keep_going
   naturally resumes all threads.

Testing-wise, add a test that tries to make the main thread hit a
breakpoint while a secondary thread calls vfork. Without the fix, the
main thread keeps going while breakpoints are removed, resulting in a
missed breakpoint and the program exiting.

Change-Id: I20eb78e17ca91f93c19c2b89a7e12c382ee814a1
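The resume policy described above can be illustrated with a small toy
model. This is only a sketch under loud assumptions: struct toy_inferior
and every toy_* function are hypothetical names invented for this
illustration and are not GDB's data structures or code. The only name
borrowed from the commit message is the thread_waiting_for_vfork_done
field, modeled here as a plain thread index. The model checks the guiding
principle: while breakpoints are removed, no thread other than the vfork
parent thread is running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_THREADS 8

/* Hypothetical stand-in for an inferior; NOT GDB's struct inferior.  */
struct toy_inferior
{
  size_t n_threads;
  bool running[TOY_MAX_THREADS];
  bool breakpoints_inserted;
  /* Index of the thread that called vfork, or -1 when no vfork is in
     progress.  */
  int thread_waiting_for_vfork_done;
};

/* Set up an inferior with N_THREADS stopped threads.  */
static void
toy_init (struct toy_inferior *inf, size_t n_threads)
{
  assert (n_threads <= TOY_MAX_THREADS);
  inf->n_threads = n_threads;
  for (size_t i = 0; i < TOY_MAX_THREADS; i++)
    inf->running[i] = false;
  inf->breakpoints_inserted = true;
  inf->thread_waiting_for_vfork_done = -1;
}

/* Model the "vfork" event: stop all threads, remove breakpoints and
   record which thread is waiting for "vfork done".  */
static void
toy_handle_vfork (struct toy_inferior *inf, int vforker)
{
  for (size_t i = 0; i < inf->n_threads; i++)
    inf->running[i] = false;
  inf->breakpoints_inserted = false;
  inf->thread_waiting_for_vfork_done = vforker;
}

/* Model a resume request: if a vfork is in progress, resume only the
   vfork parent thread, otherwise resume every thread.  */
static void
toy_resume (struct toy_inferior *inf)
{
  int waiting = inf->thread_waiting_for_vfork_done;

  for (size_t i = 0; i < inf->n_threads; i++)
    inf->running[i] = (waiting < 0 || (int) i == waiting);
}

/* Model the "vfork done" event: re-insert breakpoints, forget the
   waiting thread and resume all threads.  */
static void
toy_handle_vfork_done (struct toy_inferior *inf)
{
  inf->breakpoints_inserted = true;
  inf->thread_waiting_for_vfork_done = -1;
  toy_resume (inf);
}

/* Return true if no thread other than the vfork parent thread runs
   while breakpoints are removed.  */
static bool
toy_invariant_holds (const struct toy_inferior *inf)
{
  if (inf->breakpoints_inserted)
    return true;

  for (size_t i = 0; i < inf->n_threads; i++)
    if (inf->running[i] && (int) i != inf->thread_waiting_for_vfork_done)
      return false;

  return true;
}
```

A caller would run toy_init, then toy_handle_vfork followed by
toy_resume, and check toy_invariant_holds before calling
toy_handle_vfork_done. The model deliberately ignores multiple inferiors
and the differences between all-stop and non-stop targets.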
89 lines
2.6 KiB
C
/* This testcase is part of GDB, the GNU debugger.

   Copyright 2022 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3 of the License, or
   (at your option) any later version.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program. If not, see <http://www.gnu.org/licenses/>. */

#include <assert.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/wait.h>

static volatile int release_vfork = 0;
static volatile int release_main = 0;

static void *
vforker (void *arg)
{
  while (!release_vfork)
    usleep (1);

  pid_t pid = vfork ();
  if (pid == 0)
    {
      /* A vfork child is not supposed to mess with the state of the program,
         but it is helpful for the purpose of this test. */
      release_main = 1;
      _exit (7);
    }

  int stat;
  int ret = waitpid (pid, &stat, 0);
  assert (ret == pid);
  assert (WIFEXITED (stat));
  assert (WEXITSTATUS (stat) == 7);

  return NULL;
}

static void
should_break_here (void)
{}

int
main (void)
{
  pthread_t thread;
  int ret = pthread_create (&thread, NULL, vforker, NULL);
  assert (ret == 0);

  /* We break here first, while the thread is stuck on `!release_vfork`. */
  release_vfork = 1;

  /* We set a breakpoint on should_break_here.

     We then set "release_vfork" from the debugger and continue. The main
     thread hangs on `!release_main` while the non-main thread vforks. During
     the window of time where the two processes have a shared address space
     (after vfork, before _exit), GDB removes the breakpoints from the address
     space. During that window, only the vfork-ing thread (the non-main
     thread) is frozen by the kernel. The main thread is free to execute. The
     child process sets `release_main`, releasing the main thread. A buggy GDB
     would let the main thread execute during that window, leading to the
     breakpoint on should_break_here being missed. A fixed GDB does not resume
     the threads of the vforking process other than the vforking thread. When
     the vfork child exits, the fixed GDB resumes the main thread, after
     breakpoints are reinserted, so the breakpoint is not missed. */

  while (!release_main)
    usleep (1);

  should_break_here ();

  pthread_join (thread, NULL);

  return 6;
}
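The kernel behaviour this test relies on can also be observed in
isolation, outside GDB. The following is a minimal sketch, assuming a
Linux-like system where vfork suspends the calling thread until the child
exits and the child shares the parent's address space. The names
child_wrote and vfork_and_reap are hypothetical helpers made up for this
illustration; they are not part of the GDB testsuite.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the vfork child.  Writing to parent memory from a vfork child
   is not portable; it is done here, as in the test above, only to make
   the shared address space observable.  */
static volatile int child_wrote = 0;

/* Call vfork, have the child _exit with STATUS, and return the exit
   status seen by the parent, or -1 on error.  */
static int
vfork_and_reap (int status)
{
  assert (status >= 0 && status <= 255);
  child_wrote = 0;

  pid_t pid = vfork ();
  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* Child: runs while the calling thread of the parent is
         suspended by the kernel.  */
      child_wrote = 1;
      _exit (status);
    }

  /* Parent: vfork only returns here once the child has exited (or
     exec'ed), so the child can be reaped right away.  */
  int wstatus;
  if (waitpid (pid, &wstatus, 0) != pid || !WIFEXITED (wstatus))
    return -1;

  return WEXITSTATUS (wstatus);
}
```

Calling vfork_and_reap (7) from main is expected to return 7. On a
system where vfork really shares the address space, child_wrote should
then read as 1 in the parent, but that part is an assumption about the
platform rather than something the standard guarantees.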