binutils-gdb/gdb/infrun.h
Simon Marchi 1192f124a3 gdb: generalize commit_resume, avoid commit-resuming when threads have pending statuses
The rationale for this patch comes from the ROCm port [1], the goal
being to reduce the number of back and forths between GDB and the
target when doing successive operations.  I'll start with explaining
the rationale and then go over the implementation.  In the ROCm / GPU
world, the term "wave" is somewhat equivalent to a "thread" in GDB.
So if you read it from a GPU standpoint, just s/thread/wave/.

ROCdbgapi, the library used by GDB [2] to communicate with the GPU
target, gives the illusion that it's possible for the debugger to
control (start and stop) individual threads.  But in reality, this is
not how it works.  Under the hood, all threads of a queue are
controlled as a group.  To stop one thread in a group of running ones,
the state of all threads is retrieved from the GPU, all threads are
destroyed, and all threads but the one we want to stop are re-created
from the saved state.  The net result, from the point of view of GDB,
is that the library stopped one thread.  The same thing goes if we
want to resume one thread while others are running: the state of all
running threads is retrieved from the GPU, they are all destroyed, and
they are all re-created, including the thread we want to resume.

This leads to some inefficiencies when combined with how GDB works.
Here are two examples:

 - Stopping all threads: because the target operates in non-stop mode,
   when the user interface mode is all-stop, GDB must stop all threads
   individually when presenting a stop.  Let's suppose we have 1000
   threads and the user does ^C.  GDB asks the target to stop one
   thread.  Behind the scenes, the library retrieves 1000 thread
   states and restores the 999 other, still running ones.  GDB asks
   the target to stop another one.  The target retrieves 999 thread
   states and restores the 998 remaining ones.  That means that to
   stop 1000 threads, we did 1000 back and forths with the GPU.  It
   would have been much better to just retrieve the states once and
   stop there.

 - Resuming with pending events: suppose the 1000 threads hit a
   breakpoint at the same time.  The breakpoint is conditional and
   evaluates to true for the first thread, to false for all others.
   GDB pulls one event (for the first thread) from the target, decides
   that it should present a stop, so stops all threads using
   stop_all_threads.  All these other threads have a breakpoint event
   to report, which is saved in `thread_info::suspend::waitstatus` for
   later.  When the user does "continue", GDB resumes that one thread
   that did hit the breakpoint.  It then processes the pending events
   one by one as if they just arrived.  It picks one, evaluates the
   condition to false, and resumes the thread.  It picks another one,
   evaluates the condition to false, and resumes the thread.  And so
   on.  In between each resumption, there is a full state retrieval
   and re-creation.  It would be much nicer if we could wait a little
   bit before sending those threads to the GPU, until GDB has
   processed all those pending events.

To address this kind of performance issue, ROCdbgapi has a concept
called "forward progress required", which is a boolean state that
allows its user (i.e. GDB) to say "I'm doing a bunch of operations,
you can hold off putting the threads on the GPU until I'm done" (the
"forward progress not required" state).  Turning forward progress back
on indicates to the library that all threads that are supposed to be
running should now be really running on the GPU.

It turns out that GDB has a similar concept, though not as general,
commit_resume.  One difference is that commit_resume is not stateful:
the target can't look up "does the core need me to schedule resumed
threads for execution right now".  It is also specifically linked to
the resume method, it is not used in other contexts.  The target
accumulates resumption requests through target_ops::resume calls, and
then commits those resumptions when target_ops::commit_resume is
called.  The target has no way to check if it's ok to leave resumed
threads stopped in other target methods.

To bridge the gap, this patch generalizes the commit_resume concept in
GDB to match the forward progress concept of ROCdbgapi.  The current
name (commit_resume) can be interpreted as "commit the previous resume
calls".  I renamed the concept to "commit_resumed", as in "commit the
threads that are resumed".

In the new version, we have two things:

 - the commit_resumed_state field in process_stratum_target: indicates
   whether GDB requires target stacks using this target to have
   resumed threads committed to the execution target/device.  If
   false, an execution target is allowed to leave resumed threads
   un-committed at the end of whatever method it is executing.

 - the commit_resumed target method: called when commit_resumed_state
   transitions from false to true.  While commit_resumed_state was
   false, the target may have left some resumed threads un-committed.
   This method being called tells it that it should commit them back
   to the execution device.

Let's take the "Stopping all threads" scenario from above and see how
it would work with the ROCm target with this change.  Before stopping
all threads, GDB would set the target's commit_resumed_state field to
false.  It would then ask the target to stop the first thread.  The
target would retrieve all threads' state from the GPU and mark that
one as stopped.  Since commit_resumed_state is false, it leaves all
the other threads (still resumed) stopped.  GDB would then proceed to
call target_stop for all the other threads.  Since resumed threads are
not committed, this doesn't do any back and forth with the GPU.
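
As a rough illustration of the scenario above, here is a toy C++ model
of a target that honors such a flag.  It is only a sketch: toy_target,
its members and retrievals_to_stop_all are hypothetical names that
exist neither in GDB nor in ROCdbgapi, and the cost model (one state
retrieval each time the threads have to be pulled off the device) is
an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: none of these names exist in GDB or ROCdbgapi.
struct toy_target
{
  // Plays the role of process_stratum_target::commit_resumed_state.
  bool commit_resumed_state = true;

  // true = resumed from the core's point of view.
  std::vector<bool> resumed;

  // Whether the resumed threads are really on the device right now.
  bool on_device = true;

  // Number of full state retrievals from the device.
  std::size_t state_retrievals = 0;

  explicit toy_target (std::size_t n) : resumed (n, true) {}

  // Stop thread I.  Resumed threads are put back on the device right
  // away only if the core currently requires them to be committed.
  void stop (std::size_t i)
  {
    pull_from_device ();
    resumed[i] = false;
    if (commit_resumed_state)
      push_to_device ();
  }

  // Called when commit_resumed_state goes from false to true.
  void commit_resumed ()
  {
    push_to_device ();
  }

private:
  void pull_from_device ()
  {
    if (!on_device)
      return;
    ++state_retrievals;
    on_device = false;
  }

  void push_to_device ()
  {
    for (bool r : resumed)
      if (r)
        {
          on_device = true;
          return;
        }
  }
};

// Stop N threads one by one, with or without deferring the commit,
// and return how many state retrievals that took.
std::size_t
retrievals_to_stop_all (std::size_t n, bool deferred)
{
  toy_target t (n);
  t.commit_resumed_state = !deferred;
  for (std::size_t i = 0; i < n; ++i)
    t.stop (i);
  if (deferred)
    {
      t.commit_resumed_state = true;
      t.commit_resumed ();
    }
  return t.state_retrievals;
}
```

In this model, retrievals_to_stop_all (1000, false) performs 1000
state retrievals, while retrievals_to_stop_all (1000, true) performs
a single one.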

To simplify the implementation of targets, this patch makes it so that
when calling certain target methods, the contract between the core and
the targets guarantees that commit_resumed_state is false.  This way,
the target doesn't need two paths, one for commit_resumed_state ==
true and one for commit_resumed_state == false.  It can just assert
that commit_resumed_state is false and work with that assumption.
This also helps catch places where we forgot to disable
commit_resumed_state before calling the method, which represents a
probable optimization opportunity.  The commit adds assertions in the
target method wrappers (target_resume and friends) to have some
confidence that this contract between the core and the targets is
respected.

The scoped_disable_commit_resumed type is used to disable the commit
resumed state of all process targets on construction, and selectively
re-enable it on destruction (see below for criteria).  Note that it
only sets the process_stratum_target::commit_resumed_state flag.  A
subsequent call to maybe_call_commit_resumed_all_targets is necessary
to call the commit_resumed method on all target stacks with process
targets that got their commit_resumed_state flag turned back on.  This
separation is because we don't want to call the commit_resumed methods
in scoped_disable_commit_resumed's destructor, as they may throw.
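
The split between "flip the flag" and "call the method" can be
sketched with a toy guard.  This is not the GDB implementation:
toy_proc_target, toy_scoped_disable and the two helper functions are
hypothetical, and the guard tracks a single target instead of all
process targets.

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-in for a process target (hypothetical, not GDB's type).
struct toy_proc_target
{
  bool commit_resumed_state = true;
  bool has_resumed_threads = true;
  bool commit_fails = false;	// Simulate a target error.
  int commits = 0;

  void commit_resumed ()
  {
    if (commit_fails)
      throw std::runtime_error ("target error while committing");
    ++commits;
  }
};

class toy_scoped_disable
{
public:
  explicit toy_scoped_disable (toy_proc_target &t)
    : m_target (t)
  {
    m_target.commit_resumed_state = false;
  }

  // Only flips the flag back.  It never calls into the target, so it
  // has nothing to throw.
  ~toy_scoped_disable ()
  {
    reset ();
  }

  toy_scoped_disable (const toy_scoped_disable &) = delete;
  toy_scoped_disable &operator= (const toy_scoped_disable &) = delete;

  // Flip the flag back, then commit.  An exception thrown by the
  // target propagates to the caller.
  void reset_and_commit ()
  {
    reset ();
    if (m_target.commit_resumed_state)
      m_target.commit_resumed ();
  }

private:
  void reset ()
  {
    if (m_reset)
      return;
    m_reset = true;
    m_target.commit_resumed_state = m_target.has_resumed_threads;
  }

  toy_proc_target &m_target;
  bool m_reset = false;
};

// Return true if a commit error reaches the caller as an exception.
bool
commit_error_reaches_caller ()
{
  toy_proc_target t;
  t.commit_fails = true;
  try
    {
      toy_scoped_disable disable (t);
      disable.reset_and_commit ();
    }
  catch (const std::runtime_error &)
    {
      return true;
    }
  return false;
}

// Return the number of commits done when the guard simply goes out
// of scope without an explicit reset_and_commit call.
int
commits_when_guard_goes_out_of_scope ()
{
  toy_proc_target t;
  {
    toy_scoped_disable disable (t);
  }
  return t.commits;
}
```

In this sketch the destructor turns the flag back on but commits
nothing, which is why the explicit call is still needed.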

On destruction, commit-resumed is not re-enabled for a given target
if:

 1. this target has no threads resumed, or

 2. this target has at least one resumed thread with a pending status
    known to the core (saved in thread_info::suspend::waitstatus).

The first point is not technically necessary, because a proper
commit_resumed implementation would be a no-op if the target has no
resumed threads.  But since we have a flag to do a quick check, it
shouldn't hurt.
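
The two criteria can be written as a small predicate.  The sketch
below is an illustration only: toy_thread and
should_reenable_commit_resumed are hypothetical names, and
has_pending_status stands in for a status saved in
thread_info::suspend::waitstatus.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified view of a thread as seen by the core.
struct toy_thread
{
  bool resumed;
  bool has_pending_status;
};

// Return true if commit-resumed should be re-enabled for a target
// owning THREADS, following the two criteria listed above.
bool
should_reenable_commit_resumed (const std::vector<toy_thread> &threads)
{
  bool any_resumed = false;

  for (const toy_thread &t : threads)
    {
      if (!t.resumed)
	continue;

      // Criterion 2: a resumed thread has a pending status.
      if (t.has_pending_status)
	return false;

      any_resumed = true;
    }

  // Criterion 1: no resumed thread at all.
  return any_resumed;
}
```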

The second point is more important: together with the
scoped_disable_commit_resumed instance added in fetch_inferior_event,
it makes it so the "Resuming with pending events" scenario described
above is handled efficiently.  Here's what happens in that case:

 1. The user types "continue".

 2. Upon destruction, the scoped_disable_commit_resumed in the
    `proceed` function does not enable commit-resumed, as it sees some
    threads have pending statuses.

 3. fetch_inferior_event is called to handle another event, the
    breakpoint hit evaluates to false, and that thread is resumed.
    Because there are still more threads with pending statuses, the
    destructor of scoped_disable_commit_resumed in
    fetch_inferior_event still doesn't enable commit-resumed.

 4. Rinse and repeat step 3, until the last pending status is handled
    by fetch_inferior_event.  In that case,
    scoped_disable_commit_resumed's destructor sees there are no more
    threads with pending statuses, so it asks the target to commit
    resumed threads.

This allows us to avoid all unnecessary back and forths: there is a
single commit_resumed call once all pending statuses are processed.
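
The steps above can be modeled by counting commits.  This is a toy
count under simplifying assumptions, not GDB code:
toy_commits_for_continue is a hypothetical function, every pending
breakpoint condition is assumed to evaluate to false, and each guarded
scope ends with at most one commit.

```cpp
#include <cassert>
#include <cstddef>

// Return how many times the target would be asked to commit resumed
// threads while handling a "continue" with PENDING saved statuses.
// If SKIP_WHILE_PENDING is false, commit at the end of every scope
// instead.
std::size_t
toy_commits_for_continue (std::size_t pending, bool skip_while_pending)
{
  std::size_t commits = 0;

  // What happens at the end of a scope guarded by the disable object.
  auto end_of_scope = [&] ()
    {
      if (!skip_while_pending || pending == 0)
	++commits;
    };

  // Step 2: proceed resumes the thread that reported the stop.
  end_of_scope ();

  // Steps 3 and 4: one fetch_inferior_event call per pending status.
  while (pending > 0)
    {
      --pending;	// Condition is false, the thread is resumed.
      end_of_scope ();
    }

  return commits;
}
```

With 999 pending statuses, this model commits once when commits are
skipped while statuses are pending, and 1000 times otherwise.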

This change required remote_target::remote_stop_ns to learn how to
handle stopping threads that were resumed but pending vCont.  The
simplest example where that happens is when using the remote target in
all-stop, but with "maint set target-non-stop on", to force it to
operate in non-stop mode under the hood.  If two threads hit a
breakpoint at the same time, GDB will receive two stop replies.  It
will present the stop for one thread and save the other one in
thread_info::suspend::waitstatus.

Before this patch, when doing "continue", GDB first resumes the thread
without a pending status:

    Sending packet: $vCont;c:p172651.172676#f3

It then consumes the pending status in the next fetch_inferior_event
call:

    [infrun] do_target_wait_1: Using pending wait status status->kind = stopped, signal = GDB_SIGNAL_TRAP for Thread 1517137.1517137.
    [infrun] target_wait (-1.0.0, status) =
    [infrun]   1517137.1517137.0 [Thread 1517137.1517137],
    [infrun]   status->kind = stopped, signal = GDB_SIGNAL_TRAP

It then realizes it needs to stop all threads to present the stop, so
stops the thread it just resumed:

    [infrun] stop_all_threads:   Thread 1517137.1517137 not executing
    [infrun] stop_all_threads:   Thread 1517137.1517174 executing, need stop
    remote_stop called
    Sending packet: $vCont;t:p172651.172676#04

This is an unnecessary resume/stop.  With this patch, we don't commit
resumed threads after proceeding, because of the pending status:

    [infrun] maybe_commit_resumed_all_process_targets: not requesting commit-resumed for target extended-remote, a thread has a pending waitstatus

When GDB handles the pending status and stop_all_threads runs, we stop a
resumed but pending vCont thread:

    remote_stop_ns: Enqueueing phony stop reply for thread pending vCont-resume (1520940, 1520976, 0)

That thread was never actually resumed on the remote stub / gdbserver,
so we shouldn't send a packet to the remote side asking to stop the
thread.
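
The decision taken when stopping a thread can be sketched as follows.
This is a toy model, not remote.c: toy_remote, toy_resume_state and
packets_for_stop are hypothetical, threads are identified by a plain
index, and the packet string is only a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-thread resume state.
enum class toy_resume_state
{
  not_resumed,
  resumed_pending_vcont,	// Resumed by the core, nothing sent yet.
  resumed			// A resume request was sent to the stub.
};

struct toy_remote
{
  std::vector<toy_resume_state> threads;
  std::vector<std::string> packets_sent;
  std::vector<std::size_t> phony_stop_replies;

  // Stop thread I, talking to the remote side only if needed.
  void stop (std::size_t i)
  {
    switch (threads[i])
      {
      case toy_resume_state::resumed_pending_vcont:
	// The stub never saw this thread run: report the stop
	// locally instead of sending a packet.
	phony_stop_replies.push_back (i);
	break;

      case toy_resume_state::resumed:
	// Placeholder packet text, not a real thread id.
	packets_sent.push_back ("vCont;t:" + std::to_string (i));
	break;

      case toy_resume_state::not_resumed:
	break;
      }
  }
};

// Return how many packets are sent to stop a single thread in STATE.
std::size_t
packets_for_stop (toy_resume_state state)
{
  toy_remote r;
  r.threads.push_back (state);
  r.stop (0);
  return r.packets_sent.size ();
}
```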

Note that there are paths that resume the target and then do a
synchronous blocking wait, in a sort of nested event loop, via
wait_sync_command_done.  For example, inferior function calls, or any
run control command issued from a breakpoint command list.  We handle
that by making wait_sync_command_done a "sync" point -- force forward
progress, or IOW, force-enable commit-resumed state.

gdb/ChangeLog:
yyyy-mm-dd  Simon Marchi  <simon.marchi@efficios.com>
	    Pedro Alves  <pedro@palves.net>

	* infcmd.c (run_command_1, attach_command, detach_command)
	(interrupt_target_1): Use scoped_disable_commit_resumed.
	* infrun.c (do_target_resume): Remove
	target_commit_resume call.
	(commit_resume_all_targets): Remove.
	(maybe_set_commit_resumed_all_targets): New.
	(maybe_call_commit_resumed_all_targets): New.
	(enable_commit_resumed): New.
	(scoped_disable_commit_resumed::scoped_disable_commit_resumed)
	(scoped_disable_commit_resumed::~scoped_disable_commit_resumed)
	(scoped_disable_commit_resumed::reset)
	(scoped_disable_commit_resumed::reset_and_commit)
	(scoped_enable_commit_resumed::scoped_enable_commit_resumed)
	(scoped_enable_commit_resumed::~scoped_enable_commit_resumed):
	New.
	(proceed): Use scoped_disable_commit_resumed and
	maybe_call_commit_resumed_all_targets.
	(fetch_inferior_event): Use scoped_disable_commit_resumed.
	* infrun.h (struct scoped_disable_commit_resumed): New.
	(maybe_call_commit_resumed_all_targets): New.
	(struct scoped_enable_commit_resumed): New.
	* mi/mi-main.c (exec_continue): Use scoped_disable_commit_resumed.
	* process-stratum-target.h (class process_stratum_target):
	<commit_resumed_state>: New.
	* record-full.c (record_full_wait_1): Change commit_resumed_state
	around calling commit_resumed.
	* remote.c (class remote_target) <commit_resume>: Rename to...
	<commit_resumed>: ... this.
	(struct stop_reply): Move up.
	(remote_target::commit_resume): Rename to...
	(remote_target::commit_resumed): ... this.  Check if there is any
	thread pending vCont resume.
	(remote_target::remote_stop_ns): Generate stop replies for resumed
	but pending vCont threads.
	(remote_target::wait_ns): Add gdb_assert.
	* target-delegates.c: Regenerate.
	* target.c (target_wait, target_resume): Assert that the current
	process_stratum target isn't in commit-resumed state.
	(defer_target_commit_resume): Remove.
	(target_commit_resume): Remove.
	(target_commit_resumed): New.
	(make_scoped_defer_target_commit_resume): Remove.
	(target_stop): Assert that the current process_stratum target
	isn't in commit-resumed state.
	* target.h (struct target_ops) <commit_resume>: Rename to ...
	 <commit_resumed>: ... this.
	(target_commit_resume): Remove.
	(target_commit_resumed): New.
	(make_scoped_defer_target_commit_resume): Remove.
	* top.c (wait_sync_command_done): Use
	scoped_enable_commit_resumed.

[1] https://github.com/ROCm-Developer-Tools/ROCgdb/
[2] https://github.com/ROCm-Developer-Tools/ROCdbgapi

Change-Id: I836135531a29214b21695736deb0a81acf8cf566
2021-03-26 15:58:47 +00:00

/* Copyright (C) 1986-2021 Free Software Foundation, Inc.
This file is part of GDB.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
#ifndef INFRUN_H
#define INFRUN_H 1
#include "symtab.h"
#include "gdbsupport/byte-vector.h"
struct target_waitstatus;
struct frame_info;
struct address_space;
struct return_value_info;
struct process_stratum_target;
struct thread_info;
/* True if we are debugging run control. */
extern bool debug_infrun;
/* Print an "infrun" debug statement. */
#define infrun_debug_printf(fmt, ...) \
debug_prefixed_printf_cond (debug_infrun, "infrun", fmt, ##__VA_ARGS__)
/* Print "infrun" start/end debug statements. */
#define INFRUN_SCOPED_DEBUG_START_END(msg) \
scoped_debug_start_end (debug_infrun, "infrun", msg)
/* Print "infrun" enter/exit debug statements. */
#define INFRUN_SCOPED_DEBUG_ENTER_EXIT \
scoped_debug_enter_exit (debug_infrun, "infrun")
/* Nonzero if we want to give control to the user when we're notified
of shared library events by the dynamic linker. */
extern int stop_on_solib_events;
/* True if execution commands resume all threads of all processes by
default; otherwise, resume only threads of the current inferior
process. */
extern bool sched_multi;
/* When set, stop the 'step' command if we enter a function which has
no line number information. The normal behavior is that we step
over such function. */
extern bool step_stop_if_no_debug;
/* If set, the inferior should be controlled in non-stop mode. In
this mode, each thread is controlled independently. Execution
commands apply only to the selected thread by default, and stop
events stop only the thread that had the event -- the other threads
are kept running freely. */
extern bool non_stop;
/* When set (default), the target should attempt to disable the
operating system's address space randomization feature when
starting an inferior. */
extern bool disable_randomization;
/* Returns a unique identifier for the current stop. This can be used
to tell whether a command has proceeded the inferior past the
current location. */
extern ULONGEST get_stop_id (void);
/* Reverse execution. */
enum exec_direction_kind
{
EXEC_FORWARD,
EXEC_REVERSE
};
/* The current execution direction. */
extern enum exec_direction_kind execution_direction;
extern void start_remote (int from_tty);
/* Clear out all variables saying what to do when inferior is
continued or stepped. First do this, then set the ones you want,
then call `proceed'. STEP indicates whether we're preparing for a
step/stepi command. */
extern void clear_proceed_status (int step);
extern void proceed (CORE_ADDR, enum gdb_signal);
/* Return a ptid representing the set of threads that we will proceed,
in the perspective of the user/frontend. We may actually resume
fewer threads at first, e.g., if a thread is stopped at a
breakpoint that needs stepping-off, but that should not be visible
to the user/frontend, and neither should the frontend/user be
allowed to proceed any of the threads that happen to be stopped for
internal run control handling, if a previous command wanted them
resumed. */
extern ptid_t user_visible_resume_ptid (int step);
/* Return the process_stratum target that we will proceed, in the
perspective of the user/frontend. If RESUME_PTID is
MINUS_ONE_PTID, then we'll resume all threads of all targets, so
the function returns NULL. Otherwise, we'll be resuming a process
or thread of the current process, so we return the current
inferior's process stratum target. */
extern process_stratum_target *user_visible_resume_target (ptid_t resume_ptid);
/* Return control to GDB when the inferior stops for real. Print
appropriate messages, remove breakpoints, give terminal our modes,
and run the stop hook. Returns true if the stop hook proceeded the
target, false otherwise. */
extern int normal_stop (void);
/* Return the cached copy of the last target/ptid/waitstatus returned
by target_wait()/deprecated_target_wait_hook(). The data is
actually cached by handle_inferior_event(), which gets called
immediately after target_wait()/deprecated_target_wait_hook(). */
extern void get_last_target_status (process_stratum_target **target,
ptid_t *ptid,
struct target_waitstatus *status);
/* Set the cached copy of the last target/ptid/waitstatus. */
extern void set_last_target_status (process_stratum_target *target, ptid_t ptid,
struct target_waitstatus status);
/* Clear the cached copy of the last ptid/waitstatus returned by
target_wait(). */
extern void nullify_last_target_wait_ptid ();
/* Stop all threads. Only returns after everything is halted. */
extern void stop_all_threads (void);
extern void prepare_for_detach (void);
extern void fetch_inferior_event ();
extern void init_wait_for_inferior (void);
extern void insert_step_resume_breakpoint_at_sal (struct gdbarch *,
struct symtab_and_line,
struct frame_id);
/* Returns true if we're trying to step past the instruction at
ADDRESS in ASPACE. */
extern int stepping_past_instruction_at (struct address_space *aspace,
CORE_ADDR address);
/* Returns true if thread whose thread number is THREAD is stepping
over a breakpoint. */
extern int thread_is_stepping_over_breakpoint (int thread);
/* Returns true if we're trying to step past an instruction that
triggers a non-steppable watchpoint. */
extern int stepping_past_nonsteppable_watchpoint (void);
/* Record in TP the frame and location we're currently stepping through. */
extern void set_step_info (thread_info *tp,
struct frame_info *frame,
struct symtab_and_line sal);
/* Several print_*_reason helper functions to print why the inferior
has stopped to the passed in UIOUT. */
/* Signal received, print why the inferior has stopped. */
extern void print_signal_received_reason (struct ui_out *uiout,
enum gdb_signal siggnal);
/* Print why the inferior has stopped. We are done with a
step/next/si/ni command, print why the inferior has stopped. */
extern void print_end_stepping_range_reason (struct ui_out *uiout);
/* The inferior was terminated by a signal, print why it stopped. */
extern void print_signal_exited_reason (struct ui_out *uiout,
enum gdb_signal siggnal);
/* The inferior program is finished, print why it stopped. */
extern void print_exited_reason (struct ui_out *uiout, int exitstatus);
/* Reverse execution: target ran out of history info, print why the
inferior has stopped. */
extern void print_no_history_reason (struct ui_out *uiout);
/* Print the result of a function at the end of a 'finish' command.
RV points at an object representing the captured return value/type
and its position in the value history. */
extern void print_return_value (struct ui_out *uiout,
struct return_value_info *rv);
/* Print current location without a level number, if we have changed
functions or hit a breakpoint. Print source line if we have one.
If the execution command captured a return value, print it. If
DISPLAYS is false, do not call 'do_displays'. */
extern void print_stop_event (struct ui_out *uiout, bool displays = true);
/* Pretty print the results of target_wait, for debugging purposes. */
extern void print_target_wait_results (ptid_t waiton_ptid, ptid_t result_ptid,
const struct target_waitstatus *ws);
extern int signal_stop_state (int);
extern int signal_print_state (int);
extern int signal_pass_state (int);
extern int signal_stop_update (int, int);
extern int signal_print_update (int, int);
extern int signal_pass_update (int, int);
extern void update_signals_program_target (void);
/* Clear the convenience variables associated with the exit of the
inferior. Currently, those variables are $_exitcode and
$_exitsignal. */
extern void clear_exit_convenience_vars (void);
/* Dump LEN bytes at BUF in hex to a string and return it. */
extern std::string displaced_step_dump_bytes (const gdb_byte *buf, size_t len);
extern void update_observer_mode (void);
extern void signal_catch_update (const unsigned int *);
/* In some circumstances we allow a command to specify a numeric
signal. The idea is to keep these circumstances limited so that
users (and scripts) develop portable habits. For comparison,
POSIX.2 `kill' requires that 1,2,3,6,9,14, and 15 work (and using a
numeric signal at all is obsolescent). We are slightly more lenient
and allow 1-15 which should match host signal numbers on most
systems. Use of symbolic signal names is strongly encouraged. */
enum gdb_signal gdb_signal_from_command (int num);
/* Enables/disables infrun's async event source in the event loop. */
extern void infrun_async (int enable);
/* Call infrun's event handler the next time through the event
loop. */
extern void mark_infrun_async_event_handler (void);
/* The global chain of threads that need to do a step-over operation
to get past e.g., a breakpoint. */
extern struct thread_info *global_thread_step_over_chain_head;
/* Remove breakpoints if possible (usually that means, if everything
is stopped). On failure, print a message. */
extern void maybe_remove_breakpoints (void);
/* If a UI was in sync execution mode, and now isn't, restore its
prompt (a synchronous execution command has finished, and we're
ready for input). */
extern void all_uis_check_sync_execution_done (void);
/* If a UI was in sync execution mode, and hasn't displayed the prompt
yet, re-disable its prompt (a synchronous execution command was
started or re-started). */
extern void all_uis_on_sync_execution_starting (void);
/* In all-stop, restart the target if it had to be stopped to
detach. */
extern void restart_after_all_stop_detach (process_stratum_target *proc_target);
/* RAII object to temporarily disable the requirement for target
   stacks to commit their resumed threads.

   On construction, set process_stratum_target::commit_resumed_state
   to false for all process_stratum targets in all target
   stacks.

   On destruction (or if reset_and_commit() is called), set
   process_stratum_target::commit_resumed_state to true for all
   process_stratum targets in all target stacks, except those that:

     - have no resumed threads
     - have a resumed thread with a pending status

   target_commit_resumed is not called in the destructor, because its
   implementations could throw, and we don't want to swallow that
   error in a destructor. Instead, the caller should call the
   reset_and_commit() method so that an eventual exception can
   propagate. "reset" in the method name refers to the fact that this
   method has the same effect as the destructor, in addition to
   committing resumes.

   The creation of nested scoped_disable_commit_resumed objects is
   tracked, such that only the outermost instance actually does
   something, for cases like this:

     void
     inner_func ()
     {
       scoped_disable_commit_resumed disable ("inner_func");

       // do stuff

       disable.reset_and_commit ();
     }

     void
     outer_func ()
     {
       scoped_disable_commit_resumed disable ("outer_func");

       for (... each thread ...)
         inner_func ();

       disable.reset_and_commit ();
     }

   In this case, we don't want the `disable` destructor in
   `inner_func` to require targets to commit resumed threads, so that
   the `reset_and_commit()` call in `inner_func` doesn't actually
   resume threads. */
struct scoped_disable_commit_resumed
{
  explicit scoped_disable_commit_resumed (const char *reason);
  ~scoped_disable_commit_resumed ();

  DISABLE_COPY_AND_ASSIGN (scoped_disable_commit_resumed);

  /* Undoes the disabling done by the ctor, and calls
     maybe_call_commit_resumed_all_targets(). */
  void reset_and_commit ();

private:
  /* Undoes the disabling done by the ctor. */
  void reset ();

  /* Whether this object has been reset. */
  bool m_reset = false;

  const char *m_reason;
  bool m_prev_enable_commit_resumed;
};
/* Call target_commit_resumed method on all target stacks whose
process_stratum target layer has COMMIT_RESUMED_STATE set. */
extern void maybe_call_commit_resumed_all_targets ();
/* RAII object to temporarily enable the requirement for target stacks
   to commit their resumed threads. This is the inverse of
   scoped_disable_commit_resumed. The constructor calls the
   maybe_call_commit_resumed_all_targets function itself, since it's
   OK to throw from a constructor. */
struct scoped_enable_commit_resumed
{
  explicit scoped_enable_commit_resumed (const char *reason);
  ~scoped_enable_commit_resumed ();

  DISABLE_COPY_AND_ASSIGN (scoped_enable_commit_resumed);

private:
  const char *m_reason;
  bool m_prev_enable_commit_resumed;
};
#endif /* INFRUN_H */