binutils-gdb/gdb/displaced-stepping.c
Andrew Burgess cf141dd8cc gdb: fix reg corruption from displaced stepping on amd64
This commit aims to address a problem that exists with the current
approach to displaced stepping, and was identified in PR gdb/22921.

Displaced stepping is currently supported on AArch64, ARM, amd64,
i386, rs6000 (ppc), and s390.  Of these, I believe there is a problem
with the current approach which will impact amd64 and ARM, and can
lead to random register corruption when the inferior makes use of
asynchronous signals and GDB is using displaced stepping.

The problem can be found in displaced_step_buffers::finish in
displaced-stepping.c, and is this; after GDB tries to perform a
displaced step, and the inferior stops, GDB classifies the stop into
one of two states, either the displaced step succeeded, or the
displaced step failed.

If the displaced step succeeded then gdbarch_displaced_step_fixup is
called, which has the job of fixing up the state of the current
inferior as if the step had not been performed in a displaced manner.
This all seems just fine.

However, if the displaced step is considered to have not completed
then GDB doesn't call gdbarch_displaced_step_fixup, instead GDB
remains in displaced_step_buffers::finish and just performs a minimal
fixup which involves adjusting the program counter back to its
original value.

The problem here is that for amd64 and ARM setting up for a displaced
step can involve changing the values in some temporary registers.  If
the displaced step succeeds then this is fine; after the step the
temporary registers are restored to their original values in the
architecture specific code.

But if the displaced step does not succeed then the temporary
registers are never restored, and they retain their modified values.

In this context a temporary register is simply any register that is
not otherwise used by the instruction being stepped that the
architecture specific code considers safe to borrow for the lifetime
of the instruction being stepped.

In the bug PR gdb/22921, the amd64 instruction being stepped is
an rip-relative instruction like this:

  jmp    *0x2fe2(%rip)

When we displaced step this instruction we borrow a register, and
modify the instruction to something like:

  jmp    *0x2fe2(%rcx)

with %rcx having its value adjusted to contain the original %rip
value.

Now if the displaced step does not succeed, then %rcx will be left
with a corrupted value.  Obviously corrupting any register is bad; in
the bug report this problem was spotted because %rcx is used as a
function argument register.

And finally, why might a displaced step not succeed?  Asynchronous
signals provides one reason.  GDB sets up for the displaced step and,
at that precise moment, the OS delivers a signal (SIGALRM in the bug
report), the signal stops the inferior at the address of the displaced
instruction.  GDB cancels the displaced instruction, handles the
signal, and then tries again with the displaced step.  But it is that
first cancellation of the displaced step that causes the problem; in
that case GDB (correctly) sees the displaced step as having not
completed, and so does not perform the architecture specific fixup,
leaving the register corrupted.

The reason why I think AArch64, rs600, i386, and s390 are not effected
by this problem is that I don't believe these architectures make use
of any temporary registers, so when a displaced step is not completed
successfully, the minimal fix up is sufficient.

On amd64 we use at most one temporary register.

On ARM, looking at arm_displaced_step_copy_insn_closure, we could
modify up to 16 temporary registers, and the instruction being
displaced stepped could be expanded to multiple replacement
instructions, which increases the chances of this bug triggering.

This commit only aims to address the issue on amd64 for now, though I
believe that the approach I'm proposing here might be applicable for
ARM too.

What I propose is that we always call gdbarch_displaced_step_fixup.

We will now pass an extra argument to gdbarch_displaced_step_fixup,
this a boolean that indicates whether GDB thinks the displaced step
completed successfully or not.

When this flag is false this indicates that the displaced step halted
for some "other" reason.  On ARM GDB can potentially read the
inferior's program counter in order figure out how far through the
sequence of replacement instructions we got, and from that GDB can
figure out what fixup needs to be performed.

On targets like amd64 the problem is slightly easier as displaced
stepping only uses a single replacement instruction.  If the displaced
step didn't complete the GDB knows that the single instruction didn't
execute.

The point is that by always calling gdbarch_displaced_step_fixup, each
architecture can now ensure that the inferior state is fixed up
correctly in all cases, not just the success case.

On amd64 this ensures that we always restore the temporary register
value, and so bug PR gdb/22921 is resolved.

In order to move all architectures to this new API, I have moved the
minimal roll-back version of the code inside the architecture specific
fixup functions for AArch64, rs600, s390, and ARM.  For all of these
except ARM I think this is good enough, as no temporaries are used all
that's needed is the program counter restore anyway.

For ARM the minimal code is no worse than what we had before, though I
do consider this architecture's displaced-stepping broken.

I've updated the gdb.arch/amd64-disp-step.exp test to cover the
'jmpq*' instruction that was causing problems in the original bug, and
also added support for testing the displaced step in the presence of
asynchronous signal delivery.

I've also added two new tests (for amd64 and i386) that check that GDB
can correctly handle displaced stepping over a single instruction that
branches to itself.  I added these tests after a first version of this
patch relied too much on checking the program-counter value in order
to see if the displaced instruction had executed.  This works fine in
almost all cases, but when an instruction branches to itself a pure
program counter check is not sufficient.  The new tests expose this
problem.

Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=22921

Approved-By: Pedro Alves <pedro@palves.net>
2023-04-06 14:22:10 +01:00

324 lines
10 KiB
C

/* Displaced stepping related things.
Copyright (C) 2020-2023 Free Software Foundation, Inc.
This file is part of GDB.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
#include "defs.h"
#include "displaced-stepping.h"
#include "cli/cli-cmds.h"
#include "command.h"
#include "gdbarch.h"
#include "gdbcore.h"
#include "gdbthread.h"
#include "inferior.h"
#include "regcache.h"
#include "target/target.h"
/* Default destructor for displaced_step_copy_insn_closure. */
displaced_step_copy_insn_closure::~displaced_step_copy_insn_closure ()
= default;
bool debug_displaced = false;
static void
show_debug_displaced (struct ui_file *file, int from_tty,
struct cmd_list_element *c, const char *value)
{
gdb_printf (file, _("Displace stepping debugging is %s.\n"), value);
}
displaced_step_prepare_status
displaced_step_buffers::prepare (thread_info *thread, CORE_ADDR &displaced_pc)
{
gdb_assert (!thread->displaced_step_state.in_progress ());
/* Sanity check: the thread should not be using a buffer at this point. */
for (displaced_step_buffer &buf : m_buffers)
gdb_assert (buf.current_thread != thread);
regcache *regcache = get_thread_regcache (thread);
const address_space *aspace = regcache->aspace ();
gdbarch *arch = regcache->arch ();
ULONGEST len = gdbarch_displaced_step_buffer_length (arch);
/* Search for an unused buffer. */
displaced_step_buffer *buffer = nullptr;
displaced_step_prepare_status fail_status
= DISPLACED_STEP_PREPARE_STATUS_CANT;
for (displaced_step_buffer &candidate : m_buffers)
{
bool bp_in_range = breakpoint_in_range_p (aspace, candidate.addr, len);
bool is_free = candidate.current_thread == nullptr;
if (!bp_in_range)
{
if (is_free)
{
buffer = &candidate;
break;
}
else
{
/* This buffer would be suitable, but it's used right now. */
fail_status = DISPLACED_STEP_PREPARE_STATUS_UNAVAILABLE;
}
}
else
{
/* There's a breakpoint set in the scratch pad location range
(which is usually around the entry point). We'd either
install it before resuming, which would overwrite/corrupt the
scratch pad, or if it was already inserted, this displaced
step would overwrite it. The latter is OK in the sense that
we already assume that no thread is going to execute the code
in the scratch pad range (after initial startup) anyway, but
the former is unacceptable. Simply punt and fallback to
stepping over this breakpoint in-line. */
displaced_debug_printf ("breakpoint set in displaced stepping "
"buffer at %s, can't use.",
paddress (arch, candidate.addr));
}
}
if (buffer == nullptr)
return fail_status;
displaced_debug_printf ("selected buffer at %s",
paddress (arch, buffer->addr));
/* Save the original PC of the thread. */
buffer->original_pc = regcache_read_pc (regcache);
/* Return displaced step buffer address to caller. */
displaced_pc = buffer->addr;
/* Save the original contents of the displaced stepping buffer. */
buffer->saved_copy.resize (len);
int status = target_read_memory (buffer->addr,
buffer->saved_copy.data (), len);
if (status != 0)
throw_error (MEMORY_ERROR,
_("Error accessing memory address %s (%s) for "
"displaced-stepping scratch space."),
paddress (arch, buffer->addr), safe_strerror (status));
displaced_debug_printf ("saved %s: %s",
paddress (arch, buffer->addr),
bytes_to_string (buffer->saved_copy).c_str ());
/* Save this in a local variable first, so it's released if code below
throws. */
displaced_step_copy_insn_closure_up copy_insn_closure
= gdbarch_displaced_step_copy_insn (arch, buffer->original_pc,
buffer->addr, regcache);
if (copy_insn_closure == nullptr)
{
/* The architecture doesn't know how or want to displaced step
this instruction or instruction sequence. Fallback to
stepping over the breakpoint in-line. */
return DISPLACED_STEP_PREPARE_STATUS_CANT;
}
/* This marks the buffer as being in use. */
buffer->current_thread = thread;
/* Save this, now that we know everything went fine. */
buffer->copy_insn_closure = std::move (copy_insn_closure);
/* Reset the displaced step buffer state if we failed to write PC.
Otherwise we will prevent this buffer from being used, as it will
always have a thread in buffer->current_thread. */
auto reset_buffer = make_scope_exit
([buffer] ()
{
buffer->current_thread = nullptr;
buffer->copy_insn_closure.reset ();
});
/* Adjust the PC so it points to the displaced step buffer address that will
be used. This needs to be done after we save the copy_insn_closure, as
some architectures (Arm, for one) need that information so they can adjust
other data as needed. In particular, Arm needs to know if the instruction
being executed in the displaced step buffer is thumb or not. Without that
information, things will be very wrong in a random way. */
regcache_write_pc (regcache, buffer->addr);
/* PC update successful. Discard the displaced step state rollback. */
reset_buffer.release ();
/* Tell infrun not to try preparing a displaced step again for this inferior if
all buffers are taken. */
thread->inf->displaced_step_state.unavailable = true;
for (const displaced_step_buffer &buf : m_buffers)
{
if (buf.current_thread == nullptr)
{
thread->inf->displaced_step_state.unavailable = false;
break;
}
}
return DISPLACED_STEP_PREPARE_STATUS_OK;
}
static void
write_memory_ptid (ptid_t ptid, CORE_ADDR memaddr,
const gdb_byte *myaddr, int len)
{
scoped_restore save_inferior_ptid = make_scoped_restore (&inferior_ptid);
inferior_ptid = ptid;
write_memory (memaddr, myaddr, len);
}
static bool
displaced_step_instruction_executed_successfully
(gdbarch *arch, const target_waitstatus &status)
{
if (status.kind () == TARGET_WAITKIND_STOPPED
&& status.sig () != GDB_SIGNAL_TRAP)
return false;
/* All other (thread event) waitkinds can only happen if the
instruction fully executed. For example, a fork, or a syscall
entry can only happen if the syscall instruction actually
executed. */
if (target_stopped_by_watchpoint ())
{
if (gdbarch_have_nonsteppable_watchpoint (arch)
|| target_have_steppable_watchpoint ())
return false;
}
return true;
}
displaced_step_finish_status
displaced_step_buffers::finish (gdbarch *arch, thread_info *thread,
const target_waitstatus &status)
{
gdb_assert (thread->displaced_step_state.in_progress ());
/* Find the buffer this thread was using. */
displaced_step_buffer *buffer = nullptr;
for (displaced_step_buffer &candidate : m_buffers)
{
if (candidate.current_thread == thread)
{
buffer = &candidate;
break;
}
}
gdb_assert (buffer != nullptr);
/* Move this to a local variable so it's released in case something goes
wrong. */
displaced_step_copy_insn_closure_up copy_insn_closure
= std::move (buffer->copy_insn_closure);
gdb_assert (copy_insn_closure != nullptr);
/* Reset BUFFER->CURRENT_THREAD immediately to mark the buffer as available,
in case something goes wrong below. */
buffer->current_thread = nullptr;
/* Now that a buffer gets freed, tell infrun it can ask us to prepare a displaced
step again for this inferior. Do that here in case something goes wrong
below. */
thread->inf->displaced_step_state.unavailable = false;
ULONGEST len = gdbarch_displaced_step_buffer_length (arch);
/* Restore memory of the buffer. */
write_memory_ptid (thread->ptid, buffer->addr,
buffer->saved_copy.data (), len);
displaced_debug_printf ("restored %s %s",
thread->ptid.to_string ().c_str (),
paddress (arch, buffer->addr));
regcache *rc = get_thread_regcache (thread);
bool instruction_executed_successfully
= displaced_step_instruction_executed_successfully (arch, status);
gdbarch_displaced_step_fixup (arch, copy_insn_closure.get (),
buffer->original_pc, buffer->addr,
rc, instruction_executed_successfully);
return (instruction_executed_successfully
? DISPLACED_STEP_FINISH_STATUS_OK
: DISPLACED_STEP_FINISH_STATUS_NOT_EXECUTED);
}
const displaced_step_copy_insn_closure *
displaced_step_buffers::copy_insn_closure_by_addr (CORE_ADDR addr)
{
for (const displaced_step_buffer &buffer : m_buffers)
{
if (addr == buffer.addr)
{
/* The closure information should always be available. */
gdb_assert (buffer.copy_insn_closure.get () != nullptr);
return buffer.copy_insn_closure.get ();
}
}
return nullptr;
}
void
displaced_step_buffers::restore_in_ptid (ptid_t ptid)
{
for (const displaced_step_buffer &buffer : m_buffers)
{
if (buffer.current_thread == nullptr)
continue;
regcache *regcache = get_thread_regcache (buffer.current_thread);
gdbarch *arch = regcache->arch ();
ULONGEST len = gdbarch_displaced_step_buffer_length (arch);
write_memory_ptid (ptid, buffer.addr, buffer.saved_copy.data (), len);
displaced_debug_printf ("restored in ptid %s %s",
ptid.to_string ().c_str (),
paddress (arch, buffer.addr));
}
}
void _initialize_displaced_stepping ();
void
_initialize_displaced_stepping ()
{
add_setshow_boolean_cmd ("displaced", class_maintenance,
&debug_displaced, _("\
Set displaced stepping debugging."), _("\
Show displaced stepping debugging."), _("\
When non-zero, displaced stepping specific debugging is enabled."),
NULL,
show_debug_displaced,
&setdebuglist, &showdebuglist);
}