gdb/python: implement the print_insn extension language hook

This commit extends the Python API to include disassembler support.

The motivation for this commit was to provide an API by which the user
could write Python scripts that would augment the output of the
disassembler.

To achieve this I have followed the model of the existing libopcodes
disassembler, that is, instructions are disassembled one by one.  This
does restrict the type of things that it is possible to do from a
Python script, i.e. all additional output has to fit on a single line,
but this was all I needed, and creating something more complex would,
I think, require greater changes to how GDB's internal disassembler
operates.

The disassembler API is contained in the new gdb.disassembler module,
which defines the following classes:

  DisassembleInfo

      Similar to the libopcodes disassemble_info structure.  Has
  read-only properties: address, architecture, and progspace, and
  methods: __init__, read_memory, and is_valid.

      Each time GDB wants an instruction disassembled, an instance of
  this class is passed to a user written disassembler function.  By
  reading the properties, and calling the methods (and other support
  methods in the gdb.disassembler module), the user can perform and
  return the disassembly.

  Disassembler

      This is a base-class which user written disassemblers should
  inherit from.  This base class provides base implementations of
  __init__ and __call__ which the user written disassembler should
  override.

  DisassemblerResult

      This class can be used to hold the result of a call to the
  disassembler; it's really just a wrapper around a string (the text
  of the disassembled instruction) and a length (in bytes).  The user
  can return an instance of this class from Disassembler.__call__ to
  represent the newly disassembled instruction.
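
As a rough illustration of how these three classes fit together, the
sketch below uses plain-Python stand-ins rather than the real
gdb.disassembler classes, so it can be run outside GDB.  The names
FakeInfo and CommentingDisassembler, the fixed "nop" result, and the
ValueError checks are assumptions made for this example; only the
overall shape (a callable that takes an info object and returns either
a length/string pair or None) follows the description above.

```python
# Stand-ins for illustration only; inside GDB the real classes live in
# the gdb.disassembler module.


class DisassemblerResult:
    """Wrapper around an instruction length (in bytes) and its text."""

    def __init__(self, length, string):
        # The documentation requires a positive length and a non-empty
        # string; raising ValueError here is an assumption of this sketch.
        if length <= 0:
            raise ValueError("length must be greater than zero")
        if not string:
            raise ValueError("string must be non-empty")
        self.length = length
        self.string = string


class Disassembler:
    """Base class; sub-classes are expected to override __call__."""

    def __init__(self, name):
        self.name = name

    def __call__(self, info):
        raise NotImplementedError("Disassembler.__call__")


class FakeInfo:
    """Hypothetical stand-in for DisassembleInfo, backed by a bytes object."""

    def __init__(self, address, memory):
        self.address = address
        self._memory = memory

    def read_memory(self, length, offset):
        # OFFSET is relative to self.address.
        start = self.address + offset
        return self._memory[start:start + length]


class CommentingDisassembler(Disassembler):
    def __init__(self):
        super().__init__("CommentingDisassembler")

    def __call__(self, info):
        # Decline (return None) for anything except a single 0x90 byte,
        # leaving the caller to fall back to its default behaviour.
        if info.read_memory(1, 0) != b"\x90":
            return None
        return DisassemblerResult(1, "nop\t## Comment")
```

Returning None rather than raising is what lets a disassembler handle
only the instructions it cares about.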

The gdb.disassembler module also provides the following functions:

  register_disassembler

      This function registers an instance of a Disassembler sub-class
  as a disassembler, either for one specific architecture, or, as a
  global disassembler for all architectures.

  builtin_disassemble

      This provides access to GDB's builtin disassembler.  A common
  use case that I see is augmenting the existing disassembler output.
  The user code can call this function to have GDB disassemble the
  instruction in the normal way.  The user gets back a
  DisassemblerResult object, which they can then read in order to
  augment the disassembler output in any way they wish.

      This function also provides a mechanism to intercept the
  disassembler's reads of memory, thus the user can adjust what GDB
  sees when it is disassembling.
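
The registration and lookup rules can be modelled in a few lines of
plain Python.  This is a sketch of the behaviour described in this
commit, not the GDB implementation itself: the names _registry,
register, and lookup are made up for the example, and the
"disassemblers" in the usage below are arbitrary objects.

```python
# A plain-Python model of the registry.  Keys are architecture names,
# or None for the global disassembler.
_registry = {}


def register(disassembler, architecture=None):
    """Install DISASSEMBLER for ARCHITECTURE (None means global).

    Returns whatever was previously registered for that key, or None.
    Passing None as DISASSEMBLER removes the existing entry.
    """
    old = _registry.pop(architecture, None)
    if disassembler is not None:
        _registry[architecture] = disassembler
    return old


def lookup(arch_name):
    """Prefer an architecture specific entry, then the global one."""
    if arch_name in _registry:
        return _registry[arch_name]
    return _registry.get(None)
```

Because the two kinds of entry are stored under different keys, the
order in which they are registered has no effect on which one a lookup
selects.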

The included documentation provides a more detailed description of the
API.

There is also a new CLI command added:

  maint info python-disassemblers

This command is defined in the Python gdb.disassembler module, and
can be used to list the currently registered Python disassemblers.
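
To give a feel for the command's output, here is a small stand-alone
sketch of the table layout.  It is an approximation written for this
description, not the code added by the commit: format_table is a
hypothetical helper, the input is a plain dictionary mapping
architecture names (or None for the global entry) to disassembler
names, and the header spelling and the column width (based on the
architecture names) are choices of the sketch.

```python
def format_table(registered, current_arch):
    """Return the table lines for REGISTERED.

    REGISTERED maps an architecture name, or None for the global
    disassembler, to a disassembler name.  CURRENT_ARCH is the name of
    the architecture currently in use.
    """
    tag = "\t(Matches current architecture)"
    arch_names = [a for a in registered if a is not None]
    width = max([len("Architecture")] + [len(a) for a in arch_names])
    fmt = "{:" + str(width) + "s} {:s}"

    lines = [fmt.format("Architecture", "Disassembler Name")]
    for arch in arch_names:
        name = registered[arch]
        if arch == current_arch:
            name += tag
            # An architecture specific match takes priority, so the
            # global entry must not be tagged as well.
            tag = ""
        lines.append(fmt.format(arch, name))

    # The global entry is always listed last.
    if None in registered:
        lines.append(fmt.format("GLOBAL", registered[None] + tag))
    return lines
```

The tag is cleared after the first match so that exactly one row, the
disassembler that would actually be used, is marked.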
Andrew Burgess 2021-09-17 18:12:34 +01:00 committed by Andrew Burgess
parent e4ae302562
commit 15e15b2d9c
12 changed files with 2648 additions and 1 deletions


@ -393,6 +393,7 @@ SUBDIR_PYTHON_SRCS = \
python/py-cmd.c \
python/py-connection.c \
python/py-continueevent.c \
python/py-disasm.c \
python/py-event.c \
python/py-evtregistry.c \
python/py-evts.c \


@ -63,6 +63,40 @@ maintenance info line-table
** New method gdb.Frame.language that returns the name of the
frame's language.
** New Python API for wrapping GDB's disassembler:
- gdb.disassembler.register_disassembler(DISASSEMBLER, ARCH).
DISASSEMBLER is an instance of a sub-class of gdb.disassembler.Disassembler.
ARCH is either None or a string containing a bfd architecture
name. DISASSEMBLER is registered as a disassembler for
architecture ARCH, or for all architectures if ARCH is None.
The previous disassembler registered for ARCH is returned, this
can be None if no previous disassembler was registered.
- gdb.disassembler.Disassembler is the class from which all
disassemblers should inherit. Its constructor takes a string,
a name for the disassembler, which is currently only used in
some debug output. Sub-classes should override the __call__
method to perform disassembly, invoking __call__ on this base
class will raise an exception.
- gdb.disassembler.DisassembleInfo is the class used to describe
a single disassembly request from GDB. An instance of this
class is passed to the __call__ method of
gdb.disassembler.Disassembler and has the following read-only
attributes: 'address', and 'architecture', as well as the
following method: 'read_memory'.
- gdb.disassembler.builtin_disassemble(INFO, MEMORY_SOURCE),
calls GDB's builtin disassembler on INFO, which is a
gdb.disassembler.DisassembleInfo object. MEMORY_SOURCE is
optional, its default value is None. If MEMORY_SOURCE is not
None then it must be an object that has a 'read_memory' method.
- gdb.disassembler.DisassemblerResult is a class that can be used
to wrap the result of a call to a Disassembler. It has
read-only attributes 'length' and 'string'.
*** Changes in GDB 12
* DBX mode is deprecated, and will be removed in GDB 13


@ -69,6 +69,7 @@ PYTHON_DIR = python
PYTHON_INSTALL_DIR = $(DESTDIR)$(GDB_DATADIR)/$(PYTHON_DIR)
PYTHON_FILE_LIST = \
gdb/__init__.py \
gdb/disassembler.py \
gdb/FrameDecorator.py \
gdb/FrameIterator.py \
gdb/frames.py \


@ -39680,6 +39680,51 @@ packet history.
@item maint info jit
Print information about JIT code objects loaded in the current inferior.
@anchor{maint info python-disassemblers}
@kindex maint info python-disassemblers
@item maint info python-disassemblers
This command is defined within the @code{gdb.disassembler} Python
module (@pxref{Disassembly In Python}), and will only be present after
that module has been imported. To force the module to be imported do
the following:
@smallexample
(@value{GDBP}) python import gdb.disassembler
@end smallexample
This command lists all the architectures for which a disassembler is
currently registered, and the name of the disassembler. If a
disassembler is registered for all architectures, then this is listed
last against the @samp{GLOBAL} architecture.
If one of the disassemblers would be selected for the architecture of
the current inferior, then this disassembler will be marked.
The following example shows a situation in which two disassemblers are
registered, initially the @samp{i386} disassembler matches the current
architecture, then the architecture is changed, now the @samp{GLOBAL}
disassembler matches.
@smallexample
@group
(@value{GDBP}) show architecture
The target architecture is set to "auto" (currently "i386").
(@value{GDBP}) maint info python-disassemblers
Architecture Disassember Name
i386 Disassembler_1 (Matches current architecture)
GLOBAL Disassembler_2
@end group
@group
(@value{GDBP}) set architecture arm
The target architecture is set to "arm".
(@value{GDBP}) maint info python-disassemblers
Architecture Disassember Name
i386 Disassembler_1
GLOBAL Disassembler_2 (Matches current architecture)
@end group
@end smallexample
@kindex set displaced-stepping
@kindex show displaced-stepping
@cindex displaced stepping support


@ -222,6 +222,7 @@ optional arguments while skipping others. Example:
* Registers In Python:: Python representation of registers.
* Connections In Python:: Python representation of connections.
* TUI Windows In Python:: Implementing new TUI windows.
* Disassembly In Python:: Instruction Disassembly In Python
@end menu
@node Basic Python
@ -599,6 +600,7 @@ such as those used by readline for command input, and annotation
related prompts are prohibited from being changed.
@end defun
@anchor{gdb_architecture_names}
@defun gdb.architecture_names ()
Return a list containing all of the architecture names that the
current build of @value{GDBN} supports. Each architecture name is a
@ -3287,6 +3289,7 @@ single address space, so this may not match the architecture of a
particular frame (@pxref{Frames In Python}).
@end defun
@anchor{gdbpy_inferior_read_memory}
@findex Inferior.read_memory
@defun Inferior.read_memory (address, length)
Read @var{length} addressable memory units from the inferior, starting at
@ -6575,6 +6578,331 @@ corner), and @var{button} specifies which mouse button was used, whose
values can be 1 (left), 2 (middle), or 3 (right).
@end defun
@node Disassembly In Python
@subsubsection Instruction Disassembly In Python
@cindex python instruction disassembly
@value{GDBN}'s builtin disassembler can be extended, or even replaced,
using the Python API. The disassembler related features are contained
within the @code{gdb.disassembler} module:
@deftp {class} gdb.disassembler.DisassembleInfo
Disassembly is driven by instances of this class. Each time
@value{GDBN} needs to disassemble an instruction, an instance of this
class is created and passed to a registered disassembler. The
disassembler is then responsible for disassembling an instruction and
returning a result.
Instances of this type are usually created within @value{GDBN},
however, it is possible to create a copy of an instance of this type,
see the description of @code{__init__} for more details.
This class has the following properties and methods:
@defvar DisassembleInfo.address
A read-only integer containing the address at which @value{GDBN}
wishes to disassemble a single instruction.
@end defvar
@defvar DisassembleInfo.architecture
The @code{gdb.Architecture} (@pxref{Architectures In Python}) for
which @value{GDBN} is currently disassembling, this property is
read-only.
@end defvar
@defvar DisassembleInfo.progspace
The @code{gdb.Progspace} (@pxref{Progspaces In Python,,Program Spaces
In Python}) for which @value{GDBN} is currently disassembling, this
property is read-only.
@end defvar
@defun DisassembleInfo.is_valid ()
Returns @code{True} if the @code{DisassembleInfo} object is valid,
@code{False} if not. A @code{DisassembleInfo} object will become
invalid once the disassembly call for which the @code{DisassembleInfo}
was created, has returned. Calling other @code{DisassembleInfo}
methods, or accessing @code{DisassembleInfo} properties, will raise a
@code{RuntimeError} exception if it is invalid.
@end defun
@defun DisassembleInfo.__init__ (info)
This can be used to create a new @code{DisassembleInfo} object that is
a copy of @var{info}. The copy will have the same @code{address},
@code{architecture}, and @code{progspace} values as @var{info}, and
will become invalid at the same time as @var{info}.
This method exists so that sub-classes of @code{DisassembleInfo} can
be created, these sub-classes must be initialized as copies of an
existing @code{DisassembleInfo} object, but sub-classes might choose
to override the @code{read_memory} method, and so control what
@value{GDBN} sees when reading from memory
(@pxref{builtin_disassemble}).
@end defun
@defun DisassembleInfo.read_memory (length, offset)
This method allows the disassembler to read the bytes of the
instruction to be disassembled. The method reads @var{length} bytes,
starting at @var{offset} from
@code{DisassembleInfo.address}.
It is important that the disassembler read the instruction bytes using
this method, rather than reading inferior memory directly, as in some
cases @value{GDBN} disassembles from an internal buffer rather than
directly from inferior memory, calling this method handles this
detail.
Returns a buffer object, which behaves much like an array or a string,
just as @code{Inferior.read_memory} does
(@pxref{gdbpy_inferior_read_memory,,Inferior.read_memory}). The
length of the returned buffer will always be exactly @var{length}.
If @value{GDBN} is unable to read the required memory then a
@code{gdb.MemoryError} exception is raised (@pxref{Exception
Handling}).
This method can be overridden by a sub-class in order to control what
@value{GDBN} sees when reading from memory
(@pxref{builtin_disassemble}). When overriding this method it is
important to understand how @code{builtin_disassemble} makes use of
this method.
While disassembling a single instruction there could be multiple calls
to this method, and the same bytes might be read multiple times. Any
single call might only read a subset of the total instruction bytes.
If an implementation of @code{read_memory} is unable to read the
requested memory contents, for example, if there's a request to read
from an invalid memory address, then a @code{gdb.MemoryError} should
be raised.
Raising a @code{MemoryError} inside @code{read_memory} does not
automatically mean a @code{MemoryError} will be raised by
@code{builtin_disassemble}. It is possible that @value{GDBN}'s builtin
disassembler is probing to see how many bytes are available. When
@code{read_memory} raises the @code{MemoryError} the builtin
disassembler might be able to perform a complete disassembly with the
bytes it has available, in this case @code{builtin_disassemble} will
not itself raise a @code{MemoryError}.
Any other exception type raised in @code{read_memory} will propagate
back and be re-raised by @code{builtin_disassemble}.
@end defun
@end deftp
@deftp {class} Disassembler
This is a base class from which all user implemented disassemblers
must inherit.
@defun Disassembler.__init__ (name)
The constructor takes @var{name}, a string, which should be a short
name for this disassembler.
@end defun
@defun Disassembler.__call__ (info)
The @code{__call__} method must be overridden by sub-classes to
perform disassembly. Calling @code{__call__} on this base class will
raise a @code{NotImplementedError} exception.
The @var{info} argument is an instance of @code{DisassembleInfo}, and
describes the instruction that @value{GDBN} wants disassembling.
If this function returns @code{None}, this indicates to @value{GDBN}
that this sub-class doesn't wish to disassemble the requested
instruction. @value{GDBN} will then use its builtin disassembler to
perform the disassembly.
Alternatively, this function can return a @code{DisassemblerResult}
that represents the disassembled instruction, this type is described
in more detail below.
The @code{__call__} method can raise a @code{gdb.MemoryError}
exception (@pxref{Exception Handling}) to indicate to @value{GDBN}
that there was a problem accessing the required memory, this will then
be displayed by @value{GDBN} within the disassembler output.
Ideally, the only three outcomes from invoking @code{__call__} would
be a return of @code{None}, a successful disassembly returned in a
@code{DisassemblerResult}, or a @code{MemoryError} indicating that
there was a problem reading memory.
However, as an implementation of @code{__call__} could fail due to
other reasons, e.g.@: some external resource required to perform
disassembly is temporarily unavailable, then, if @code{__call__}
raises a @code{GdbError}, the exception will be converted to a string
and printed at the end of the disassembly output, the disassembly
request will then stop.
Any other exception type raised by the @code{__call__} method is
considered an error in the user code, the exception will be printed to
the error stream according to the @kbd{set python print-stack} setting
(@pxref{set_python_print_stack,,@kbd{set python print-stack}}).
@end defun
@end deftp
@deftp {class} DisassemblerResult
This class is used to hold the result of calling
@w{@code{Disassembler.__call__}}, and represents a single disassembled
instruction. This class has the following properties and methods:
@defun DisassemblerResult.__init__ (@var{length}, @var{string})
Initialize an instance of this class, @var{length} is the length of
the disassembled instruction in bytes, which must be greater than
zero, and @var{string} is a non-empty string that represents the
disassembled instruction.
@end defun
@defvar DisassemblerResult.length
A read-only property containing the length of the disassembled
instruction in bytes, this will always be greater than zero.
@end defvar
@defvar DisassemblerResult.string
A read-only property containing a non-empty string representing the
disassembled instruction.
@end defvar
@end deftp
The following functions are also contained in the
@code{gdb.disassembler} module:
@defun register_disassembler (disassembler, architecture)
The @var{disassembler} must be an instance of a sub-class of
@code{gdb.disassembler.Disassembler}, or @code{None}.
The optional @var{architecture} is either a string, or the value
@code{None}. If it is a string, then it should be the name of an
architecture known to @value{GDBN}, as returned either from
@code{gdb.Architecture.name}
(@pxref{gdbpy_architecture_name,,gdb.Architecture.name}), or from
@code{gdb.architecture_names}
(@pxref{gdb_architecture_names,,gdb.architecture_names}).
The @var{disassembler} will be installed for the architecture named by
@var{architecture}, or if @var{architecture} is @code{None}, then
@var{disassembler} will be installed as a global disassembler for use
by all architectures.
@cindex disassembler in Python, global vs.@: specific
@cindex search order for disassembler in Python
@cindex look up of disassembler in Python
@value{GDBN} only records a single disassembler for each architecture,
and a single global disassembler. Calling
@code{register_disassembler} for an architecture, or for the global
disassembler, will replace any existing disassembler registered for
that @var{architecture} value. The previous disassembler is returned.
If @var{disassembler} is @code{None} then any disassembler currently
registered for @var{architecture} is deregistered and returned.
When @value{GDBN} is looking for a disassembler to use, @value{GDBN}
first looks for an architecture specific disassembler. If none has
been registered then @value{GDBN} looks for a global disassembler (one
registered with @var{architecture} set to @code{None}). Only one
disassembler is called to perform disassembly, so, if there is both an
architecture specific disassembler, and a global disassembler
registered, it is the architecture specific disassembler that will be
used.
@value{GDBN} tracks the architecture specific, and global
disassemblers separately, so it doesn't matter in which order
disassemblers are created or registered; an architecture specific
disassembler, if present, will always be used in preference to a
global disassembler.
You can use the @kbd{maint info python-disassemblers} command
(@pxref{maint info python-disassemblers}) to see which disassemblers
have been registered.
@end defun
@anchor{builtin_disassemble}
@defun builtin_disassemble (info)
This function calls back into @value{GDBN}'s builtin disassembler to
disassemble the instruction identified by @var{info}, an instance, or
sub-class, of @code{DisassembleInfo}.
When the builtin disassembler needs to read memory the
@code{read_memory} method on @var{info} will be called. By
sub-classing @code{DisassembleInfo} and overriding the
@code{read_memory} method, it is possible to intercept calls to
@code{read_memory} from the builtin disassembler, and to modify the
values returned.
It is important to understand that, even when
@code{DisassembleInfo.read_memory} raises a @code{gdb.MemoryError}, it
is the internal disassembler itself that reports the memory error to
@value{GDBN}. The reason for this is that the disassembler might
probe memory to see if a byte is readable or not; if the byte can't be
read then the disassembler may choose not to report an error, but
instead to disassemble the bytes that it does have available.
If the builtin disassembler is successful then an instance of
@code{DisassemblerResult} is returned from @code{builtin_disassemble},
alternatively, if something goes wrong, an exception will be raised.
A @code{MemoryError} will be raised if @code{builtin_disassemble} is
unable to read some memory that is required in order to perform
disassembly correctly.
Any exception that is not a @code{MemoryError}, that is raised in a
call to @code{read_memory}, will pass through
@code{builtin_disassemble}, and be visible to the caller.
Finally, there are a few cases where @value{GDBN}'s builtin
disassembler can fail for reasons that are not covered by
@code{MemoryError}. In these cases, a @code{GdbError} will be raised.
The contents of the exception will be a string describing the problem
the disassembler encountered.
@end defun
Here is an example that registers a global disassembler. The new
disassembler invokes the builtin disassembler, and then adds a
comment, @code{## Comment}, to each line of disassembly output:
@smallexample
import gdb.disassembler

class ExampleDisassembler(gdb.disassembler.Disassembler):
    def __init__(self):
        super().__init__("ExampleDisassembler")

    def __call__(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        length = result.length
        text = result.string + "\t## Comment"
        return gdb.disassembler.DisassemblerResult(length, text)

gdb.disassembler.register_disassembler(ExampleDisassembler())
@end smallexample
The following example creates a sub-class of @code{DisassembleInfo} in
order to intercept the @code{read_memory} calls, within
@code{read_memory} any bytes read from memory have the two 4-bit
nibbles swapped around. This isn't a very useful adjustment, but
serves as an example.
@smallexample
import gdb.disassembler

class MyInfo(gdb.disassembler.DisassembleInfo):
    def __init__(self, info):
        super().__init__(info)

    def read_memory(self, length, offset):
        buffer = super().read_memory(length, offset)
        result = bytearray()
        for b in buffer:
            v = int.from_bytes(b, 'little')
            v = (v << 4) & 0xf0 | (v >> 4)
            result.append(v)
        return memoryview(result)

class NibbleSwapDisassembler(gdb.disassembler.Disassembler):
    def __init__(self):
        super().__init__("NibbleSwapDisassembler")

    def __call__(self, info):
        info = MyInfo(info)
        return gdb.disassembler.builtin_disassemble(info)

gdb.disassembler.register_disassembler(NibbleSwapDisassembler())
@end smallexample
@node Python Auto-loading
@subsection Python Auto-loading
@cindex Python auto-loading


@ -0,0 +1,178 @@
# Copyright (C) 2021-2022 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
"""Disassembler related module."""
import gdb
import _gdb.disassembler
# Re-export everything from the _gdb.disassembler module, which is
# defined within GDB's C++ code.
from _gdb.disassembler import *
# Module global dictionary of gdb.disassembler.Disassembler objects.
# The keys of this dictionary are bfd architecture names, or the
# special value None.
#
# When a request to disassemble comes in we first lookup the bfd
# architecture name from the gdbarch, if that name exists in this
# dictionary then we use that Disassembler object.
#
# If there's no architecture specific disassembler then we look for
# the key None in this dictionary, and if that key exists, we use that
# disassembler.
#
# If none of the above checks found a suitable disassembler, then no
# disassembly is performed in Python.
_disassemblers_dict = {}
class Disassembler(object):
    """A base class from which all user implemented disassemblers must
    inherit."""

    def __init__(self, name):
        """Constructor.  Takes a name, which should be a string, which can be
        used to identify this disassembler in diagnostic messages."""
        self.name = name

    def __call__(self, info):
        """A default implementation of __call__.  All sub-classes must
        override this method.  Calling this default implementation will throw
        a NotImplementedError exception."""
        raise NotImplementedError("Disassembler.__call__")
def register_disassembler(disassembler, architecture=None):
    """Register a disassembler.  DISASSEMBLER is a sub-class of
    gdb.disassembler.Disassembler.  ARCHITECTURE is either None or a
    string, the name of an architecture known to GDB.

    DISASSEMBLER is registered as a disassembler for ARCHITECTURE, or
    all architectures when ARCHITECTURE is None.

    Returns the previous disassembler registered with this
    ARCHITECTURE value.
    """

    if not isinstance(disassembler, Disassembler) and disassembler is not None:
        raise TypeError("disassembler should sub-class gdb.disassembler.Disassembler")

    old = None
    if architecture in _disassemblers_dict:
        old = _disassemblers_dict[architecture]
        del _disassemblers_dict[architecture]
    if disassembler is not None:
        _disassemblers_dict[architecture] = disassembler

    # Call the private _set_enabled function within the
    # _gdb.disassembler module.  This function sets a global flag
    # within GDB's C++ code that enables or disables the Python
    # disassembler functionality, this improves performance of the
    # disassembler by avoiding unneeded calls into Python when we know
    # that no disassemblers are registered.
    _gdb.disassembler._set_enabled(len(_disassemblers_dict) > 0)
    return old
def _print_insn(info):
    """This function is called by GDB when it wants to disassemble an
    instruction.  INFO describes the instruction to be
    disassembled."""

    def lookup_disassembler(arch):
        try:
            name = arch.name()
            if name is None:
                return None
            if name in _disassemblers_dict:
                return _disassemblers_dict[name]
            if None in _disassemblers_dict:
                return _disassemblers_dict[None]
            return None
        except:
            # It's pretty unlikely this exception case will ever
            # trigger, one situation would be if the user somehow
            # corrupted the _disassemblers_dict variable such that it
            # was no longer a dictionary.
            return None

    disassembler = lookup_disassembler(info.architecture)
    if disassembler is None:
        return None
    return disassembler(info)
class maint_info_py_disassemblers_cmd(gdb.Command):
    """
    List all registered Python disassemblers.

    List the name of all registered Python disassemblers, next to the
    name of the architecture for which the disassembler is registered.

    The global Python disassembler is listed next to the string
    'GLOBAL'.

    The disassembler that matches the architecture of the currently
    selected inferior will be marked, this is an indication of which
    disassembler will be invoked if any disassembly is performed in
    the current inferior.
    """

    def __init__(self):
        super().__init__("maintenance info python-disassemblers", gdb.COMMAND_USER)

    def invoke(self, args, from_tty):
        # If no disassemblers are registered, tell the user.
        if len(_disassemblers_dict) == 0:
            print("No Python disassemblers registered.")
            return
        # Figure out the longest architecture name, so we can
        # correctly format the table of results.
        longest_arch_name = 0
        for architecture in _disassemblers_dict:
            if architecture is not None:
                # Measure the architecture name itself, as that is what
                # is printed in the first column.
                if len(architecture) > longest_arch_name:
                    longest_arch_name = len(architecture)
        # Figure out the name of the current architecture.  There
        # should always be a current inferior, but if, somehow, there
        # isn't, then leave curr_arch as the empty string, which will
        # not then match against any architecture in the dictionary.
        curr_arch = ""
        if gdb.selected_inferior() is not None:
            curr_arch = gdb.selected_inferior().architecture().name()
        # Now print the dictionary of registered disassemblers out to
        # the user.
        match_tag = "\t(Matches current architecture)"
        fmt_len = max(longest_arch_name, len("Architecture"))
        format_string = "{:" + str(fmt_len) + "s} {:s}"
        print(format_string.format("Architecture", "Disassember Name"))
        for architecture in _disassemblers_dict:
            if architecture is not None:
                name = _disassemblers_dict[architecture].name
                if architecture == curr_arch:
                    name += match_tag
                    match_tag = ""
                print(format_string.format(architecture, name))
        if None in _disassemblers_dict:
            name = _disassemblers_dict[None].name + match_tag
            print(format_string.format("GLOBAL", name))


maint_info_py_disassemblers_cmd()

gdb/python/py-disasm.c (new file, 1090 additions)

File diff suppressed because it is too large.


@ -540,6 +540,8 @@ int gdbpy_initialize_connection ()
int gdbpy_initialize_micommands (void)
CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
void gdbpy_finalize_micommands ();
int gdbpy_initialize_disasm ()
CPYCHECKER_NEGATIVE_RESULT_SETS_EXCEPTION;
/* A wrapper for PyErr_Fetch that handles reference counting for the
caller. */
@ -587,6 +589,13 @@ public:
return PyErr_GivenExceptionMatches (m_error_type.get (), type);
}
/* Return a new reference to the exception value object. */
gdbpy_ref<> value ()
{
return m_error_value;
}
private:
gdbpy_ref<> m_error_type, m_error_value, m_error_traceback;
@ -840,4 +849,18 @@ extern bool gdbpy_is_progspace (PyObject *obj);
extern gdb::unique_xmalloc_ptr<char> gdbpy_fix_doc_string_indentation
(gdb::unique_xmalloc_ptr<char> doc);
/* Implement the 'print_insn' hook for Python. Disassemble an instruction
whose address is ADDRESS for architecture GDBARCH. The bytes of the
instruction should be read with INFO->read_memory_func as the
instruction being disassembled might actually be in a buffer.
Used INFO->fprintf_func to print the results of the disassembly, and
return the length of the instruction in octets.
If no instruction can be disassembled then return an empty value. */
extern gdb::optional<int> gdbpy_print_insn (struct gdbarch *gdbarch,
CORE_ADDR address,
disassemble_info *info);
#endif /* PYTHON_PYTHON_INTERNAL_H */


@ -167,7 +167,7 @@ static const struct extension_language_ops python_extension_ops =
gdbpy_colorize_disasm,
NULL, /* gdbpy_print_insn, */
gdbpy_print_insn,
};
#endif /* HAVE_PYTHON */
@ -2053,6 +2053,7 @@ do_start_initialization ()
if (gdbpy_initialize_auto_load () < 0
|| gdbpy_initialize_values () < 0
|| gdbpy_initialize_disasm () < 0
|| gdbpy_initialize_frames () < 0
|| gdbpy_initialize_commands () < 0
|| gdbpy_initialize_instruction () < 0


@ -0,0 +1,25 @@
/* This test program is part of GDB, the GNU debugger.
Copyright 2021-2022 Free Software Foundation, Inc.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
int
main ()
{
asm ("nop");
asm ("nop"); /* Break here. */
asm ("nop");
return 0;
}


@ -0,0 +1,209 @@
# Copyright (C) 2021-2022 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# This file is part of the GDB testsuite. It validates the Python
# disassembler API.
load_lib gdb-python.exp
standard_testfile
if { [prepare_for_testing "failed to prepare" ${testfile} ${srcfile} "debug"] } {
return -1
}
# Skip all tests if Python scripting is not enabled.
if { [skip_python_tests] } { continue }
if ![runto_main] then {
fail "can't run to main"
return 0
}
set pyfile [gdb_remote_download host ${srcdir}/${subdir}/${testfile}.py]
gdb_test "source ${pyfile}" "Python script imported" \
"import python scripts"
gdb_breakpoint [gdb_get_line_number "Break here."]
gdb_continue_to_breakpoint "Break here."
set curr_pc [get_valueof "/x" "\$pc" "*unknown*"]
gdb_test_no_output "python current_pc = ${curr_pc}"
# The current pc will be something like 0x1234 with no leading zeros.
# However, in the disassembler output addresses are padded with zeros.
# This substitution changes 0x1234 to 0x0*1234, which can then be used
# as a regexp in the disassembler output matching.
set curr_pc_pattern [string replace ${curr_pc} 0 1 "0x0*"]
# Grab the name of the current architecture; it is used in the test
# patterns below.
set curr_arch [get_python_valueof "gdb.selected_inferior().architecture().name()" "*unknown*"]

# Helper proc that removes all registered disassemblers.
proc py_remove_all_disassemblers {} {
    gdb_test_no_output "python remove_all_python_disassemblers()"
}
# A list of test plans. Each plan is a list of two elements: the
# first element is the name of a disassembler class in py-disasm.py,
# and the second element is a pattern that should be matched in the
# disassembler output.
#
# Each disassembler tests a different feature of the Python
# disassembler API.
set unknown_error_pattern "unknown disassembler error \\(error = -1\\)"
set addr_pattern "\r\n=> ${curr_pc_pattern} <\[^>\]+>:\\s+"
set base_pattern "${addr_pattern}nop"
set test_plans \
[list \
[list "" "${base_pattern}\r\n.*"] \
[list "GlobalNullDisassembler" "${base_pattern}\r\n.*"] \
[list "GlobalPreInfoDisassembler" "${base_pattern}\\s+## ad = $hex, ar = ${curr_arch}\r\n.*"] \
[list "GlobalPostInfoDisassembler" "${base_pattern}\\s+## ad = $hex, ar = ${curr_arch}\r\n.*"] \
[list "GlobalReadDisassembler" "${base_pattern}\\s+## bytes =( $hex)+\r\n.*"] \
[list "GlobalAddrDisassembler" "${base_pattern}\\s+## addr = ${curr_pc_pattern} <\[^>\]+>\r\n.*"] \
[list "GdbErrorEarlyDisassembler" "${addr_pattern}GdbError instead of a result\r\n${unknown_error_pattern}"] \
[list "RuntimeErrorEarlyDisassembler" "${addr_pattern}Python Exception <class 'RuntimeError'>: RuntimeError instead of a result\r\n\r\n${unknown_error_pattern}"] \
[list "GdbErrorLateDisassembler" "${addr_pattern}GdbError after builtin disassembler\r\n${unknown_error_pattern}"] \
[list "RuntimeErrorLateDisassembler" "${addr_pattern}Python Exception <class 'RuntimeError'>: RuntimeError after builtin disassembler\r\n\r\n${unknown_error_pattern}"] \
[list "MemoryErrorEarlyDisassembler" "${base_pattern}\\s+## AFTER ERROR\r\n.*"] \
[list "MemoryErrorLateDisassembler" "${addr_pattern}Cannot access memory at address ${curr_pc_pattern}"] \
[list "RethrowMemoryErrorDisassembler" "${addr_pattern}Cannot access memory at address $hex"] \
[list "ReadMemoryMemoryErrorDisassembler" "${addr_pattern}Cannot access memory at address ${curr_pc_pattern}"] \
[list "ReadMemoryGdbErrorDisassembler" "${addr_pattern}read_memory raised GdbError\r\n${unknown_error_pattern}"] \
[list "ReadMemoryRuntimeErrorDisassembler" "${addr_pattern}Python Exception <class 'RuntimeError'>: read_memory raised RuntimeError\r\n\r\n${unknown_error_pattern}"] \
[list "ReadMemoryCaughtMemoryErrorDisassembler" "${addr_pattern}nop\r\n.*"] \
[list "ReadMemoryCaughtGdbErrorDisassembler" "${addr_pattern}nop\r\n.*"] \
[list "ReadMemoryCaughtRuntimeErrorDisassembler" "${addr_pattern}nop\r\n.*"] \
[list "MemorySourceNotABufferDisassembler" "${addr_pattern}Python Exception <class 'TypeError'>: Result from read_memory is not a buffer\r\n\r\n${unknown_error_pattern}"] \
[list "MemorySourceBufferTooLongDisassembler" "${addr_pattern}Python Exception <class 'ValueError'>: Buffer returned from read_memory is sized $decimal instead of the expected $decimal\r\n\r\n${unknown_error_pattern}"] \
[list "ResultOfWrongType" "${addr_pattern}Python Exception <class 'TypeError'>: Result is not a DisassemblerResult.\r\n.*"] \
[list "ResultWithInvalidLength" "${addr_pattern}Python Exception <class 'ValueError'>: Invalid length attribute: length must be greater than 0.\r\n.*"] \
[list "ResultWithInvalidString" "${addr_pattern}Python Exception <class 'ValueError'>: String attribute must not be empty.\r\n.*"]]
# Now execute each test plan.
foreach plan $test_plans {
    set global_disassembler_name [lindex $plan 0]
    set expected_pattern [lindex $plan 1]

    with_test_prefix "global_disassembler=${global_disassembler_name}" {
        # Remove all existing disassemblers.
        py_remove_all_disassemblers

        # If we have a disassembler to load, do it now.
        if { $global_disassembler_name != "" } {
            gdb_test_no_output "python add_global_disassembler($global_disassembler_name)"
        }

        # Disassemble main, and check the disassembler output.
        gdb_test "disassemble main" $expected_pattern
    }
}
# Check some errors relating to DisassemblerResult creation.
with_test_prefix "DisassemblerResult errors" {
    gdb_test "python gdb.disassembler.DisassemblerResult(0, 'abc')" \
        [multi_line \
             "ValueError: Length must be greater than 0." \
             "Error while executing Python code."]
    gdb_test "python gdb.disassembler.DisassemblerResult(-1, 'abc')" \
        [multi_line \
             "ValueError: Length must be greater than 0." \
             "Error while executing Python code."]
    gdb_test "python gdb.disassembler.DisassemblerResult(1, '')" \
        [multi_line \
             "ValueError: String must not be empty." \
             "Error while executing Python code."]
}
# Check that the architecture specific disassemblers can override the
# global disassembler.
#
# First, register a global disassembler, and check it is in place.
with_test_prefix "GLOBAL tagging disassembler" {
    py_remove_all_disassemblers
    gdb_test_no_output "python gdb.disassembler.register_disassembler(TaggingDisassembler(\"GLOBAL\"), None)"
    gdb_test "disassemble main" "${base_pattern}\\s+## tag = GLOBAL\r\n.*"
}

# Now register an architecture specific disassembler, and check it
# overrides the global disassembler.
with_test_prefix "LOCAL tagging disassembler" {
    gdb_test_no_output "python gdb.disassembler.register_disassembler(TaggingDisassembler(\"LOCAL\"), \"${curr_arch}\")"
    gdb_test "disassemble main" "${base_pattern}\\s+## tag = LOCAL\r\n.*"
}

# Now remove the architecture specific disassembler, and check that
# the global disassembler kicks back in.
with_test_prefix "GLOBAL tagging disassembler again" {
    gdb_test_no_output "python gdb.disassembler.register_disassembler(None, \"${curr_arch}\")"
    gdb_test "disassemble main" "${base_pattern}\\s+## tag = GLOBAL\r\n.*"
}
# Check that a DisassembleInfo becomes invalid after the call into the
# disassembler.
with_test_prefix "DisassembleInfo becomes invalid" {
    py_remove_all_disassemblers
    gdb_test_no_output "python add_global_disassembler(GlobalCachingDisassembler)"
    gdb_test "disassemble main" "${base_pattern}\\s+## CACHED\r\n.*"
    gdb_test "python GlobalCachingDisassembler.check()" "PASS"
}

# Test the memory source aspect of the builtin disassembler.
with_test_prefix "memory source api" {
    py_remove_all_disassemblers
    gdb_test_no_output "python analyzing_disassembler = add_global_disassembler(AnalyzingDisassembler)"
    gdb_test "disassemble main" "${base_pattern}\r\n.*"
    gdb_test "python analyzing_disassembler.find_replacement_candidate()" \
        "Replace from $hex to $hex with NOP"
    gdb_test "disassemble main" "${base_pattern}\r\n.*" \
        "second disassembler pass"
    gdb_test "python analyzing_disassembler.check()" \
        "PASS"
}
# Test the 'maint info python-disassemblers' command.
with_test_prefix "maint info python-disassemblers" {
    py_remove_all_disassemblers
    gdb_test "maint info python-disassemblers" "No Python disassemblers registered\\." \
        "list disassemblers, none registered"
    gdb_test_no_output "python disasm = add_global_disassembler(BuiltinDisassembler)"
    gdb_test "maint info python-disassemblers" \
        [multi_line \
             "Architecture\\s+Disassember Name" \
             "GLOBAL\\s+BuiltinDisassembler\\s+\\(Matches current architecture\\)"] \
        "list disassemblers, single global disassembler"
    gdb_test_no_output "python arch = gdb.selected_inferior().architecture().name()"
    gdb_test_no_output "python gdb.disassembler.register_disassembler(disasm, arch)"
    gdb_test "maint info python-disassemblers" \
        [multi_line \
             "Architecture\\s+Disassember Name" \
             "\[^\r\n\]+BuiltinDisassembler\\s+\\(Matches current architecture\\)" \
             "GLOBAL\\s+BuiltinDisassembler"] \
        "list disassemblers, multiple disassemblers registered"
}
# Check that an attempt to create a "new" DisassembleInfo object fails.
with_test_prefix "Bad DisassembleInfo creation" {
    gdb_test_no_output "python my_info = InvalidDisassembleInfo()"
    gdb_test "python print(my_info.is_valid())" "True"
    gdb_test "python gdb.disassembler.builtin_disassemble(my_info)" \
        [multi_line \
             "RuntimeError: DisassembleInfo is no longer valid\\." \
             "Error while executing Python code\\."]
}
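
The address-pattern substitution used near the top of this test script (replacing the leading `0x` of `$pc` with `0x0*` so the pattern also matches zero-padded addresses) can be tried outside GDB. The following is only a minimal sketch in Python; the helper name `padded_address_pattern` is hypothetical and is not part of the testsuite.

```python
import re


def padded_address_pattern(pc_text):
    """Turn a string such as '0x1234' into a regexp that also matches
    zero-padded forms such as '0x0000000000001234'."""
    if not pc_text.startswith("0x"):
        raise ValueError("expected a hexadecimal address such as 0x1234")
    # Mirrors Tcl's `string replace ${curr_pc} 0 1 "0x0*"`: the first two
    # characters are swapped for a prefix that allows any number of zeros.
    return "0x0*" + pc_text[2:]


pattern = padded_address_pattern("0x1234")
print(pattern)
print(bool(re.fullmatch(pattern, "0x0000000000001234")))
```

The hexadecimal digits themselves contain no regexp metacharacters, which is why the plain string substitution is enough here.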


@@ -0,0 +1,712 @@
# Copyright (C) 2021-2022 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import gdb
import gdb.disassembler
import struct
import sys
from gdb.disassembler import Disassembler, DisassemblerResult
# A global, holds the program-counter address at which we should
# perform the extra disassembly that this script provides.
current_pc = None
# Remove all currently registered disassemblers.
def remove_all_python_disassemblers():
    for a in gdb.architecture_names():
        gdb.disassembler.register_disassembler(None, a)
    gdb.disassembler.register_disassembler(None, None)


class TestDisassembler(Disassembler):
    """A base class for disassemblers within this script to inherit from.
    Implements the __call__ method and ensures we only do any
    disassembly wrapping for the global CURRENT_PC."""

    def __init__(self):
        global current_pc

        super().__init__("TestDisassembler")
        self.__info = None
        if current_pc is None:
            raise gdb.GdbError("no current_pc set")

    def __call__(self, info):
        global current_pc

        if info.address != current_pc:
            return None
        self.__info = info
        return self.disassemble(info)

    def get_info(self):
        return self.__info

    def disassemble(self, info):
        raise NotImplementedError("override the disassemble method")

class GlobalPreInfoDisassembler(TestDisassembler):
    """Check the attributes of DisassembleInfo before disassembly has occurred."""

    def disassemble(self, info):
        ad = info.address
        ar = info.architecture

        if ad != current_pc:
            raise gdb.GdbError("invalid address")

        if not isinstance(ar, gdb.Architecture):
            raise gdb.GdbError("invalid architecture type")

        result = gdb.disassembler.builtin_disassemble(info)

        text = result.string + "\t## ad = 0x%x, ar = %s" % (ad, ar.name())
        return DisassemblerResult(result.length, text)


class GlobalPostInfoDisassembler(TestDisassembler):
    """Check the attributes of DisassembleInfo after disassembly has occurred."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)

        ad = info.address
        ar = info.architecture

        if ad != current_pc:
            raise gdb.GdbError("invalid address")

        if not isinstance(ar, gdb.Architecture):
            raise gdb.GdbError("invalid architecture type")

        text = result.string + "\t## ad = 0x%x, ar = %s" % (ad, ar.name())
        return DisassemblerResult(result.length, text)

class GlobalReadDisassembler(TestDisassembler):
    """Check the DisassembleInfo.read_memory method. Calls the builtin
    disassembler, then reads all of the bytes of this instruction, and
    adds them as a comment to the disassembler output."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        len = result.length
        str = ""
        for o in range(len):
            if str != "":
                str += " "
            v = bytes(info.read_memory(1, o))[0]
            if sys.version_info[0] < 3:
                v = struct.unpack("<B", v)
            str += "0x%02x" % v
        text = result.string + "\t## bytes = %s" % str
        return DisassemblerResult(result.length, text)


class GlobalAddrDisassembler(TestDisassembler):
    """Check the gdb.format_address method."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        arch = info.architecture
        addr = info.address
        program_space = info.progspace
        str = gdb.format_address(addr, program_space, arch)
        text = result.string + "\t## addr = %s" % str
        return DisassemblerResult(result.length, text)

class GdbErrorEarlyDisassembler(TestDisassembler):
    """Raise a GdbError instead of performing any disassembly."""

    def disassemble(self, info):
        raise gdb.GdbError("GdbError instead of a result")


class RuntimeErrorEarlyDisassembler(TestDisassembler):
    """Raise a RuntimeError instead of performing any disassembly."""

    def disassemble(self, info):
        raise RuntimeError("RuntimeError instead of a result")


class GdbErrorLateDisassembler(TestDisassembler):
    """Raise a GdbError after calling the builtin disassembler."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        raise gdb.GdbError("GdbError after builtin disassembler")


class RuntimeErrorLateDisassembler(TestDisassembler):
    """Raise a RuntimeError after calling the builtin disassembler."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        raise RuntimeError("RuntimeError after builtin disassembler")

class MemoryErrorEarlyDisassembler(TestDisassembler):
    """Throw a memory error, ignore the error and disassemble."""

    def disassemble(self, info):
        tag = "## FAIL"
        try:
            info.read_memory(1, -info.address + 2)
        except gdb.MemoryError:
            tag = "## AFTER ERROR"
        result = gdb.disassembler.builtin_disassemble(info)
        text = result.string + "\t" + tag
        return DisassemblerResult(result.length, text)


class MemoryErrorLateDisassembler(TestDisassembler):
    """Throw a memory error after calling the builtin disassembler, but
    before we return a result."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        # The following read will throw an error.
        info.read_memory(1, -info.address + 2)
        return DisassemblerResult(1, "BAD")


class RethrowMemoryErrorDisassembler(TestDisassembler):
    """Catch and rethrow a memory error."""

    def disassemble(self, info):
        try:
            info.read_memory(1, -info.address + 2)
        except gdb.MemoryError as e:
            raise gdb.MemoryError("cannot read code at address 0x2")
        return DisassemblerResult(1, "BAD")

class ResultOfWrongType(TestDisassembler):
    """Return something that is not a DisassemblerResult from the disassemble method."""

    class Blah:
        def __init__(self, length, string):
            self.length = length
            self.string = string

    def disassemble(self, info):
        return self.Blah(1, "ABC")


class ResultWrapper(gdb.disassembler.DisassemblerResult):
    def __init__(self, length, string, length_x=None, string_x=None):
        super().__init__(length, string)
        if length_x is None:
            self.__length = length
        else:
            self.__length = length_x
        if string_x is None:
            self.__string = string
        else:
            self.__string = string_x

    @property
    def length(self):
        return self.__length

    @property
    def string(self):
        return self.__string


class ResultWithInvalidLength(TestDisassembler):
    """Return a result object with an invalid length."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        return ResultWrapper(result.length, result.string, 0)


class ResultWithInvalidString(TestDisassembler):
    """Return a result object with an empty string."""

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        return ResultWrapper(result.length, result.string, None, "")

class TaggingDisassembler(TestDisassembler):
    """A simple disassembler that just tags the output."""

    def __init__(self, tag):
        super().__init__()
        self._tag = tag

    def disassemble(self, info):
        result = gdb.disassembler.builtin_disassemble(info)
        text = result.string + "\t## tag = %s" % self._tag
        return DisassemblerResult(result.length, text)

class GlobalCachingDisassembler(TestDisassembler):
    """A disassembler that caches the DisassembleInfo that is passed in,
    as well as a copy of the original DisassembleInfo.

    Once the call into the disassembler is complete then the
    DisassembleInfo objects become invalid, and any calls into them
    should trigger an exception."""

    # This is where we cache the DisassembleInfo objects.
    cached_insn_disas = []

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

    def disassemble(self, info):
        """Disassemble the instruction, add a CACHED comment to the output,
        and cache the DisassembleInfo so that it is not garbage collected."""
        GlobalCachingDisassembler.cached_insn_disas.append(info)
        GlobalCachingDisassembler.cached_insn_disas.append(self.MyInfo(info))
        result = gdb.disassembler.builtin_disassemble(info)
        text = result.string + "\t## CACHED"
        return DisassemblerResult(result.length, text)

    @staticmethod
    def check():
        """Check that all of the methods on the cached DisassembleInfo trigger an
        exception."""
        for info in GlobalCachingDisassembler.cached_insn_disas:
            assert isinstance(info, gdb.disassembler.DisassembleInfo)
            assert not info.is_valid()
            try:
                val = info.address
                raise gdb.GdbError("DisassembleInfo.address is still valid")
            except RuntimeError as e:
                assert str(e) == "DisassembleInfo is no longer valid."
            except:
                raise gdb.GdbError(
                    "DisassembleInfo.address raised an unexpected exception"
                )

            try:
                val = info.architecture
                raise gdb.GdbError("DisassembleInfo.architecture is still valid")
            except RuntimeError as e:
                assert str(e) == "DisassembleInfo is no longer valid."
            except:
                raise gdb.GdbError(
                    "DisassembleInfo.architecture raised an unexpected exception"
                )

            try:
                val = info.read_memory(1, 0)
                raise gdb.GdbError("DisassembleInfo.read is still valid")
            except RuntimeError as e:
                assert str(e) == "DisassembleInfo is no longer valid."
            except:
                raise gdb.GdbError(
                    "DisassembleInfo.read raised an unexpected exception"
                )

        print("PASS")

class GlobalNullDisassembler(TestDisassembler):
    """A disassembler that does not change the output at all."""

    def disassemble(self, info):
        pass


class ReadMemoryMemoryErrorDisassembler(TestDisassembler):
    """Raise a MemoryError exception from the DisassembleInfo.read_memory
    method."""

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            # Throw a memory error with a specific address. We don't
            # expect this address to show up in the output though.
            raise gdb.MemoryError(0x1234)

    def disassemble(self, info):
        info = self.MyInfo(info)
        return gdb.disassembler.builtin_disassemble(info)


class ReadMemoryGdbErrorDisassembler(TestDisassembler):
    """Raise a GdbError exception from the DisassembleInfo.read_memory
    method."""

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            raise gdb.GdbError("read_memory raised GdbError")

    def disassemble(self, info):
        info = self.MyInfo(info)
        return gdb.disassembler.builtin_disassemble(info)


class ReadMemoryRuntimeErrorDisassembler(TestDisassembler):
    """Raise a RuntimeError exception from the DisassembleInfo.read_memory
    method."""

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            raise RuntimeError("read_memory raised RuntimeError")

    def disassemble(self, info):
        info = self.MyInfo(info)
        return gdb.disassembler.builtin_disassemble(info)

class ReadMemoryCaughtMemoryErrorDisassembler(TestDisassembler):
    """Raise a MemoryError exception from the DisassembleInfo.read_memory
    method, catch this in the outer disassembler."""

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            raise gdb.MemoryError(0x1234)

    def disassemble(self, info):
        info = self.MyInfo(info)
        try:
            return gdb.disassembler.builtin_disassemble(info)
        except gdb.MemoryError:
            return None


class ReadMemoryCaughtGdbErrorDisassembler(TestDisassembler):
    """Raise a GdbError exception from the DisassembleInfo.read_memory
    method, catch this in the outer disassembler."""

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            raise gdb.GdbError("exception message")

    def disassemble(self, info):
        info = self.MyInfo(info)
        try:
            return gdb.disassembler.builtin_disassemble(info)
        except gdb.GdbError as e:
            if e.args[0] == "exception message":
                return None
            raise e


class ReadMemoryCaughtRuntimeErrorDisassembler(TestDisassembler):
    """Raise a RuntimeError exception from the DisassembleInfo.read_memory
    method, catch this in the outer disassembler."""

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            raise RuntimeError("exception message")

    def disassemble(self, info):
        info = self.MyInfo(info)
        try:
            return gdb.disassembler.builtin_disassemble(info)
        except RuntimeError as e:
            if e.args[0] == "exception message":
                return None
            raise e

class MemorySourceNotABufferDisassembler(TestDisassembler):
    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            return 1234

    def disassemble(self, info):
        info = self.MyInfo(info)
        return gdb.disassembler.builtin_disassemble(info)


class MemorySourceBufferTooLongDisassembler(TestDisassembler):
    """The read memory returns too many bytes."""

    class MyInfo(gdb.disassembler.DisassembleInfo):
        def __init__(self, info):
            super().__init__(info)

        def read_memory(self, length, offset):
            buffer = super().read_memory(length, offset)
            # Create a new memory view made by duplicating BUFFER. This
            # will trigger an error as GDB expects a buffer of exactly
            # LENGTH to be returned, while this will return a buffer of
            # 2*LENGTH.
            return memoryview(
                bytes([int.from_bytes(x, "little") for x in (list(buffer[0:]) * 2)])
            )

    def disassemble(self, info):
        info = self.MyInfo(info)
        return gdb.disassembler.builtin_disassemble(info)


class BuiltinDisassembler(Disassembler):
    """Just calls the builtin disassembler."""

    def __init__(self):
        super().__init__("BuiltinDisassembler")

    def __call__(self, info):
        return gdb.disassembler.builtin_disassemble(info)

class AnalyzingDisassembler(Disassembler):
    class MyInfo(gdb.disassembler.DisassembleInfo):
        """Wrapper around builtin DisassembleInfo type that overrides the
        read_memory method."""

        def __init__(self, info, start, end, nop_bytes):
            """INFO is the DisassembleInfo we are wrapping. START and END are
            addresses, and NOP_BYTES should be a memoryview object.

            The length (END - START) should be the same as the length
            of NOP_BYTES.

            Any memory read requests outside the START->END range are
            serviced normally, but any attempt to read within the
            START->END range will return content from NOP_BYTES."""
            super().__init__(info)
            self._start = start
            self._end = end
            self._nop_bytes = nop_bytes

        def _read_replacement(self, length, offset):
            """Return a slice of the buffer representing the replacement nop
            instructions."""

            assert self._nop_bytes is not None
            rb = self._nop_bytes

            # If this request is outside of a nop instruction then we don't know
            # what to do, so just raise a memory error.
            if offset >= len(rb) or (offset + length) > len(rb):
                raise gdb.MemoryError("invalid length and offset combination")

            # Return only the slice of the nop instruction as requested.
            s = offset
            e = offset + length
            return rb[s:e]

        def read_memory(self, length, offset=0):
            """Callback used by the builtin disassembler to read the contents of
            memory."""

            # If this request is within the region we are replacing with 'nop'
            # instructions, then call the helper function to perform that
            # replacement.
            if self._start is not None:
                assert self._end is not None
                if self.address >= self._start and self.address < self._end:
                    return self._read_replacement(length, offset)

            # Otherwise, we just forward this request to the default read memory
            # implementation.
            return super().read_memory(length, offset)

    def __init__(self):
        """Constructor."""
        super().__init__("AnalyzingDisassembler")

        # Details about the instructions found during the first disassembler
        # pass.
        self._pass_1_length = []
        self._pass_1_insn = []
        self._pass_1_address = []

        # The start and end address for the instruction we will replace with
        # one or more 'nop' instructions during pass two.
        self._start = None
        self._end = None

        # The index in the _pass_1_* lists for where the nop instruction can
        # be found, also, the buffer of bytes that make up a nop instruction.
        self._nop_index = None
        self._nop_bytes = None

        # A flag that indicates if we are in the first or second pass of
        # this disassembler test.
        self._first_pass = True

        # The disassembled instructions collected during the second pass.
        self._pass_2_insn = []

        # A copy of _pass_1_insn that has been modified to include the extra
        # 'nop' instructions we plan to insert during the second pass. This
        # is then checked against _pass_2_insn after the second disassembler
        # pass has completed.
        self._check = []

    def __call__(self, info):
        """Called to perform the disassembly."""

        # Override the info object, this provides access to our
        # read_memory function.
        info = self.MyInfo(info, self._start, self._end, self._nop_bytes)
        result = gdb.disassembler.builtin_disassemble(info)

        # Record some information about the first 'nop' instruction we find.
        if self._nop_index is None and result.string == "nop":
            self._nop_index = len(self._pass_1_length)
            # The offset in the following read_memory call defaults to 0.
            print("APB: Reading nop bytes")
            self._nop_bytes = info.read_memory(result.length)

        # Record information about each instruction that is disassembled.
        # This test is performed in two passes, and we need different
        # information in each pass.
        if self._first_pass:
            self._pass_1_length.append(result.length)
            self._pass_1_insn.append(result.string)
            self._pass_1_address.append(info.address)
        else:
            self._pass_2_insn.append(result.string)

        return result

    def find_replacement_candidate(self):
        """Call this after the first disassembly pass. This identifies a suitable
        instruction to replace with 'nop' instruction(s)."""

        if self._nop_index is None:
            raise gdb.GdbError("no nop was found")

        nop_idx = self._nop_index
        nop_length = self._pass_1_length[nop_idx]

        # First we look for an instruction that is larger than a nop
        # instruction, but whose length is an exact multiple of the nop
        # instruction's length.
        replace_idx = None
        for idx in range(len(self._pass_1_length)):
            if (
                idx > 0
                and idx != nop_idx
                and self._pass_1_insn[idx] != "nop"
                and self._pass_1_length[idx] > self._pass_1_length[nop_idx]
                and self._pass_1_length[idx] % self._pass_1_length[nop_idx] == 0
            ):
                replace_idx = idx
                break

        # If we still don't have a replacement candidate, then search again,
        # this time looking for an instruction that is the same length as a
        # nop instruction.
        if replace_idx is None:
            for idx in range(len(self._pass_1_length)):
                if (
                    idx > 0
                    and idx != nop_idx
                    and self._pass_1_insn[idx] != "nop"
                    and self._pass_1_length[idx] == self._pass_1_length[nop_idx]
                ):
                    replace_idx = idx
                    break

        # Weird, the nop instruction must be larger than every other
        # instruction, or all instructions are 'nop'?
        if replace_idx is None:
            raise gdb.GdbError("can't find an instruction to replace")

        # Record the instruction range that will be replaced with 'nop'
        # instructions, and mark that we are now on the second pass.
        self._start = self._pass_1_address[replace_idx]
        self._end = self._pass_1_address[replace_idx] + self._pass_1_length[replace_idx]
        self._first_pass = False
        print("Replace from 0x%x to 0x%x with NOP" % (self._start, self._end))

        # Finally, build the expected result. Create the _check list, which
        # is a copy of _pass_1_insn, but replace the instruction we
        # identified above with a series of 'nop' instructions.
        self._check = list(self._pass_1_insn)
        nop_count = int(self._pass_1_length[replace_idx] / self._pass_1_length[nop_idx])
        nops = ["nop"] * nop_count
        self._check[replace_idx : (replace_idx + 1)] = nops

    def check(self):
        """Call this after the second disassembler pass to validate the output."""
        if self._check != self._pass_2_insn:
            print("APB, Check : %s" % self._check)
            print("APB, Result: %s" % self._pass_2_insn)
            raise gdb.GdbError("mismatch")
        print("PASS")

def add_global_disassembler(dis_class):
    """Create an instance of DIS_CLASS and register it as a global disassembler."""
    dis = dis_class()
    gdb.disassembler.register_disassembler(dis, None)
    return dis


class InvalidDisassembleInfo(gdb.disassembler.DisassembleInfo):
    """An attempt to create a DisassembleInfo sub-class without calling
    the parent class init method.

    Attempts to use instances of this class should throw an error
    saying that the DisassembleInfo is not valid, despite this class
    having all of the required attributes.

    The reason why this class will never be valid is that an internal
    field (within the C++ code) can't be initialized without calling
    the parent class init method."""

    def __init__(self):
        assert current_pc is not None

    def is_valid(self):
        return True

    @property
    def address(self):
        global current_pc
        return current_pc

    @property
    def architecture(self):
        return gdb.selected_inferior().architecture()

    @property
    def progspace(self):
        return gdb.selected_inferior().progspace

# Start with all disassemblers removed.
remove_all_python_disassemblers()
print("Python script imported")
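
The expected-result construction at the end of `AnalyzingDisassembler.find_replacement_candidate` relies on slice assignment to swap one list element for several. The following standalone sketch shows that step without needing GDB; the function name `expected_after_nop_replacement` and the sample instruction data are made up for illustration and are not part of the test script.

```python
def expected_after_nop_replacement(insns, lengths, replace_idx, nop_idx):
    """Return a copy of INSNS in which the instruction at REPLACE_IDX is
    replaced by as many 'nop' entries as fit in its byte length."""
    nop_length = lengths[nop_idx]
    replace_length = lengths[replace_idx]
    if replace_length % nop_length != 0:
        raise ValueError("replacement length is not a multiple of the nop length")
    nop_count = replace_length // nop_length
    expected = list(insns)
    # Assigning to a one-element slice replaces that single element with
    # all NOP_COUNT elements of the right-hand list.
    expected[replace_idx : replace_idx + 1] = ["nop"] * nop_count
    return expected


# Hypothetical sample data: a 3-byte instruction and a 1-byte nop.
sample_insns = ["push", "mov", "nop", "ret"]
sample_lengths = [1, 3, 1, 1]
print(expected_after_nop_replacement(sample_insns, sample_lengths, 1, 2))
```

The sketch uses integer division where the test script uses `int(a / b)`; for the small positive lengths involved the two should give the same count.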