src/gdb/ChangeLog:

2006-03-28  Jim Blandy  <jimb@codesourcery.com>

	* prologue-value.c, prologue-value.h: New files.
	* Makefile.in (prologue_value_h): New variable.
	(HFILES_NO_SRCDIR): List prologue-value.h.
	(SFILES): List prologue-value.c.
	(COMMON_OBS): List prologue-value.o.
	(prologue-value.o): New rule.

src/gdb/doc/ChangeLog:
2006-03-28  Jim Blandy  <jimb@codesourcery.com>

	* gdbint.texinfo (Prologue Analysis): New section.
Jim Blandy 2006-03-28 19:19:16 +00:00
parent 05c6a9a10e
commit 7d30c22d4c
6 changed files with 1080 additions and 0 deletions


@ -1,3 +1,12 @@
2006-03-28 Jim Blandy <jimb@codesourcery.com>
* prologue-value.c, prologue-value.h: New files.
* Makefile.in (prologue_value_h): New variable.
(HFILES_NO_SRCDIR): List prologue-value.h.
(SFILES): List prologue-value.c.
(COMMON_OBS): List prologue-value.o.
(prologue-value.o): New rule.
2006-03-27 Michael Snyder <msnyder@redhat.com>
* xstormy16-tdep.c (xstormy16_return_value, xstormy16_push_dummy_call,


@ -542,6 +542,7 @@ SFILES = ada-exp.y ada-lang.c ada-typeprint.c ada-valprint.c \
objc-exp.y objc-lang.c \
objfiles.c osabi.c observer.c \
p-exp.y p-lang.c p-typeprint.c p-valprint.c parse.c printcmd.c \
prologue-value.c \
regcache.c reggroups.c remote.c remote-fileio.c \
scm-exp.c scm-lang.c scm-valprint.c \
sentinel-frame.c \
@ -757,6 +758,7 @@ ppcnbsd_tdep_h = ppcnbsd-tdep.h
ppcobsd_tdep_h = ppcobsd-tdep.h
ppc_tdep_h = ppc-tdep.h
proc_utils_h = proc-utils.h
prologue_value_h = prologue-value.h
regcache_h = regcache.h
reggroups_h = reggroups.h
regset_h = regset.h
@ -867,6 +869,7 @@ HFILES_NO_SRCDIR = bcache.h buildsym.h call-cmds.h coff-solib.h defs.h \
symfile.h stabsread.h target.h terminal.h typeprint.h \
xcoffsolib.h \
macrotab.h macroexp.h macroscope.h \
prologue-value.h \
ada-lang.h c-lang.h f-lang.h \
jv-lang.h \
m2-lang.h p-lang.h \
@ -2437,6 +2440,8 @@ procfs.o: procfs.c $(defs_h) $(inferior_h) $(target_h) $(gdbcore_h) \
proc-service.o: proc-service.c $(defs_h) $(gdb_proc_service_h) $(inferior_h) \
$(symtab_h) $(target_h) $(gregset_h)
proc-why.o: proc-why.c $(defs_h) $(proc_utils_h)
prologue-value.o: prologue-value.c $(defs_h) $(gdb_string_h) $(gdb_assert_h) \
$(prologue_value_h) $(regcache_h)
p-typeprint.o: p-typeprint.c $(defs_h) $(gdb_obstack_h) $(bfd_h) $(symtab_h) \
$(gdbtypes_h) $(expression_h) $(value_h) $(gdbcore_h) $(target_h) \
$(language_h) $(p_lang_h) $(typeprint_h) $(gdb_string_h)


@ -1,3 +1,7 @@
2006-03-28 Jim Blandy <jimb@codesourcery.com>
* gdbint.texinfo (Prologue Analysis): New section.
2006-03-07 Jim Blandy <jimb@red-bean.com>
* gdb.texinfo (Connecting): Document 'target remote pipe'.


@ -287,6 +287,175 @@ used to create a new @value{GDBN} frame struct, and then
@code{DEPRECATED_INIT_EXTRA_FRAME_INFO} and
@code{DEPRECATED_INIT_FRAME_PC} will be called for the new frame.
@section Prologue Analysis
@cindex prologue analysis
@cindex call frame information
@cindex CFI (call frame information)
To produce a backtrace and allow the user to manipulate older frames'
variables and arguments, @value{GDBN} needs to find the base addresses
of older frames, and discover where those frames' registers have been
saved. Since a frame's ``callee-saves'' registers get saved by
younger frames if and when they're reused, a frame's registers may be
scattered unpredictably across younger frames. This means that
changing the value of a register-allocated variable in an older frame
may actually entail writing to a save slot in some younger frame.

Modern versions of GCC emit DWARF call frame information (``CFI''),
which describes how to find frame base addresses and saved registers.
But CFI is not always available, so as a fallback @value{GDBN} uses a
technique called @dfn{prologue analysis} to find frame sizes and saved
registers. A prologue analyzer disassembles the function's machine
code starting from its entry point, and looks for instructions that
allocate frame space, save the stack pointer in a frame pointer
register, save registers, and so on. Obviously, this can't be done
accurately in general, but it's tractable to do well enough to be very
helpful. Prologue analysis predates the GNU toolchain's support for
CFI; at one time, prologue analysis was the only mechanism
@value{GDBN} used for stack unwinding at all, when the function
calling conventions didn't specify a fixed frame layout.

In the olden days, function prologues were generated by hand-written,
target-specific code in GCC, and treated as opaque and untouchable by
optimizers. Looking at this code, it was usually straightforward to
write a prologue analyzer for @value{GDBN} that would accurately
understand all the prologues GCC would generate. However, over time
GCC became more aggressive about instruction scheduling, and began to
understand more about the semantics of the prologue instructions
themselves; in response, @value{GDBN}'s analyzers became more complex
and fragile. Keeping the prologue analyzers working as GCC (and the
instruction sets themselves) evolved became a substantial task.
@cindex @file{prologue-value.c}
@cindex abstract interpretation of function prologues
@cindex pseudo-evaluation of function prologues
To try to address this problem, the code in @file{prologue-value.h}
and @file{prologue-value.c} provides a general framework for writing
prologue analyzers that are simpler and more robust than ad-hoc
analyzers. When we analyze a prologue using the prologue-value
framework, we're really doing ``abstract interpretation'' or
``pseudo-evaluation'': running the function's code in simulation, but
using conservative approximations of the values registers and memory
would hold when the code actually runs. For example, if our function
starts with the instruction:
@example
addi r1, 42 # add 42 to r1
@end example
@noindent
we don't know exactly what value will be in @code{r1} after executing
this instruction, but we do know it'll be 42 greater than its original
value.
If we then see an instruction like:
@example
addi r1, 22 # add 22 to r1
@end example
@noindent
we still don't know what @code{r1}'s value is, but again, we can say
it is now 64 greater than its original value.
If the next instruction were:
@example
mov r2, r1 # set r2 to r1's value
@end example
@noindent
then we can say that @code{r2}'s value is now the original value of
@code{r1} plus 64.
It's common for prologues to save registers on the stack, so we'll
need to track the values of stack frame slots, as well as the
registers. So after an instruction like this:
@example
mov (fp+4), r2
@end example
@noindent
then we'd know that the stack slot four bytes above the frame pointer
holds the original value of @code{r1} plus 64.
And so on.
Of course, this can only go so far before it gets unreasonable. If we
wanted to be able to say anything about the value of @code{r1} after
the instruction:
@example
xor r1, r3 # exclusive-or r1 and r3, place result in r1
@end example
@noindent
then things would get pretty complex. But remember, we're just doing
a conservative approximation; if exclusive-or instructions aren't
relevant to prologues, we can just say @code{r1}'s value is now
``unknown''. We can ignore things that are too complex, if that loss of
information is acceptable for our application.
So when we say ``conservative approximation'' here, what we mean is an
approximation that is either accurate, or marked ``unknown'', but
never inaccurate.
Using this framework, a prologue analyzer is simply an interpreter for
machine code, but one that uses conservative approximations for the
contents of registers and memory instead of actual values. Starting
from the function's entry point, you simulate instructions up to the
current PC, or an instruction that you don't know how to simulate.
Now you can examine the state of the registers and stack slots you've
kept track of.
@itemize @bullet
@item
To see how large your stack frame is, just check the value of the
stack pointer register; if it's the original value of the SP
minus a constant, then that constant is the stack frame's size.
If the SP's value has been marked as ``unknown'', then that means
the prologue has done something too complex for us to track, and
we don't know the frame size.
@item
To see where we've saved the previous frame's registers, we just
search the values we've tracked --- stack slots, usually, but
registers, too, if you want --- for something equal to the register's
original value. If the calling conventions suggest a standard place
to save a given register, then we can check there first, but really,
anything that will get us back the original value will probably work.
@end itemize
This does take some work. But prologue analyzers aren't
quick-and-simple pattern matching to recognize a few fixed prologue
forms any more; they're big, hairy functions. Along with inferior
function calls, prologue analysis accounts for a substantial portion
of the time needed to stabilize a @value{GDBN} port. So it's
worthwhile to look for an approach that will be easier to understand
and maintain. In the approach described above:
@itemize @bullet
@item
It's easier to see that the analyzer is correct: you just see
whether the analyzer properly (albeit conservatively) simulates
the effect of each instruction.
@item
It's easier to extend the analyzer: you can add support for new
instructions, and know that you haven't broken anything that
wasn't already broken before.
@item
It's orthogonal: to gather new information, you don't need to
complicate the code for each instruction. As long as your domain
of conservative values is already detailed enough to tell you
what you need, then all the existing instruction simulations are
already gathering the right data for you.
@end itemize
The file @file{prologue-value.h} contains detailed comments explaining
the framework and how to use it.
@section Breakpoint Handling
@cindex breakpoints
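
The register walkthrough in the new Prologue Analysis section (two `addi` instructions, a `mov`, an `xor`, and the frame-size check on the stack pointer) can be sketched as a tiny stand-alone interpreter. This is only an illustrative model, not the code added by this commit: every `toy_` name is hypothetical, `uint64_t` stands in for `CORE_ADDR`, and the register numbers are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the prologue-value type.  Every "toy_" name
   is hypothetical and exists only in this sketch.  */
enum toy_kind { TOY_UNKNOWN, TOY_CONSTANT, TOY_REGISTER };

struct toy_pv
{
  enum toy_kind kind;
  int reg;       /* Meaningful only for TOY_REGISTER.  */
  uint64_t k;    /* The constant, or the offset from REG's entry value.  */
};

/* Made-up register numbers for the sketch.  */
enum { TOY_R1 = 1, TOY_R2 = 2, TOY_SP = 3, TOY_NUM_REGS = 4 };

/* Before simulating anything, each register holds its own entry value.  */
void
toy_init (struct toy_pv regs[TOY_NUM_REGS])
{
  int i;

  for (i = 0; i < TOY_NUM_REGS; i++)
    {
      regs[i].kind = TOY_REGISTER;
      regs[i].reg = i;
      regs[i].k = 0;
    }
}

/* Simulate "addi REG, IMM".  An unknown value stays unknown.  */
void
toy_addi (struct toy_pv regs[TOY_NUM_REGS], int reg, uint64_t imm)
{
  if (regs[reg].kind != TOY_UNKNOWN)
    regs[reg].k += imm;
}

/* Simulate "mov DST, SRC": DST now holds whatever we know about SRC.  */
void
toy_mov (struct toy_pv regs[TOY_NUM_REGS], int dst, int src)
{
  regs[dst] = regs[src];
}

/* Simulate "xor DST, SRC": too complex to track, so give up on DST.  */
void
toy_xor (struct toy_pv regs[TOY_NUM_REGS], int dst, int src)
{
  (void) src;
  regs[dst].kind = TOY_UNKNOWN;
}

/* If SP is still "entry SP plus a constant", store the negated constant
   in *SIZE and return 1; otherwise return 0.  The sketch does not check
   that the stack actually grew downward.  */
int
toy_frame_size (const struct toy_pv regs[TOY_NUM_REGS], uint64_t *size)
{
  const struct toy_pv *sp = &regs[TOY_SP];

  if (sp->kind != TOY_REGISTER || sp->reg != TOY_SP)
    return 0;
  *size = (uint64_t) 0 - sp->k;
  return 1;
}
```

A caller would run `toy_init`, apply one `toy_` step per decoded instruction, and then query the register array. Subtracting from the stack pointer is modelled as `toy_addi` with a negated immediate, which relies on unsigned wrap-around in the same way the real `CORE_ADDR` offsets do.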

gdb/prologue-value.c (new file, 591 lines)

@ -0,0 +1,591 @@
/* Prologue value handling for GDB.
Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
This file is part of GDB.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to:
Free Software Foundation, Inc.
51 Franklin St - Fifth Floor
Boston, MA 02110-1301
USA */
#include "defs.h"
#include "gdb_string.h"
#include "gdb_assert.h"
#include "prologue-value.h"
#include "regcache.h"
/* Constructors. */
pv_t
pv_unknown (void)
{
pv_t v = { pvk_unknown, 0, 0 };
return v;
}
pv_t
pv_constant (CORE_ADDR k)
{
pv_t v;
v.kind = pvk_constant;
v.reg = -1; /* for debugging */
v.k = k;
return v;
}
pv_t
pv_register (int reg, CORE_ADDR k)
{
pv_t v;
v.kind = pvk_register;
v.reg = reg;
v.k = k;
return v;
}
/* Arithmetic operations. */
/* If one of *A and *B is a constant, and the other isn't, swap the
values as necessary to ensure that *B is the constant. This can
reduce the number of cases we need to analyze in the functions
below. */
static void
constant_last (pv_t *a, pv_t *b)
{
if (a->kind == pvk_constant
&& b->kind != pvk_constant)
{
pv_t temp = *a;
*a = *b;
*b = temp;
}
}
pv_t
pv_add (pv_t a, pv_t b)
{
constant_last (&a, &b);
/* We can add a constant to a register. */
if (a.kind == pvk_register
&& b.kind == pvk_constant)
return pv_register (a.reg, a.k + b.k);
/* We can add a constant to another constant. */
else if (a.kind == pvk_constant
&& b.kind == pvk_constant)
return pv_constant (a.k + b.k);
/* Anything else we don't know how to add. We don't have a
representation for, say, the sum of two registers, or a multiple
of a register's value (adding a register to itself). */
else
return pv_unknown ();
}
pv_t
pv_add_constant (pv_t v, CORE_ADDR k)
{
/* Rather than thinking of all the cases we can and can't handle,
we'll just let pv_add take care of that for us. */
return pv_add (v, pv_constant (k));
}
pv_t
pv_subtract (pv_t a, pv_t b)
{
/* This isn't quite the same as negating B and adding it to A, since
we don't have a representation for the negation of anything but a
constant. For example, we can't negate { pvk_register, R1, 10 },
but we do know that { pvk_register, R1, 10 } minus { pvk_register,
R1, 5 } is { pvk_constant, <ignored>, 5 }.
This means, for example, that we could subtract two stack
addresses; they're both relative to the original SP. Since the
frame pointer is set based on the SP, its value will be the
original SP plus some constant (probably zero), so we can use its
value just fine, too. */
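/* Subtraction is not commutative, so the operands must stay in order
here; calling constant_last could turn "constant minus register"
into "register minus constant" and yield a wrong, non-unknown
result. With the operands left alone, that case falls through to
pv_unknown. */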
/* We can subtract two constants. */
if (a.kind == pvk_constant
&& b.kind == pvk_constant)
return pv_constant (a.k - b.k);
/* We can subtract a constant from a register. */
else if (a.kind == pvk_register
&& b.kind == pvk_constant)
return pv_register (a.reg, a.k - b.k);
/* We can subtract a register from itself, yielding a constant. */
else if (a.kind == pvk_register
&& b.kind == pvk_register
&& a.reg == b.reg)
return pv_constant (a.k - b.k);
/* We don't know how to subtract anything else. */
else
return pv_unknown ();
}
pv_t
pv_logical_and (pv_t a, pv_t b)
{
constant_last (&a, &b);
/* We can 'and' two constants. */
if (a.kind == pvk_constant
&& b.kind == pvk_constant)
return pv_constant (a.k & b.k);
/* We can 'and' anything with the constant zero. */
else if (b.kind == pvk_constant
&& b.k == 0)
return pv_constant (0);
/* We can 'and' anything with ~0. */
else if (b.kind == pvk_constant
&& b.k == ~ (CORE_ADDR) 0)
return a;
/* We can 'and' a register with itself. */
else if (a.kind == pvk_register
&& b.kind == pvk_register
&& a.reg == b.reg
&& a.k == b.k)
return a;
/* Otherwise, we don't know. */
else
return pv_unknown ();
}
/* Examining prologue values. */
int
pv_is_identical (pv_t a, pv_t b)
{
if (a.kind != b.kind)
return 0;
switch (a.kind)
{
case pvk_unknown:
return 1;
case pvk_constant:
return (a.k == b.k);
case pvk_register:
return (a.reg == b.reg && a.k == b.k);
default:
gdb_assert (0);
}
}
int
pv_is_constant (pv_t a)
{
return (a.kind == pvk_constant);
}
int
pv_is_register (pv_t a, int r)
{
return (a.kind == pvk_register
&& a.reg == r);
}
int
pv_is_register_k (pv_t a, int r, CORE_ADDR k)
{
return (a.kind == pvk_register
&& a.reg == r
&& a.k == k);
}
enum pv_boolean
pv_is_array_ref (pv_t addr, CORE_ADDR size,
pv_t array_addr, CORE_ADDR array_len,
CORE_ADDR elt_size,
int *i)
{
/* Note that, since .k is a CORE_ADDR, and CORE_ADDR is unsigned, if
addr is *before* the start of the array, then this isn't going to
be negative... */
pv_t offset = pv_subtract (addr, array_addr);
if (offset.kind == pvk_constant)
{
/* This is a rather odd test. We want to know if the SIZE bytes
at ADDR don't overlap the array at all, so you'd expect it to
be an || expression: "if we're completely before || we're
completely after". But with unsigned arithmetic, things are
different: since it's a number circle, not a number line, the
right values for offset.k are actually one contiguous range. */
if (offset.k <= -size
&& offset.k >= array_len * elt_size)
return pv_definite_no;
else if (offset.k % elt_size != 0
|| size != elt_size)
return pv_maybe;
else
{
*i = offset.k / elt_size;
return pv_definite_yes;
}
}
else
return pv_maybe;
}
/* Areas. */
/* A particular value known to be stored in an area.
Entries form a ring, sorted by unsigned offset from the area's base
register's value. Since entries can straddle the wrap-around point,
unsigned offsets form a circle, not a number line, so the list
itself is structured the same way --- there is no inherent head.
The entry with the lowest offset simply follows the entry with the
highest offset. Entries may abut, but never overlap. The area's
'entry' pointer points to an arbitrary node in the ring. */
struct area_entry
{
/* Links in the doubly-linked ring. */
struct area_entry *prev, *next;
/* Offset of this entry's address from the value of the base
register. */
CORE_ADDR offset;
/* The size of this entry. Note that an entry may wrap around from
the end of the address space to the beginning. */
CORE_ADDR size;
/* The value stored here. */
pv_t value;
};
struct pv_area
{
/* This area's base register. */
int base_reg;
/* The mask to apply to addresses, to make the wrap-around happen at
the right place. */
CORE_ADDR addr_mask;
/* An element of the doubly-linked ring of entries, or zero if we
have none. */
struct area_entry *entry;
};
struct pv_area *
make_pv_area (int base_reg)
{
struct pv_area *a = (struct pv_area *) xmalloc (sizeof (*a));
memset (a, 0, sizeof (*a));
a->base_reg = base_reg;
a->entry = 0;
/* Remember that shift amounts equal to the type's width are
undefined. */
a->addr_mask = ((((CORE_ADDR) 1 << (TARGET_ADDR_BIT - 1)) - 1) << 1) | 1;
return a;
}
/* Delete all entries from AREA. */
static void
clear_entries (struct pv_area *area)
{
struct area_entry *e = area->entry;
if (e)
{
/* This needs to be a do-while loop, in order to actually
process the node being checked for in the terminating
condition. */
do
{
struct area_entry *next = e->next;
xfree (e);
/* Advance to the next node; without this, only the first
entry would be freed. */
e = next;
}
while (e != area->entry);
area->entry = 0;
}
}
void
free_pv_area (struct pv_area *area)
{
clear_entries (area);
xfree (area);
}
static void
do_free_pv_area_cleanup (void *arg)
{
free_pv_area ((struct pv_area *) arg);
}
struct cleanup *
make_cleanup_free_pv_area (struct pv_area *area)
{
return make_cleanup (do_free_pv_area_cleanup, (void *) area);
}
int
pv_area_store_would_trash (struct pv_area *area, pv_t addr)
{
/* It may seem odd that pvk_constant appears here --- after all,
that's the case where we know the most about the address! But
pv_areas are always relative to a register, and we don't know the
value of the register, so we can't compare entry addresses to
constants. */
return (addr.kind == pvk_unknown
|| addr.kind == pvk_constant
|| (addr.kind == pvk_register && addr.reg != area->base_reg));
}
/* Return a pointer to the first entry we hit in AREA starting at
OFFSET and going forward.
This may return zero, if AREA has no entries.
And since the entries are a ring, this may return an entry that
entirely precedes OFFSET. This is the correct behavior: depending
on the sizes involved, we could still overlap such an area, with
wrap-around. */
static struct area_entry *
find_entry (struct pv_area *area, CORE_ADDR offset)
{
struct area_entry *e = area->entry;
if (! e)
return 0;
/* If the next entry would be better than the current one, then scan
forward. Since we use '<' in this loop, it always terminates.
Note that, even setting aside the addr_mask stuff, we must not
simplify this, in high school algebra fashion, to
(e->next->offset < e->offset), because of the way < interacts
with wrap-around. We have to subtract offset from both sides to
make sure both things we're comparing are on the same side of the
discontinuity. */
while (((e->next->offset - offset) & area->addr_mask)
< ((e->offset - offset) & area->addr_mask))
e = e->next;
/* If the previous entry would be better than the current one, then
scan backwards. */
while (((e->prev->offset - offset) & area->addr_mask)
< ((e->offset - offset) & area->addr_mask))
e = e->prev;
/* In case there's some locality to the searches, set the area's
pointer to the entry we've found. */
area->entry = e;
return e;
}
/* Return non-zero if the SIZE bytes at OFFSET would overlap ENTRY;
return zero otherwise. AREA is the area to which ENTRY belongs. */
static int
overlaps (struct pv_area *area,
struct area_entry *entry,
CORE_ADDR offset,
CORE_ADDR size)
{
/* Think carefully about wrap-around before simplifying this. */
return (((entry->offset - offset) & area->addr_mask) < size
|| ((offset - entry->offset) & area->addr_mask) < entry->size);
}
void
pv_area_store (struct pv_area *area,
pv_t addr,
CORE_ADDR size,
pv_t value)
{
/* Remove any (potentially) overlapping entries. */
if (pv_area_store_would_trash (area, addr))
clear_entries (area);
else
{
CORE_ADDR offset = addr.k;
struct area_entry *e = find_entry (area, offset);
/* Delete all entries that we would overlap. */
while (e && overlaps (area, e, offset, size))
{
struct area_entry *next = (e->next == e) ? 0 : e->next;
e->prev->next = e->next;
e->next->prev = e->prev;
xfree (e);
e = next;
}
/* Move the area's pointer to the next remaining entry. This
will also zero the pointer if we've deleted all the entries. */
area->entry = e;
}
/* Now, there are no entries overlapping us, and area->entry is
either zero or pointing at the closest entry after us. We can
just insert ourselves before that.
But if we're storing an unknown value, don't bother --- that's
the default. */
if (value.kind == pvk_unknown)
return;
else
{
CORE_ADDR offset = addr.k;
struct area_entry *e = (struct area_entry *) xmalloc (sizeof (*e));
e->offset = offset;
e->size = size;
e->value = value;
if (area->entry)
{
e->prev = area->entry->prev;
e->next = area->entry;
e->prev->next = e->next->prev = e;
}
else
{
e->prev = e->next = e;
area->entry = e;
}
}
}
pv_t
pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size)
{
/* If we have no entries, or we can't decide how ADDR relates to the
entries we do have, then the value is unknown. */
if (! area->entry
|| pv_area_store_would_trash (area, addr))
return pv_unknown ();
else
{
CORE_ADDR offset = addr.k;
struct area_entry *e = find_entry (area, offset);
/* If this entry exactly matches what we're looking for, then
we're set. Otherwise, say it's unknown. */
if (e->offset == offset && e->size == size)
return e->value;
else
return pv_unknown ();
}
}
int
pv_area_find_reg (struct pv_area *area,
struct gdbarch *gdbarch,
int reg,
CORE_ADDR *offset_p)
{
struct area_entry *e = area->entry;
if (e)
do
{
if (e->value.kind == pvk_register
&& e->value.reg == reg
&& e->value.k == 0
&& e->size == register_size (gdbarch, reg))
{
if (offset_p)
*offset_p = e->offset;
return 1;
}
e = e->next;
}
while (e != area->entry);
return 0;
}
void
pv_area_scan (struct pv_area *area,
void (*func) (void *closure,
pv_t addr,
CORE_ADDR size,
pv_t value),
void *closure)
{
struct area_entry *e = area->entry;
pv_t addr;
addr.kind = pvk_register;
addr.reg = area->base_reg;
if (e)
do
{
addr.k = e->offset;
func (closure, addr, e->size, e->value);
e = e->next;
}
while (e != area->entry);
}
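
The wrap-around test in `overlaps` is easy to misread, so here is a stand-alone sketch of the same comparison with the entry passed as plain numbers. The function name `ring_overlaps` and the use of `uint64_t` in place of `CORE_ADDR` are assumptions made for illustration; the expression itself mirrors the one in the file above.

```c
#include <assert.h>
#include <stdint.h>

/* Return non-zero if the SIZE bytes at OFFSET overlap an entry of
   ENTRY_SIZE bytes at ENTRY_OFFSET.  MASK plays the role of the area's
   addr_mask: it makes the subtraction wrap at the target's address
   width rather than at the width of uint64_t.  */
int
ring_overlaps (uint64_t mask,
               uint64_t entry_offset, uint64_t entry_size,
               uint64_t offset, uint64_t size)
{
  /* Either the entry starts within the stored range, or the stored
     range starts within the entry.  Both distances are measured
     going forward around the ring.  */
  return (((entry_offset - offset) & mask) < size
          || ((offset - entry_offset) & mask) < entry_size);
}
```

With a 32-bit mask, an 8-byte entry at offset `0xfffffffc` covers the last four bytes of the address space and the first four. A 4-byte store at offset 0 therefore overlaps it, while a 4-byte store at offset 4 merely abuts it.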

gdb/prologue-value.h (new file, 302 lines)

@ -0,0 +1,302 @@
/* Interface to prologue value handling for GDB.
Copyright 2003, 2004, 2005 Free Software Foundation, Inc.
This file is part of GDB.
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to:
Free Software Foundation, Inc.
51 Franklin St - Fifth Floor
Boston, MA 02110-1301
USA */
#ifndef PROLOGUE_VALUE_H
#define PROLOGUE_VALUE_H
/* When we analyze a prologue, we're really doing 'abstract
interpretation' or 'pseudo-evaluation': running the function's code
in simulation, but using conservative approximations of the values
it would have when it actually runs. For example, if our function
starts with the instruction:
addi r1, 42 # add 42 to r1
we don't know exactly what value will be in r1 after executing this
instruction, but we do know it'll be 42 greater than its original
value.
If we then see an instruction like:
addi r1, 22 # add 22 to r1
we still don't know what r1's value is, but again, we can say it is
now 64 greater than its original value.
If the next instruction were:
mov r2, r1 # set r2 to r1's value
then we can say that r2's value is now the original value of r1
plus 64.
It's common for prologues to save registers on the stack, so we'll
need to track the values of stack frame slots, as well as the
registers. So after an instruction like this:
mov (fp+4), r2
then we'd know that the stack slot four bytes above the frame
pointer holds the original value of r1 plus 64.
And so on.
Of course, this can only go so far before it gets unreasonable. If
we wanted to be able to say anything about the value of r1 after
the instruction:
xor r1, r3 # exclusive-or r1 and r3, place result in r1
then things would get pretty complex. But remember, we're just
doing a conservative approximation; if exclusive-or instructions
aren't relevant to prologues, we can just say r1's value is now
'unknown'. We can ignore things that are too complex, if that loss
of information is acceptable for our application.
So when I say "conservative approximation" here, what I mean is an
approximation that is either accurate, or marked "unknown", but
never inaccurate.
Once you've reached the current PC, or an instruction that you
don't know how to simulate, you stop. Now you can examine the
state of the registers and stack slots you've kept track of.
- To see how large your stack frame is, just check the value of the
stack pointer register; if it's the original value of the SP
minus a constant, then that constant is the stack frame's size.
If the SP's value has been marked as 'unknown', then that means
the prologue has done something too complex for us to track, and
we don't know the frame size.
- To see where we've saved the previous frame's registers, we just
search the values we've tracked --- stack slots, usually, but
registers, too, if you want --- for something equal to the
register's original value. If the ABI suggests a standard place
to save a given register, then we can check there first, but
really, anything that will get us back the original value will
probably work.
Sure, this takes some work. But prologue analyzers aren't
quick-and-simple pattern matching to recognize a few fixed prologue
forms any more; they're big, hairy functions. Along with inferior
function calls, prologue analysis accounts for a substantial
portion of the time needed to stabilize a GDB port. So I think
it's worthwhile to look for an approach that will be easier to
understand and maintain. In the approach used here:
- It's easier to see that the analyzer is correct: you just see
whether the analyzer properly (albeit conservatively) simulates
the effect of each instruction.
- It's easier to extend the analyzer: you can add support for new
instructions, and know that you haven't broken anything that
wasn't already broken before.
- It's orthogonal: to gather new information, you don't need to
complicate the code for each instruction. As long as your domain
of conservative values is already detailed enough to tell you
what you need, then all the existing instruction simulations are
already gathering the right data for you.
A 'struct prologue_value' is a conservative approximation of the
real value the register or stack slot will have. */
struct prologue_value {
/* What sort of value is this? This determines the interpretation
of subsequent fields. */
enum {
/* We don't know anything about the value. This is also used for
values we could have kept track of, when doing so would have
been too complex and we don't want to bother. The bottom of
our lattice. */
pvk_unknown,
/* A known constant. K is its value. */
pvk_constant,
/* The value that register REG originally had *UPON ENTRY TO THE
FUNCTION*, plus K. If K is zero, this means, obviously, just
the value REG had upon entry to the function. REG is a GDB
register number. Before we start interpreting, we initialize
every register R to { pvk_register, R, 0 }. */
pvk_register,
} kind;
/* The meanings of the following fields depend on 'kind'; see the
comments for the specific 'kind' values. */
int reg;
CORE_ADDR k;
};
typedef struct prologue_value pv_t;
/* Return the unknown prologue value --- { pvk_unknown, ?, ? }. */
pv_t pv_unknown (void);
/* Return the prologue value representing the constant K. */
pv_t pv_constant (CORE_ADDR k);
/* Return the prologue value representing the original value of
register REG, plus the constant K. */
pv_t pv_register (int reg, CORE_ADDR k);
/* Return conservative approximations of the results of the following
operations. */
pv_t pv_add (pv_t a, pv_t b); /* a + b */
pv_t pv_add_constant (pv_t v, CORE_ADDR k); /* v + k */
pv_t pv_subtract (pv_t a, pv_t b); /* a - b */
pv_t pv_logical_and (pv_t a, pv_t b); /* a & b */
/* Return non-zero iff A and B are identical expressions.
This is not the same as asking if the two values are equal; the
result of such a comparison would have to be a pv_boolean, and
asking whether two 'unknown' values were equal would give you
pv_maybe. Same for comparing, say, { pvk_register, R1, 0 } and {
pvk_register, R2, 0}.
Instead, this function asks whether the two representations are the
same. */
int pv_is_identical (pv_t a, pv_t b);
/* Return non-zero if A is known to be a constant. */
int pv_is_constant (pv_t a);
/* Return non-zero if A is the original value of register number R
plus some constant, zero otherwise. */
int pv_is_register (pv_t a, int r);
/* Return non-zero if A is the original value of register R plus the
constant K. */
int pv_is_register_k (pv_t a, int r, CORE_ADDR k);
/* A conservative boolean type, including "maybe", when we can't
figure out whether something is true or not. */
enum pv_boolean {
pv_maybe,
pv_definite_yes,
pv_definite_no,
};
/* Decide whether a reference to SIZE bytes at ADDR refers exactly to
an element of an array. The array starts at ARRAY_ADDR, and has
ARRAY_LEN values of ELT_SIZE bytes each. If ADDR definitely does
refer to an array element, set *I to the index of the referenced
element in the array, and return pv_definite_yes. If it definitely
doesn't, return pv_definite_no. If we can't tell, return pv_maybe.
If the reference does touch the array, but doesn't fall exactly on
an element boundary, or doesn't refer to the whole element, return
pv_maybe. */
enum pv_boolean pv_is_array_ref (pv_t addr, CORE_ADDR size,
pv_t array_addr, CORE_ADDR array_len,
CORE_ADDR elt_size,
int *i);
/* A 'struct pv_area' keeps track of values stored in a particular
region of memory. */
struct pv_area;
/* Create a new area, tracking stores relative to the original value
of BASE_REG. If BASE_REG is SP, then this effectively records the
contents of the stack frame: the original value of the SP is the
frame's CFA, or some constant offset from it.
Stores to constant addresses, unknown addresses, or to addresses
relative to registers other than BASE_REG will trash this area; see
pv_area_store_would_trash. */
struct pv_area *make_pv_area (int base_reg);
/* Free AREA. */
void free_pv_area (struct pv_area *area);
/* Register a cleanup to free AREA. */
struct cleanup *make_cleanup_free_pv_area (struct pv_area *area);
/* Store the SIZE-byte value VALUE at ADDR in AREA.
If ADDR is not relative to the same base register we used in
creating AREA, then we can't tell which values here the stored
value might overlap, and we'll have to mark everything as
unknown. */
void pv_area_store (struct pv_area *area,
pv_t addr,
CORE_ADDR size,
pv_t value);
/* Return the SIZE-byte value at ADDR in AREA. This may return
pv_unknown (). */
pv_t pv_area_fetch (struct pv_area *area, pv_t addr, CORE_ADDR size);
/* Return true if storing to address ADDR in AREA would force us to
mark the contents of the entire area as unknown. This could happen
if, say, ADDR is unknown, since we could be storing anywhere. Or,
it could happen if ADDR is relative to a different register than
the other stores' base register, since we don't know the relative
values of the two registers.
If you've reached such a store, it may be better to simply stop the
prologue analysis, and return the information you've gathered,
instead of losing all that information, most of which is probably
okay. */
int pv_area_store_would_trash (struct pv_area *area, pv_t addr);
/* Search AREA for the original value of REG. If we can't find
it, return zero; if we can find it, return a non-zero value, and if
OFFSET_P is non-zero, set *OFFSET_P to the register's offset within
AREA. GDBARCH is the architecture of which REG is a member.
In the worst case, this takes time proportional to the number of
items stored in AREA. If you plan to gather a lot of information
about registers saved in AREA, consider calling pv_area_scan
instead, and collecting all your information in one pass. */
int pv_area_find_reg (struct pv_area *area,
struct gdbarch *gdbarch,
int reg,
CORE_ADDR *offset_p);
/* For every part of AREA whose value we know, apply FUNC to CLOSURE,
the value's address, its size, and the value itself. */
void pv_area_scan (struct pv_area *area,
void (*func) (void *closure,
pv_t addr,
CORE_ADDR size,
pv_t value),
void *closure);
#endif /* PROLOGUE_VALUE_H */
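
The three-valued answer described for `pv_is_array_ref` can be tried out in isolation. The sketch below assumes the subtraction of the array's address from the reference's address has already produced a constant offset, and works on that offset directly. The `sketch_` names are hypothetical, `uint64_t` stands in for `CORE_ADDR`, and `elt_size` is assumed to be non-zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the conservative boolean in the header.  */
enum sketch_boolean
{
  SKETCH_MAYBE,
  SKETCH_DEFINITE_YES,
  SKETCH_DEFINITE_NO
};

/* OFFSET is the reference's address minus the array's address, in
   unsigned arithmetic.  SIZE is the width of the reference; the array
   has ARRAY_LEN elements of ELT_SIZE bytes each (ELT_SIZE != 0).  */
enum sketch_boolean
sketch_array_ref (uint64_t offset, uint64_t size,
                  uint64_t array_len, uint64_t elt_size, int *i)
{
  /* On the number circle, the offsets that miss the array form one
     contiguous range: at or past the array's end, and no later than
     SIZE bytes before the wrap-around point.  */
  if (offset <= (uint64_t) 0 - size
      && offset >= array_len * elt_size)
    return SKETCH_DEFINITE_NO;
  /* The reference touches the array, but not exactly one element.  */
  else if (offset % elt_size != 0 || size != elt_size)
    return SKETCH_MAYBE;
  else
    {
      *i = (int) (offset / elt_size);
      return SKETCH_DEFINITE_YES;
    }
}
```

For a four-element array of 8-byte elements, an 8-byte reference at offset 8 is element 1. A reference that ends exactly where the array begins (offset "minus 8") is a definite miss, and one that straddles the array's start (offset "minus 4") can only be answered with "maybe".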