Since commit 359efc2d89 ("[gdb/testsuite] Make gdb.base/annota1.exp more
robust") we see this fail with target board unix/-fPIE/-pie:
...
FAIL: gdb.base/annota1.exp: run until main breakpoint (timeout)
...
The problem is that the commit makes the number and order of matched
annotations fixed, while between target boards unix and unix/-fPIE/-pie there
is a difference:
...
\032\032post-prompt
Starting program: outputs/gdb.base/annota1/annota1
+\032\032breakpoints-invalid
+
\032\032starting
\032\032frames-invalid
...
Fix this by optionally matching the additional annotation.
Tested on x86_64-linux.
As reported in PR29043, when running test-case gdb.dwarf2/dw2-lines.exp with
target board unix/-m32/-fPIE/-pie, we run into:
...
Breakpoint 2, 0x56555540 in bar ()^M
(gdb) PASS: gdb.dwarf2/dw2-lines.exp: cv=2: cdw=32: lv=2: ldw=32: \
continue to breakpoint: foo \(1\)
next^M
Single stepping until exit from function bar,^M
which has no line number information.^M
0x56555587 in main ()^M
(gdb) FAIL: gdb.dwarf2/dw2-lines.exp: cv=2: cdw=32: lv=2: ldw=32: \
next to foo (2)
...
The problem is that the bar breakpoint ends up at an unexpected location
because:
- the synthetic debug info is incomplete and doesn't provide line info
for the prologue part of the function, so consequently gdb uses the i386
port prologue skipper to get past the prologue
- the i386 port prologue skipper doesn't get past a get_pc_thunk call.
Work around this in the test-case by breaking on bar_label instead.
Tested on x86_64-linux with target boards unix, unix/-m32, unix/-fPIE/-pie and
unix/-m32/-fPIE/-pie.
I noticed that a build of GDB with GCC + --enable-ubsan, testing
against GDBserver showed this GDB crash:
(gdb) PASS: gdb.trace/trace-condition.exp: trace: 0x00abababcdcdcdcd << 46 == 0x7373400000000000: advance to trace begin
tstart
../../src/gdb/valarith.c:1365:15: runtime error: left shift of 48320975398096333 by 46 places cannot be represented in type 'long int'
ERROR: GDB process no longer exists
GDB process exited with wait status 269549 exp9 0 1
UNRESOLVED: gdb.trace/trace-condition.exp: trace: 0x00abababcdcdcdcd << 46 == 0x7373400000000000: start trace experiment
The problem is that, "0x00abababcdcdcdcd << 46" is an undefined signed
left shift, because the result is not representable in the type of the
lhs, which is signed. This actually became defined in C++20, and if
you compile with "g++ -std=c++20 -Wall", you'll see that GCC no longer
warns about it, while it warns if you specify prior language versions.
While at it, there are a couple other situations that are undefined
(and are still undefined in C++20) and result in GDB dying: shifting
by a negative ammount, or by >= than the bit size of the promoted lhs.
For the latter, GDB shifts using (U)LONGEST internally, so you have to
shift by >= 64 bits to see it:
$ gdb --batch -q -ex "p 1 << -1"
../../src/gdb/valarith.c:1365:15: runtime error: shift exponent -1 is negative
$ # gdb exited
$ gdb --batch -q -ex "p 1 << 64"
../../src/gdb/valarith.c:1365:15: runtime error: shift exponent 64 is too large for 64-bit type 'long int'
$ # gdb exited
Also, right shifting a negative value is implementation-defined
(before C++20, after which it is defined). For this, I chose to
change nothing in GDB other than adding tests, as I don't really know
whether we need to do anything. AFAIK, most implementations do an
arithmetic right shift, and it may be we don't support any host or
target that behaves differently. Plus, this becomes defined in C++20
exactly as arithmetic right shift.
Compilers don't error out on such shifts, at best they warn, so I
think GDB should just continue doing the shifts anyhow too.
Thus:
- Adjust scalar_binop to avoid the undefined paths, either by adding
explicit result paths, or by casting the lhs of the left shift to
unsigned, as appropriate.
For the shifts by a too-large count, I made the result be what you'd
get if you split the large count in a series of smaller shifts.
Thus:
Left shift, positive or negative lhs:
V << 64
=> V << 16 << 16 << 16 << 16
=> 0
Right shift, positive lhs:
Vpos >> 64
=> Vpos >> 16 >> 16 >> 16 >> 16
=> 0
Right shift, negative lhs:
Vneg >> 64
=> Vneg >> 16 >> 16 >> 16 >> 16
=> -1
This is actually Go's semantics (the compiler really emits
instructions to make it so that you get 0 or -1 if you have a
too-large shift). So for that language GDB does the shift and
nothing else. For other C-like languages where such a shift is
undefined, GDB warns in addition to performing the shift.
For shift by a negative count, for Go, this is a hard error. For
other languages, since their compilers only warn, I made GDB warn
too. The semantics I chose (we're free to pick them since this is
undefined behavior) is as-if you had shifted by the count cast to
unsigned, thus as if you had shifted by a too-large count, thus the
same as the previous scenario illustrated above.
Examples:
(gdb) set language go
(gdb) p 1 << 100
$1 = 0
(gdb) p -1 << 100
$2 = 0
(gdb) p 1 >> 100
$3 = 0
(gdb) p -1 >> 100
$4 = -1
(gdb) p -2 >> 100
$5 = -1
(gdb) p 1 << -1
left shift count is negative
(gdb) set language c
(gdb) p -2 >> 100
warning: right shift count >= width of type
$6 = -1
(gdb) p -2 << 100
warning: left shift count >= width of type
$7 = 0
(gdb) p 1 << -1
warning: left shift count is negative
$8 = 0
(gdb) p -1 >> -1
warning: right shift count is negative
$9 = -1
- The warnings' texts are the same as what GCC prints.
- Add comprehensive tests in a new gdb.base/bitshift.exp testcase, so
that we exercise all these scenarios.
Change-Id: I8bcd5fa02de3114b7ababc03e65702d86ec8d45d
This commit ports these two fixes to the C parser:
commit ebf13736b4
CommitDate: Thu Sep 4 21:46:28 2014 +0100
parse_number("0") reads uninitialized memory
commit 20562150d8
CommitDate: Wed Oct 3 15:19:06 2018 -0600
Avoid undefined behavior in parse_number
... to the Fortran, Go, and Fortran number parsers, fixing the same
problems there.
Also add a new testcase that exercises printing 0xffffffffffffffff
(max 64-bit) in all languages, which crashes a GDB built with UBsan
without the fix.
I moved get_set_option_choices out of all-architectures.exp.tcl to
common code to be able to extract all the supported languages. I did
a tweak to it to generalize it a bit -- you now have to pass down the
"set" part of the command as well. This is so that the proc can be
used with "maintenance set" commands as well in future.
Change-Id: I8e8f2fdc1e8407f63d923c26fd55d98148b9e16a
I see this failure:
(gdb) run ^M
Starting program: /home/smarchi/build/binutils-gdb/gdb/testsuite/outputs/gdb.dwarf2/dw2-inline-param/dw2-inline-param ^M
Warning:^M
Cannot insert breakpoint 1.^M
Cannot access memory at address 0x113b^M
^M
(gdb) FAIL: gdb.dwarf2/dw2-inline-param.exp: runto: run to *0x113b
The test loads the binary in GDB, grabs the address of a symbol, strips
the binary, reloads it in GDB, runs the program, and then tries to place
a breakpoint at that address. The problem is that the binary is built
as position independent, so the address GDB grabs in the first place
isn't where the code ends up after running.
Fix this by linking the binary as non-position-independent. The
alternative would be to compute the relocated address where to place the
breakpoint, but that's not very straightforward, unfortunately.
I was confused for a while, I was trying to load the binary in GDB
manually to get the symbol address, but GDB was telling me the symbol
could not be found. Meanwhile, it clearly worked in gdb.log. The thing
is that GDB strips the binary in-place, so we don't have access to the
intermediary binary with symbols. Change the test to output the
stripped binary to a separate file instead.
Change-Id: I66c56293df71b1ff49cf748d6784bd0e935211ba
Add the print of the base-class of an extended type to the output of
ptype. This requires the Fortran compiler to emit DW_AT_inheritance
for the extended type.
Co-authored-by: Nils-Christian Kempke <nils-christian.kempke@intel.com>
Fortran 2003 supports type extension. This patch allows access
to inherited members by using their fully qualified name as described
in the Fortran standard.
In doing so the patch also fixes a bug in GDB when trying to access the
members of a base class in a derived class via the derived class' base
class member.
This patch fixes PR22497 and PR26373 on GDB side.
Using the example Fortran program from PR22497
program mvce
implicit none
type :: my_type
integer :: my_int
end type my_type
type, extends(my_type) :: extended_type
end type extended_type
type(my_type) :: foo
type(extended_type) :: bar
foo%my_int = 0
bar%my_int = 1
print*, foo, bar
end program mvce
and running this with GDB and setting a BP at 17:
Before:
(gdb) p bar%my_type
A syntax error in expression, near `my_type'.
(gdb) p bar%my_int
There is no member named my_int.
(gdb) p bar%my_type%my_int
A syntax error in expression, near `my_type%my_int'.
(gdb) p bar
$1 = ( my_type = ( my_int = 1 ) )
After:
(gdb) p bar%my_type
$1 = ( my_int = 1 )
(gdb) p bar%my_int
$2 = 1 # this line requires DW_TAG_inheritance to work
(gdb) p bar%my_type%my_int
$3 = 1
(gdb) p bar
$4 = ( my_type = ( my_int = 1 ) )
In the above example "p bar%my_int" requires the compiler to emit
information about the inheritance relationship between extended_type
and my_type which gfortran and flang currently do not de. The
respective issue gcc/49475 has been put as kfail.
Co-authored-by: Nils-Christian Kempke <nils-christian.kempke@intel.com>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26373https://sourceware.org/bugzilla/show_bug.cgi?id=22497
Straightforward change, return an std::string instead of a
gdb::unique_xmalloc_ptr<char>. No behavior change expected.
Change-Id: Ia5e94c94221c35f978bb1b7bdffbff7209e0520e
Commit:
commit df7a7bdd97
Date: Thu Mar 17 18:56:23 2022 +0000
gdb: add support for Fortran's ASSUMED RANK arrays
Added support for Fortran assumed rank arrays. Unfortunately, this
commit contained a bug that means though GDB can correctly calculate
the rank of an assumed rank array, GDB can't fetch the contents of an
assumed rank array.
The history of this patch can be seen on the mailing list here:
https://sourceware.org/pipermail/gdb-patches/2022-January/185306.html
The patches that were finally committed can be found here:
https://sourceware.org/pipermail/gdb-patches/2022-March/186906.html
The original patches did support fetching the array contents, it was
only the later series that introduced the regression.
The problem is that when calculating the array rank the result is a
count of the number of ranks, i.e. this is a 1 based result, 1, 2, 3,
etc.
In contrast, when computing the details of any particular rank the
value passed to the DWARF expression evaluator should be a 0 based
rank offset, i.e. a 0 based number, 0, 1, 2, etc.
In the patches that were originally merged, this was not the case, and
we were passing the 1 based rank number to the expression evaluator,
e.g. passing 1 when we should pass 0, 2 when we should pass 1, etc.
As a result the DWARF expression evaluator was reading the
wrong (undefined) memory, and returning garbage results.
In this commit I have extended the test case to cover checking the
array contents, I've then ensured we make use of the correct rank
value, and extended some comments, and added or adjusted some asserts
as appropriate.
Make gdb_compile handle a new "macros" option, which makes it pass the
appropriate flag to make the compiler include macro information in the
debug info. This will help simplify tests using macros, reduce
redundant code, and make it easier to add support for a new compiler.
Right now it only handles clang specially (using -fdebug-macro) and
falls back to -g3 otherwise (which works for gcc). Other compilers can
be added as needed.
There are some tests that are currently skipped if the compiler is nor
gcc nor clang. After this patch, the tests will attempt to run (the -g3
fall back will be used). That gives a chance to people using other
compilers to notice something is wrong and maybe add support for their
compiler. If it is needed to support a compiler that doesn't have a way
to include macro information, then we can always introduce a
"skip_macro_tests" that can be used to skip over them.
Change-Id: I50cd6ab1bfbb478c1005486408e214b551364c9b
On openSUSE Tumbleweed I run into:
...
FAIL: gdb.base/annota1.exp: run until main breakpoint (timeout)
...
The problem is that the libthread_db message occurs at a location where it's
not expected:
...
Starting program: outputs/gdb.base/annota1/annota1 ^M
^M
^Z^Zstarting^M
^M
^Z^Zframes-invalid^M
[Thread debugging using libthread_db enabled]^M
Using host libthread_db library "/lib64/libthread_db.so.1".^M
^M
^Z^Zbreakpoints-invalid^M
^M
...
Fix this by making the matching more robust:
- rewrite the regexp such that each annotation is on a single line,
starting with \r\n\032\032 and ending with \r\n
- add a regexp variable optional_re, that matches all possible optional
output, and use it as a separator in the first part of the regexp
Tested on x86_64-linux.
By calling `uplevel $body` in the program proc (a pattern we use at many
places), we can get rid of curly braces around each line number program
directive. That seems like a nice small improvement to me.
Change-Id: Ib327edcbffbd4c23a08614adee56c12ea25ebc0b
Same idea as previous patch, but for symtab::objfile. I find
it clearer without this wrapper, as it shows that the objfile is
common to all symtabs of a given compunit. Otherwise, you could think
that each symtab (of a given compunit) can have a specific objfile.
Change-Id: Ifc0dbc7ec31a06eefa2787c921196949d5a6fcc6
symtab::blockvector is a wrapper around compunit_symtab::blockvector.
It is a bit misleadnig, as it gives the impression that a symtab has a
blockvector. Remove it, change all users to fetch the blockvector
through the compunit instead.
Change-Id: Ibd062cd7926112a60d52899dff9224591cbdeebf
I think the symtab::dirname method is bogus, or at least very
misleading. It makes you think that it returns the directory that was
used to find that symtab's file during compilation (i.e. the directory
the file refers to in the DWARF line header file table), or the
directory part of the symtab's filename maybe. In fact, it returns the
compilation unit's directory, which is the CWD of the compiler, at
compilation time. At least for DWARF, if the symtab's filename is
relative, it will be relative to that directory. But if the symtab's
filename is absolute, then the directory returned by symtab::dirname has
nothing to do with the symtab's filename.
Remove symtab::dirname to avoid this confusion, change all users to
fetch the same information through the compunit. At least, it will be
clear that this is a compunit property, not a symtab property.
Change-Id: I2894c3bf3789d7359a676db3c58be2c10763f5f0
Change gdb_breakpoint to accept a linespec, not just a function. In
fact, no behavior changes are necessary, this only changes the parameter
name and documentation. Change runto as well, since the two are so
close (runto forwards all its arguments to gdb_breakpoint).
I wrote this for a downstrean GDB port, but thought it could be
useful upstream, eventually, even though not callers take advantage of
it yet.
Change-Id: I08175fd444d5a60df90fd9985e1b5dfd87c027cc
This commit updates the comments in the gdb/reggroups.{c,h} files.
Fill in some missing comments, correct a few comments that were not
clear, and where we had comments duplicated between .c and .h files,
update the .c to reference the .h.
No user visible changes after this commit.
Move 'struct reggroup' into the reggroups.h header. Remove the
reggroup_name and reggroup_type accessor functions, and just use the
name/type member functions within 'struct reggroup', update all uses
of these removed functions.
There should be no user visible changes after this commit.
Convert the 7 global, pre-defined, register groups const, and fix the
fall out (a minor tweak required in riscv-tdep.c).
There should be no user visible changes after this commit.
Convert the reggroup_new and reggroup_gdbarch_new functions to return
a 'const regggroup *', and fix up all the fallout.
There should be no user visible changes after this commit.
Add a new function gdbarch_reggroups that returns a reference to a
vector containing all the reggroups for an architecture.
Make use of this function throughout GDB instead of the existing
reggroup_next and reggroup_prev functions.
Finally, delete the reggroup_next and reggroup_prev functions.
Most of these changes are pretty straight forward, using range based
for loops instead of the old style look using reggroup_next. There
are two places where the changes are less straight forward.
In gdb/python/py-registers.c, the register group iterator needed to
change slightly. As the iterator is tightly coupled to the gdbarch, I
just fetch the register group vector from the gdbarch when needed, and
use an index counter to find the next item from the vector when
needed.
In gdb/tui/tui-regs.c the tui_reg_next and tui_reg_prev functions are
just wrappers around reggroup_next and reggroup_prev respectively.
I've just inlined the logic of the old functions into the tui
functions. As the tui function had its own special twist (wrap around
behaviour) I think this is OK.
There should be no user visible changes after this commit.
Replace manual linked list with a std::vector. This commit doesn't
change the reggroup_next and reggroup_prev API, but that will change
in a later commit.
This commit is focused on the minimal changes needed to manage the
reggroups using a std::vector, without changing the API exposed by the
reggroup.c file.
There should be no user visible changes after this commit.
There's a set of 7 default register groups. If we don't add any
gdbarch specific register groups during gdbarch initialisation, then
when we iterate over the register groups using reggroup_next and
reggroup_prev we will make use of these 7 default groups. See the use
of default_groups in gdb/reggroups.c for details on this.
However, if the gdbarch adds its own groups during gdbarch
initialisation, then these groups will be used in preference to the
default groups.
A problem arises though if the particular architecture makes use of
the target description mechanism. If the default target
description(s) (i.e. those internal to GDB that are used when the user
doesn't provide their own) don't mention any additional register
groups then the default register groups will be used.
But if the target description does mention additional groups then the
default groups are not used, and instead, the groups from the target
description are used.
The problem with this is that what usually happens is that the target
description will mention additional groups, e.g. groups for special
registers. Most architectures that use target descriptions work
around this by adding all (or most) of the default register groups in
all cases. See i386_add_reggroups, aarch64_add_reggroups,
riscv_add_reggroups, xtensa_add_reggroups, and others.
In this patch, my suggestion is that we should just add the default
register groups for every architecture, always. This change is in
gdb/reggroups.c.
All the remaining changes are me updating the various architectures to
not add the default groups themselves.
So, where will this change be visible to the user? I think the
following commands will possibly change:
* info registers / info all-registers:
The user can provide a register group to these commands. For example,
on csky, we previously never added the 'vector' group. Now, as a
default group, this will be available, but (presumably) will not
contain any registers. I don't think this is necessarily a bad
thing, there's something to be said for having some consistent
defaults available. There are other architectures that didn't add
all 7 of the defaults, which will now have gained additional groups.
* maint print reggroups
This prints the set of all available groups. As a maintenance
command I'm less concerned with the output changing here.
Obviously, for the architectures that didn't previously add all the
defaults, this list just got bigger.
* maint print register-groups
This prints all the registers, and the groups they are in. If the
defaults were not previously being added then a register (obviously)
can't appear in one of the default groups. Now the groups are
available then registers might be in more groups than previously.
However, this is again a maintenance command, so I'm less concerned
about this changing.
Start GDB like:
$ gdb -q executable
(gdb) start
(gdb) layout src
... tui windows are now displayed ...
(gdb) tui reg next
At this point the data (register) window should be displayed, but will
contain the message 'Register Values Unavailable', and at the console
you'll see the message "unknown register group 'next'".
The same happens with 'tui reg prev' (but the error message is
slightly different).
At this point you can continue to use 'tui reg next' and/or 'tui reg
prev' and you'll keep getting the error message.
The problem is that when the data (register) window is first
displayed, it's current register group is nullptr. As a consequence
tui_reg_next and tui_reg_prev (tui/tui-regs.c) will always just return
nullptr, which triggers an error in tui_reg_command.
In this commit I change tui_reg_next and tui_reg_prev so that they
instead return the first and last register group respectively if the
current register group is nullptr.
So, after this, using 'tui reg next' will (in the above case) show the
first register group, while 'tui reg prev' will display the last
register group.
While looking at the 'tui reg' command as part of another patch, I
spotted a theoretical bug.
The 'tui reg' command takes the name of a register group, but also
handles partial register group matches, though the partial match has to
be unique. The current command logic goes:
With the code as currently written, if a target description named a
register group either 'prev' or 'next' then GDB would see this as an
ambiguous register name, and refuse to switch groups.
Naming a register group 'prev' or 'next' seems pretty unlikely, but,
by adding a single else block we can prevent this problem.
Now, if there's a 'prev' or 'next' register group, the user will not
be able to select the group directly, the 'prev' and 'next' names will
always iterate through the available groups instead. But at least the
user could select their groups by iteration, rather than direct
selection.
Update reggroup_find to return a const reggroup *.
There are other function in gdb/reggroup.{c,h} files that could
benefit from returning const, these will be updated in later commits.
There should be no user visible changes after this commit.
Change gdbarch_register_reggroup_p to take a 'const struct reggroup *'
argument. This requires a change to the gdb/gdbarch-components.py
script, regeneration of gdbarch.{c,h}, and then updates to all the
architectures that implement this method.
There should be no user visible changes after this commit.
This commit makes the 'struct reggroup *' argument const for the
following functions:
reggroup_next
reggroup_prev
reggroup_name
reggroup_type
There are other places that could benefit from const in the
reggroup.{c,h} files, but these will be changing in further commits.
There should be no user visible changes after this commit.
While working on a different patch, I triggered an assertion from the
initialize_current_architecture code, specifically from one of
the *_gdbarch_init functions in a *-tdep.c file. This exposes a
couple of issues with GDB.
This is easy enough to reproduce by adding 'gdb_assert (false)' into a
suitable function. For example, I added a line into i386_gdbarch_init
and can see the following issue.
I start GDB and immediately hit the assert, the output is as you'd
expect, except for the very last line:
$ ./gdb/gdb --data-directory ./gdb/data-directory/
../../src.dev-1/gdb/i386-tdep.c:8455: internal-error: i386_gdbarch_init: Assertion `false' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
----- Backtrace -----
... snip ...
---------------------
../../src.dev-1/gdb/i386-tdep.c:8455: internal-error: i386_gdbarch_init: Assertion `false' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n) ../../src.dev-1/gdb/ser-event.c:212:16: runtime error: member access within null pointer of type 'struct serial'
Something goes wrong when we try to query the user. Note, I
configured GDB with --enable-ubsan, I suspect that without this the
above "error" would actually just be a crash.
The backtrace from ser-event.c:212 looks like this:
(gdb) bt 10
#0 serial_event_clear (event=0x675c020) at ../../src/gdb/ser-event.c:212
#1 0x0000000000769456 in invoke_async_signal_handlers () at ../../src/gdb/async-event.c:211
#2 0x000000000295049b in gdb_do_one_event () at ../../src/gdbsupport/event-loop.cc:194
#3 0x0000000001f015f8 in gdb_readline_wrapper (
prompt=0x67135c0 "../../src/gdb/i386-tdep.c:8455: internal-error: i386_gdbarch_init: Assertion `false' failed.\nA problem internal to GDB has been detected,\nfurther debugging may prove unreliable.\nQuit this debugg"...)
at ../../src/gdb/top.c:1141
#4 0x0000000002118b64 in defaulted_query(const char *, char, typedef __va_list_tag __va_list_tag *) (
ctlstr=0x2e4eb68 "%s\nQuit this debugging session? ", defchar=0 '\000', args=0x7fffffffa6e0)
at ../../src/gdb/utils.c:934
#5 0x0000000002118f72 in query (ctlstr=0x2e4eb68 "%s\nQuit this debugging session? ")
at ../../src/gdb/utils.c:1026
#6 0x00000000021170f6 in internal_vproblem(internal_problem *, const char *, int, const char *, typedef __va_list_tag __va_list_tag *) (problem=0x6107bc0 <internal_error_problem>, file=0x2b976c8 "../../src/gdb/i386-tdep.c",
line=8455, fmt=0x2b96d7f "%s: Assertion `%s' failed.", ap=0x7fffffffa8e8) at ../../src/gdb/utils.c:417
#7 0x00000000021175a0 in internal_verror (file=0x2b976c8 "../../src/gdb/i386-tdep.c", line=8455,
fmt=0x2b96d7f "%s: Assertion `%s' failed.", ap=0x7fffffffa8e8) at ../../src/gdb/utils.c:485
#8 0x00000000029503b3 in internal_error (file=0x2b976c8 "../../src/gdb/i386-tdep.c", line=8455,
fmt=0x2b96d7f "%s: Assertion `%s' failed.") at ../../src/gdbsupport/errors.cc:55
#9 0x000000000122d5b6 in i386_gdbarch_init (info=..., arches=0x0) at ../../src/gdb/i386-tdep.c:8455
(More stack frames follow...)
It turns out that the problem is that the async event handler
mechanism has been invoked, but this has not yet been initialized.
If we look at gdb_init (in gdb/top.c) we can indeed see the call to
gdb_init_signals is after the call to initialize_current_architecture.
If I reorder the calls, moving gdb_init_signals earlier, then the
initial error is resolved, however, things are still broken. I now
see the same "Quit this debugging session? (y or n)" prompt, but when
I provide an answer and press return GDB immediately crashes.
So what's going on now? The next problem is that the call_readline
field within the current_ui structure is not initialized, and this
callback is invoked to process the reply I entered.
The problem is that call_readline is setup as a result of calling
set_top_level_interpreter, which is called from captured_main_1.
Unfortunately, set_top_level_interpreter is called after gdb_init is
called.
I wondered how to solve this problem for a while, however, I don't
know if there's an easy "just reorder some lines" solution here.
Looking through captured_main_1 there seems to be a bunch of
dependencies between printing various things, parsing config files,
and setting up the interpreter. I'm sure there is a solution hiding
in there somewhere.... I'm just not sure I want to spend any longer
looking for it.
So.
I propose a simpler solution, more of a hack/work-around. In utils.c
we already have a function filtered_printing_initialized, this is
checked in a few places within internal_vproblem. In some of these
cases the call gates whether or not GDB will query the user.
My proposal is to add a new readline_initialized function, which
checks if the current_ui has had readline initialized yet. If this is
not the case then we should not attempt to query the user.
After this change GDB prints the error message, the backtrace, and
then aborts (including dumping core). This actually seems pretty sane
as, if GDB has not yet made it through the initialization then it
doesn't make much sense to allow the user to say "no, I don't want to
quit the debug session" (I think).
$ objdump -d outputs/gdb.base/varargs/varargs
00000001200012e8 <find_max_float_real>:
...
1200013b8: c7c10000 lwc1 $f1,0(s8)
1200013bc: c7c00004 lwc1 $f0,4(s8)
1200013c0: 46000886 mov.s $f2,$f1
1200013c4: 46000046 mov.s $f1,$f0
1200013c8: 46001006 mov.s $f0,$f2
1200013cc: 46000886 mov.s $f2,$f1
1200013d0: 03c0e825 move sp,s8
1200013d4: dfbe0038 ld s8,56(sp)
1200013d8: 67bd0080 daddiu sp,sp,128
1200013dc: 03e00008 jr ra
1200013e0: 00000000 nop
From the above disassembly, we can see that when the return value of the
function is a complex type and len <= 2 * MIPS64_REGSIZE, the return value
will be passed through $f0 and $f2, so fix the corresponding processing
in mips_n32n64_return_value().
$ make check RUNTESTFLAGS='GDB=../gdb gdb.base/varargs.exp --outdir=test'
Before applying the patch:
FAIL: gdb.base/varargs.exp: print find_max_float_real(4, fc1, fc2, fc3, fc4)
FAIL: gdb.base/varargs.exp: print find_max_double_real(4, dc1, dc2, dc3, dc4)
# of expected passes 9
# of unexpected failures 2
After applying the patch:
# of expected passes 11
This also fixes:
FAIL: gdb.base/callfuncs.exp: call inferior func with struct - returns float _Complex
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Co-Authored-By: Maciej W. Rozycki <macro@orcam.me.uk>
This changes jit.c to use new and delete, rather than XCNEW. This
simplifies the code a little. This was useful for another patch I'm
working on, and I thought it would make sense to send it separately.
Regression tested on x86-64 Fedora 34.
Bug 28980 shows that trying to value_copy an entirely optimized out
value causes an internal error. The original bug report involves MI and
some Python pretty printer, and is quite difficult to reproduce, but
another easy way to reproduce (that is believed to be equivalent) was
proposed:
$ ./gdb -q -nx --data-directory=data-directory -ex "py print(gdb.Value(gdb.Value(5).type.optimized_out()))"
/home/smarchi/src/binutils-gdb/gdb/value.c:1731: internal-error: value_copy: Assertion `arg->contents != nullptr' failed.
This is caused by 5f8ab46bc6 ("gdb: constify parameter of
value_copy"). It added an assertion that the contents buffer is
allocated if the value is not lazy:
if (!value_lazy (val))
{
gdb_assert (arg->contents != nullptr);
This was based on the comment on value::contents, which suggest that
this is the case:
/* Actual contents of the value. Target byte-order. NULL or not
valid if lazy is nonzero. */
gdb::unique_xmalloc_ptr<gdb_byte> contents;
However, it turns out that it can also be nullptr also if the value is
entirely optimized out, for example on exit of
allocate_optimized_out_value. That function creates a lazy value, marks
the entire value as optimized out, and then clears the lazy flag. But
contents remains nullptr.
This wasn't a problem for value_copy before, because it was calling
value_contents_all_raw on the input value, which caused contents to be
allocated before doing the copy. This means that the input value to
value_copy did not have its contents allocated on entry, but had it
allocated on exit. The result value had it allocated on exit. And that
we copied bytes for an entirely optimized out value (i.e. meaningless
bytes).
From here I see two choices:
1. respect the documented invariant that contents is nullptr only and
only if the value is lazy, which means making
allocate_optimized_out_value allocate contents
2. extend the cases where contents can be nullptr to also include
values that are entirely optimized out (note that you could still
have some entirely optimized out values that do have contents
allocated, it depends on how they were created) and adjust
value_copy accordingly
Choice #1 is safe, but less efficient: it's not very useful to allocate
a buffer for an entirely optimized out value. It's even a bit less
efficient than what we had initially, because values coming out of
allocate_optimized_out_value would now always get their contents
allocated.
Choice #2 would be more efficient than what we had before: giving an
optimized out value without allocated contents to value_copy would
result in an optimized out value without allocated contents (and the
input value would still be without allocated contents on exit). But
it's more risky, since it's difficult to ensure that all users of the
contents (through the various_contents* accessors) are all fine with
that new invariant.
In this patch, I opt for choice #2, since I think it is a better
direction than choice #1. #1 would be a pessimization, and if we go
this way, I doubt that it will ever be revisited, it will just stay that
way forever.
Add a selftest to test this. I initially started to write it as a
Python test (since the reproducer is in Python), but a selftest is more
straightforward.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28980
Change-Id: I6e2f5c0ea804fafa041fcc4345d47064b5900ed7
Implement the "init" method of struct tramp_frame to prepend tramp
frame unwinder for signal on LoongArch.
With this patch, the following failed testcases can be fixed:
FAIL: gdb.base/annota1.exp: backtrace @ signal handler (timeout)
FAIL: gdb.base/annota3.exp: backtrace @ signal handler (pattern 2)
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Since this commit:
commit 8322445e05
Date: Tue Jun 21 01:11:45 2016 +0100
Introduce interpreter factories
Interpreters should be registered with GDB, not by calling interp_add,
but with a call to interp_factory_register. I've checked the insight
source, and it too has moved over to using interp_factory_register.
In this commit I make interp_add static within interps.c.
There should be no user visible change after this commit.
This set of changes enable support for the ARMv8.1-m PACBTI extensions [1].
The goal of the PACBTI extensions is similar in scope to that of a-profile
PAC/BTI (aarch64 only), but the underlying implementation is different.
One important difference is that the pointer authentication code is stored
in a separate register, thus we don't need to mask/unmask the return address
from a function in order to produce a correct backtrace.
The patch introduces the following modifications:
- Extend the prologue analyser for 32-bit ARM to handle some instructions
from ARMv8.1-m PACBTI: pac, aut, pacg, autg and bti. Also keep track of
return address signing/authentication instructions.
- Adds code to identify object file attributes that indicate the presence of
ARMv8.1-m PACBTI (Tag_PAC_extension, Tag_BTI_extension, Tag_PACRET_use and
Tag_BTI_use).
- Adds support for DWARF pseudo-register RA_AUTH_CODE, as described in the
aadwarf32 [2].
- Extends the dwarf unwinder to track the value of RA_AUTH_CODE.
- Decorates backtraces with the "[PAC]" identifier when a frame has signed
the return address.
- Makes GDB aware of a new XML feature "org.gnu.gdb.arm.m-profile-pacbti". This
feature is not included as an XML file on GDB's side because it is only
supported for bare metal targets.
- Additional documentation.
[1] https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension
[2] https://github.com/ARM-software/abi-aa/blob/main/aadwarf32/aadwarf32.rst
While working on the disassembler I was getting frustrated. Every
time I touched disasm.h it seemed like every file in GDB would need to
be rebuilt. Surely the disassembler can't be required by that many
parts of GDB, right?
Turns out that disasm.h is included in target.h, so pretty much every
file was being rebuilt!
The only thing from disasm.h that target.h needed is the
gdb_disassembly_flag enum, as this is part of the target_ops api.
In this commit I move gdb_disassembly_flag into its own file. This is
then included in target.h and disasm.h, after which, the number of
files that depend on disasm.h is much reduced.
I also audited all the other includes of disasm.h and found that the
includes in mep-tdep.c and python/py-registers.c are no longer needed,
so I've removed these.
Now, after changing disasm.h, GDB rebuilds much quicker.
There should be no user visible changes after this commit.
Simon pointed out that timestamped_file probably needed to implement a
few more methods. This patch introduces a new file-wrapping file that
forwards most of its calls, making it simpler to implement new such
files. It also converts timestamped_file and pager_file to use it.
Regression tested on x86-64 Fedora 34.
I don't think there's any need to call init_thread_list in
windows-nat.c. This patch removes it. I tested this using the
internal AdaCore test suite on Windows, which FWIW does include some
multi-threaded inferiors.
Tom de Vries reported some failures in this test:
continue
Continuing.
[New inferior 2 (process 14967)]
Thread 1.1 "vfork-follow-pa" hit Breakpoint 2, break_parent () at /home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.base/vfork-follow-parent.c:23
23 }
(gdb) FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to end of inferior 2
inferior 1
[Switching to inferior 1 [process 14961] (/home/vries/gdb_versions/devel/build/gdb/testsuite/outputs/gdb.base/vfork-follow-parent/vfork-follow-parent)]
[Switching to thread 1.1 (process 14961)]
#0 break_parent () at /home/vries/gdb_versions/devel/src/gdb/testsuite/gdb.base/vfork-follow-parent.c:23
23 }
(gdb) PASS: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: inferior 1
continue
Continuing.
[Inferior 2 (process 14967) exited normally]
(gdb) FAIL: gdb.base/vfork-follow-parent.exp: resolution_method=schedule-multiple: continue to break_parent (the program exited)
Here, we continue both the vfork parent and child, since
schedule-multiple is on. The child exits, which un-freezes the parent
and makes an exit event available to GDB. We expect GDB to consume this
exit event and present it to the user. Here, we see that GDB shows the
parent hitting a breakpoint before showing the child exit.
Because of the vfork, we know that chronologically, the child exiting
must have happend before the parent hitting a breakpoint. However,
scheduling being what it is, it is possible for the parent to un-freeze
and exit quickly, such that when GDB pulls events out of the kernel,
exit events for both processes are available. And then, GDB may chose
at random to return the one for the parent first. This is what I
imagine what causes the failure shown above.
We could change the test to expect both possible outcomes, but I wanted
to avoid complicating the .exp file that way. Instead, add a variable
that the parent loops on that we set only after we confirmed the exit of
the child. That should ensure that the order is always the same.
Note that I wasn't able to reproduce the failure, so I can't tell if
this fix really fixes the problem.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29021
Change-Id: Ibc8e527e0e00dac54b22021fe4d9d8ab0f3b28ad
I got failures like this once on a CI:
frame^M
&"frame\n"^M
~"#0 child_sub_function () at /home/jenkins/workspace/binutils-gdb_master_build/arch/amd64/target_board/unix/src/binutils-gdb/gdb/testsuite/gdb.mi/user-selected-context-sync.c:33\n"^M
~"33\t dummy = !dummy; /* thread loop line */\n"^M
^done^M
(gdb) ^M
FAIL: gdb.mi/mi-cmd-user-context.exp: frame 1 (unexpected output)
The problem is that the test expects the following regexp:
".*#0 0x.*"
And that typically works, when the output of the frame command looks
like:
#0 0x00005555555551bb in child_sub_function () at ...
Note the lack of hexadecimal address in the failing case. Whether or
not the hexadecimal address is printed (roughly) depends on whether the
current PC is at the beginning of a line. So depending on where thread
2 was when GDB stopped it (after thread 1 hit its breakpoint), we can
get either output. Adjust the regexps to not expect an hexadecimal
prefix (0x) but a function name instead (either child_sub_function or
child_function). That one is always printed, and is also a good check
that we are in the frame we expect.
Note that for test "frame 5", we are showing a pthread frame (on my
system), so the function name is internal to pthread, not something we
can rely on. In that case, it's almost certain that we are not at the
beginning of a line, or that we don't have debug info, so I think it's
fine to expect the hex prefix.
And for test "frame 6", it's ok to _not_ expect a hex prefix (what the
test currently does), since we are showing thread 1, which has hit a
breakpoint placed at the beginning of a line.
When testing this, Tom de Vries pointed out that the current test code
doesn't ensure that the child threads are in child_sub_function when
they are stopped. If the scheduler chooses so, it is possible for the
child threads to be still in the pthread_barrier_wait or child_function
functions when they get stopped. So that would be another racy failure
waiting to happen.
The only way I can think of to ensure the child threads are in the
child_sub_function function when they get stopped is to synchronize the
threads using some variables instead of pthread_barrier_wait. So,
replace the barrier with an array of flags (one per child thread). Each
child thread flips its flag in child_sub_function to allow the main
thread to make progress and eventually hit the breakpoint.
I copied user-selected-context-sync.c to a new mi-cmd-user-context.c and
made modifications to that, to avoid interfering with
user-selected-context-sync.exp.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=29025
Change-Id: I919673bbf9927158beb0e8b7e9e980b8d65eca90
Someone at IRC spotted a bug in qRcmd handling. This looks like an oversight
or it is that way for historical reasons.
The code in gdb/remote.c:remote_target::rcmd uses isdigit instead of
isxdigit. One could argue that we are expecting decimal numbers, but further
below we use fromhex ().
Update the function to use isxdigit instead and also update the documentation.
I see there are lots of other cases of undocumented number format for error
messages, mostly described as NN instead of nn. For now I'll just update
this particular function.
The test introduced by this patch would fail in this configuration, with
the native-gdbserver or native-extended-gdbserver boards:
FAIL: gdb.threads/next-fork-other-thread.exp: fork_func=fork: target-non-stop=auto: non-stop=off: displaced-stepping=auto: i=2: next to for loop
The problem is that the step operation is forgotten when handling the
fork/vfork. With "debug infrun" and "debug remote", it looks like this
(some lines omitted for brevity). We do the next:
[infrun] proceed: enter
[infrun] proceed: addr=0xffffffffffffffff, signal=GDB_SIGNAL_DEFAULT
[infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=0, current thread [4154304.4154304.0] at 0x5555555553bf
[infrun] do_target_resume: resume_ptid=4154304.0.0, step=1, sig=GDB_SIGNAL_0
[remote] Sending packet: $vCont;r5555555553bf,5555555553c4:p3f63c0.3f63c0;c:p3f63c0.-1#cd
[infrun] proceed: exit
We then handle a fork event:
[infrun] fetch_inferior_event: enter
[remote] wait: enter
[remote] Packet received: T05fork:p3f63ee.3f63ee;06:0100000000000000;07:b08e59f6ff7f0000;10:bf60e8f7ff7f0000;thread:p3f63c0.3f63c6;core:17;
[remote] wait: exit
[infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
[infrun] print_target_wait_results: 4154304.4154310.0 [Thread 4154304.4154310],
[infrun] print_target_wait_results: status->kind = FORKED, child_ptid = 4154350.4154350.0
[infrun] handle_inferior_event: status->kind = FORKED, child_ptid = 4154350.4154350.0
[remote] Sending packet: $D;3f63ee#4b
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [4154304.4154310.0] at 0x7ffff7e860bf
[infrun] do_target_resume: resume_ptid=4154304.0.0, step=0, sig=GDB_SIGNAL_0
[remote] Sending packet: $vCont;c:p3f63c0.-1#73
[infrun] fetch_inferior_event: exit
In the first snippet, we resume the stepping thread with the range-stepping (r)
vCont command. But after handling the fork (detaching the fork child), we
resumed the whole process freely. The stepping thread, which was paused by
GDBserver while reporting the fork event, was therefore resumed freely, instead
of confined to the addresses of the stepped line. Note that since this
is a "next", it could be that we have entered a function, installed a
step-resume breakpoint, and it's ok to continue freely the stepping
thread, but that's not the case here. The two snippets shown above were
next to each other in the logs.
For the fork case, we can resume stepping right after handling the
event.
However, for the vfork case, where we are waiting for the
external child process to exec or exit, we only resume the thread that
called vfork, and keep the others stopped (see patch "gdb: fix handling of
vfork by multi-threaded program" prior in this series). So we can't
resume the stepping thread right now. Instead, do it after handling the
vfork-done event.
Change-Id: I92539c970397ce880110e039fe92b87480f816bd