Using 'x/hf' should print bytes as float16, but instead it currently
prints as an integer. I tracked this down to a missing case in
float_type_from_length.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30161
Approved-By: Simon Marchi <simon.marchi@efficios.com>
As described in C++ reference [1], static data members are not part
of objects of a given class type. Modified compute_struct_member ()
to ignore static data member so that we can get the expected result.
loongson@linux:~$ cat test.c
#include<stdio.h>
struct struct_01 { static unsigned a; float b;};
unsigned struct_01::a = 66;
struct struct_01 struct_01_val = { 99.00 };
int check_arg_struct(struct struct_01 arg)
{
printf("arg.a = %d\n", arg.a);
printf("arg.b = %f\n", arg.b);
return 0;
}
int main()
{
check_arg_struct(struct_01_val);
return 0;
}
loongson@linux:~$ g++ -g test.c -o test++
loongson@linux:~$ gdb test++
Without this patch:
...
(gdb) start
...
(gdb) p check_arg_struct(struct_01_val)
arg.a = 66
arg.b = 0.000000
$1 = 0
With this patch:
...
(gdb) start
...
(gdb) p check_arg_struct(struct_01_val)
arg.a = 66
arg.b = 99.000000
$1 = 0
[1] https://learn.microsoft.com/en-us/cpp/cpp/static-members-cpp?view=msvc-170
Signed-off-by: Hui Li <lihui@loongson.cn>
Reviewed-By: Tom Tromey <tom@tromey.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Writing out zeros is counterproductive if a file system supports
sparse files. A very large gap need not take much actual disk space,
but it usually will if zeros are written.
memory_bseek also supports not writing out zeros in a gap.
* elf.c (write_zeros): Delete.
(assign_file_positions_for_load_sections): Don't call write_zeros.
Comment.
[ This is a simplified rewrite of an earlier submission "[RFC][gdb/symtab] Add
maint set symbol-read-order", submitted here (
https://sourceware.org/pipermail/gdb-patches/2022-September/192044.html
). ]
With the test-case included in this patch, we run into:
...
(gdb) file dwarf2-and-ctf
(gdb) print var_ctf^M
'var_ctf' has unknown type; cast it to its declared type^M
...
The problem is that the executable contains both ctf and dwarf2, so the ctf
info (which contains the type information about var_ctf) is ignored.
GDB has support for handling multiple debug formats, but the common use case
for ctf is to be used when dwarf2 is not present, and gdb reflects that,
assuming that by reading ctf in addition there won't be any extra information,
so it's not worth the additional cycles and memory.
Add a new command "set/show always-read-ctf on/off", that when on forces
unconditional reading of ctf, allowing us to do:
...
(gdb) set always-read-ctf on
(gdb) file dwarf2-and-ctf
(gdb) print var_ctf^M
$2 = 2^M
...
The setting is off by default, preserving current behaviour.
A bit of background on the relevance of reading order: the formats have a
priority relationship between them, where reading earlier means lower
priority. By reading the format with the most detail last, we ensure it has
the highest priority, which makes sure that in case there is overlapping info,
the most detailed info is found. This explains the current reading order of
mdebug, stabs and dwarf2.
Add the unconditional reading of ctf before dwarf2, because it's less detailed
than dwarf2. The conditional reading of ctf is still done after the attempt to
read dwarf2, necessarily so because we only know whether there's dwarf2 after
we've tried to read it.
The new command allow us to replace uses of -Wl,--strip-debug added in commit
908a926ec4 ("[gdb/testsuite] Fix ctf test-cases on openSUSE Tumbleweed") by
uses of "set always-read-ctf on", but I've left that for another commit.
Tested on x86_64-linux.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Reviewed-By: Tom Tromey <tom@tromey.com>
The copyright years in the ROCm files (e.g. solib-rocm.c) are wrong,
they end in 2022 instead of 2023. I suppose because I posted (or at
least prepared) the patches in 2022 but merged them in 2023, and forgot
to update the year. I found a bunch of other files that are in the same
situation. Fix them all up.
Change-Id: Ia55f5b563606c2ba6a89046f22bc0bf1c0ff2e10
Reviewed-By: Tom Tromey <tom@tromey.com>
Once a baton is stored in a struct type, it doesn't make sense to
modify it. This patch constifies the API.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
gdbtypes treats dynamic property batons as 'void *', but in actuality
the only users all use dwarf2_property_baton. This patch changes this
code to be type-safe. If a new type is needed here, it seems like
that too could be done in a type-safe way.
Approved-By: Simon Marchi <simon.marchi@efficios.com>
Changing mri mode between macro definition and use isn't good. This
.macro x
.endm
.mri 1
x
leads to a segfault. Fixed with the following patch, but I suppose
what should really happen is that macros be marked as being mri mode
when defined, and that determine whether the magic NARG parameter be
supplied at expansion. Nobody has complained about this in 30 years
so I'm not inclined to change gas behaviour to that extent.
* macro.c (macro_expand): Don't segfault in mri mode if NARG
formal isn't found.
Tom de Vries pointed out that my earlier patch:
commit 873a185be2
Date: Fri Dec 16 07:56:57 2022 -0700
Don't use struct buffer in handle_qxfer_btrace
regressed gdb.btrace/reconnect.exp. I didn't notice this because I
did not have libipt installed.
This patch fixes the bug.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30169
Tested-By: Bruno Larsen <blarsen@redhat.com>
I ran into:
...
(gdb) PASS: gdb.python/py-record-btrace.exp: function call: \
python print(c.prev)
python print(c == c.next.prev)^M
Traceback (most recent call last):^M
File "<string>", line 1, in <module>^M
AttributeError: 'NoneType' object has no attribute 'prev'^M
Error while executing Python code.^M
(gdb) FAIL: gdb.python/py-record-btrace.exp: function call: \
python print(c == c.next.prev)
...
due to having only 4 insn instead of 100:
...
python print(len(insn))^M
4^M
...
This could be caused by the same hw bug as we already have an xfail for, so
expand the xfail matching.
Tested on x86_64-linux.
PR testsuite/30185
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30185
Approved-By: Markus T. Metzger <markus.t.metzger@intel.com>
Also fix an error introduced in 1998 in reporting a zero count for
negative counts.
* read.c (s_space): Use unsigned multiply, and catch overflow.
Correct order of tests for invalid repeat counts. Ensure
ignored directives don't affect mri_pending_align.
When debugging GDB, I find it a bit tedious to inspect htab_t objects.
It is possible to find the entries by poking at the fields, but it's
annoying to do each time. I think a pretty printer would help. Add a
basic one to gdb-gdb.py.
The pretty printer advertises itself as "array-like", and the result
looks like:
(top-gdb) p bfcache
$3 = htab_t with 3 elements = {0x6210003252a0, 0x62100032caa0, 0x62100033baa0}
The htab_t itself doesn't know about the type of pointed objects. But
it's easy enough to cast the addresses to the right type to use them:
(top-gdb) print *((btrace_frame_cache *) 0x6210003252a0)
$6 = {tp = 0x61700002ed80, frame = 0x6210003251e0, bfun = 0x62000000b390}
Change-Id: Ia692e3555fe7a117b7ec087840246b1260a704c6
Reviewed-By: Tom Tromey <tom@tromey.com>
On powerpc64le-linux, I run into two timeouts:
...
FAIL: gdb.python/py-breakpoint.exp: test_watchpoints: \
Test watchpoint write (timeout)
FAIL: gdb.python/py-breakpoint.exp: test_bkpt_internal: \
Test watchpoint write (timeout)
...
In this case, hw watchpoints are not supported, and using sw watchpoints
is slow.
Most of the time is spent in handling a try-catch, which triggers a malloc. I
think this bit is more relevant for the "catch throw" part of the test-case,
so fix the timeouts by setting the watchpoints after the try-catch.
Tested on x86_64-linux and powerpc64le-linux.
value_in is unused. From git log, it seems to have been part of the
Chill language, which was removed from gdb eons ago. This patch
removes the function. Tested by rebuilding.
On x86_64-linux, I have:
...
(gdb) watch -location y^M
Hardware watchpoint 2: -location y^M
(gdb) PASS: gdb.rust/watch.exp: watch -location y
...
but on powerpc64le-linux, I run into:
...
(gdb) watch -location y^M
Watchpoint 2: -location y^M
(gdb) FAIL: gdb.rust/watch.exp: watch -location y
...
due to the regexp matching "Hardware watchpoint" but not "Watchpoint":
...
gdb_test "watch -location y" ".*watchpoint .* -location .*"
...
Fix this by making the regexp less restrictive.
Tested on x86_64-linux and powerpc64le-linux.
Background
----------
When a thread-specific breakpoint is deleted as a result of the
specific thread exiting the function remove_threaded_breakpoints is
called which sets the disposition of the breakpoint to
disp_del_at_next_stop and sets the breakpoint number to 0. Setting
the breakpoint number to zero has the effect of hiding the breakpoint
from the user. We also print a message indicating that the breakpoint
has been deleted.
It was brought to my attention during a review of another patch[1]
that setting a breakpoints number to zero will suppress the MI
breakpoint-deleted notification for that breakpoint, and indeed, this
can be seen to be true, in delete_breakpoint, if the breakpoint number
is zero, then GDB will not notify the breakpoint_deleted observer.
It seems wrong that a user created, thread-specific breakpoint, will
have a =breakpoint-created notification, but will not have a
=breakpoint-deleted notification. I suspect that this is a bug.
[1] https://sourceware.org/pipermail/gdb-patches/2023-February/196560.html
The First Problem
-----------------
During my initial testing I wanted to see how GDB handled the
breakpoint after it's number was set to zero. To do this I created
the testcase gdb.threads/thread-bp-deleted.exp. This test creates a
worker thread, which immediately exits. After the worker thread has
exited the main thread spins in a loop.
In GDB I break once the worker thread has been created and place a
thread-specific breakpoint, then use 'continue&' to resume the
inferior in non-stop mode. The worker thread then exits, but the main
thread never stops - instead it sits in the spin. I then tried to use
'maint info breakpoints' to see what GDB thought of the
thread-specific breakpoint.
Unfortunately, GDB crashed like this:
(gdb) continue&
Continuing.
(gdb) [Thread 0x7ffff7c5d700 (LWP 1202458) exited]
Thread-specific breakpoint 3 deleted - thread 2 no longer in the thread list.
maint info breakpoints
... snip some output ...
Fatal signal: Segmentation fault
----- Backtrace -----
0x5ffb62 gdb_internal_backtrace_1
../../src/gdb/bt-utils.c:122
0x5ffc05 _Z22gdb_internal_backtracev
../../src/gdb/bt-utils.c:168
0x89965e handle_fatal_signal
../../src/gdb/event-top.c:964
0x8997ca handle_sigsegv
../../src/gdb/event-top.c:1037
0x7f96f5971b1f ???
/usr/src/debug/glibc-2.30-2-gd74461fa34/nptl/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0xe602b0 _Z15print_thread_idP11thread_info
../../src/gdb/thread.c:1439
0x5b3d05 print_one_breakpoint_location
../../src/gdb/breakpoint.c:6542
0x5b462e print_one_breakpoint
../../src/gdb/breakpoint.c:6702
0x5b5354 breakpoint_1
../../src/gdb/breakpoint.c:6924
0x5b58b8 maintenance_info_breakpoints
../../src/gdb/breakpoint.c:7009
... etc ...
As the thread-specific breakpoint is set to disp_del_at_next_stop, and
GDB hasn't stopped yet, then the breakpoint still exists in the global
breakpoint list.
The breakpoint will not show in 'info breakpoints' as its number is
zero, but it will show in 'maint info breakpoints'.
As GDB prints the breakpoint, the thread-id for the breakpoint is
printed as part of the 'stop only in thread ...' line. Printing the
thread-id involves calling find_thread_global_id to convert the global
thread-id into a thread_info*. Then calling print_thread_id to
convert the thread_info* into a string.
The problem is that find_thread_global_id returns nullptr as the
thread for the thread-specific breakpoint has exited. The
print_thread_id assumes it will be passed a non-nullptr. As a result
GDB crashes.
In this commit I've added an assert to print_thread_id (gdb/thread.c)
to check that the pointed passed in is not nullptr. This assert would
have triggered in the above case before GDB crashed.
MI Notifications: The Dangers Of Changing A Breakpoint's Number
---------------------------------------------------------------
Currently the delete_breakpoint function doesn't trigger the
breakpoint_deleted observer for any breakpoint with the number zero.
There is a comment explaining why this is the case in the code; it's
something about watchpoints. But I did consider just removing the 'is
the number zero' guard and always triggering the breakpoint_deleted
observer, figuring that I'd then fix the watchpoint issue some other
way.
But I realised this wasn't going to be good enough. When the MI
notification was delivered the number would be zero, so any frontend
parsing the notifications would not be able to match
=breakpoint-deleted notification to the earlier =breakpoint-created
notification.
What this means is that, at the point the breakpoint_deleted observer
is called, the breakpoint's number must be correct.
MI Notifications: The Dangers Of Delaying Deletion
--------------------------------------------------
The test I used to expose the above crash also brought another problem
to my attention. In the above test we used 'continue&' to resume,
after which a thread exited, but the inferior didn't stop. Recreating
the same test in the MI looks like this:
-break-insert -p 2 main
^done,bkpt={number="2",type="breakpoint",disp="keep",...<snip>...}
(gdb)
-exec-continue
^running
*running,thread-id="all"
(gdb)
~"[Thread 0x7ffff7c5d700 (LWP 987038) exited]\n"
=thread-exited,id="2",group-id="i1"
~"Thread-specific breakpoint 2 deleted - thread 2 no longer in the thread list.\n"
At this point the we have a single thread left, which is still
running:
-thread-info
^done,threads=[{id="1",target-id="Thread 0x7ffff7c5eb80 (LWP 987035)",name="thread-bp-delet",state="running",core="4"}],current-thread-id="1"
(gdb)
Notice that we got the =thread-exited notification from GDB as soon as
the thread exited. We also saw the CLI line from GDB, the line
explaining that breakpoint 2 was deleted. But, as expected, we didn't
see the =breakpoint-deleted notification.
I say "as expected" because the number was set to zero. But, even if
the number was not set to zero we still wouldn't see the
notification. The MI notification is driven by the breakpoint_deleted
observer, which is only called when we actually delete the breakpoint,
which is only done the next time GDB stops.
Now, maybe this is fine. The notification is delivered a little
late. But remember, by setting the number to zero the breakpoint will
be hidden from the user, for example, the breakpoint is removed from
the MI's -break-info command output.
This means that GDB is in a position where the breakpoint doesn't show
up in the breakpoint table, but a =breakpoint-deleted notification has
not yet been sent out. This doesn't seem right to me.
What this means is that, when the thread exits, we should immediately
be sending out the =breakpoint-deleted notification. We should not
wait for GDB to next stop before sending the notification.
The Solution
------------
My proposed solution is this; in remove_threaded_breakpoints, instead
of setting the disposition to disp_del_at_next_stop and setting the
number to zero, we now just call delete_breakpoint directly.
The notification will now be sent out immediately; as soon as the
thread exits.
As the number has not changed when delete_breakpoint is called, the
notification will have the correct number.
And as the breakpoint is immediately removed from the breakpoint list,
we no longer need to worry about 'maint info breakpoints' trying to
print the thread-id for an exited thread.
My only concern is that calling delete_breakpoint directly seems so
obvious that I wonder why the original patch (that added
remove_threaded_breakpoints) didn't take this approach. This code was
added in commit 49fa26b041, but the commit message offers no clues
to why this approach was taken, and the original email thread offers
no insights either[2]. There are no test regressions after making
this change, so I'm hopeful that this is going to be fine.
[2] https://sourceware.org/pipermail/gdb-patches/2013-September/106493.html
The Complication
----------------
Of course, it couldn't be that simple.
The script gdb.python/py-finish-breakpoint.exp had some regressions
during testing.
The problem was with the FinishBreakpoint.out_of_scope callback
implementation. This callback is supposed to trigger whenever the
FinishBreakpoint goes out of scope; and this includes when the thread
for the breakpoint exits.
The problem I ran into is the Python FinishBreakpoint implementation.
Specifically, after this change I was loosing some of the out_of_scope
calls.
The problem is that the out_of_scope call (of which I'm interested) is
triggered from the inferior_exit observer. Before my change the
observers were called in this order:
thread_exit
inferior_exit
breakpoint_deleted
The inferior_exit would trigger the out_of_scope call.
After my change the breakpoint_deleted notification (for
thread-specific breakpoints) occurs earlier, as soon as the
thread-exits, so now the order is:
thread_exit
breakpoint_deleted
inferior_exit
Currently, after the breakpoint_deleted call the Python object
associated with the breakpoint is released, so, when we get to the
inferior_exit observer, there's no longer a Python object to call the
out_of_scope method on.
My solution is to follow the model for how bpfinishpy_pre_stop_hook
and bpfinishpy_post_stop_hook are called, this is done from
gdbpy_breakpoint_cond_says_stop in py-breakpoint.c.
I've now added a new bpfinishpy_pre_delete_hook
gdbpy_breakpoint_deleted in py-breakpoint.c, and from this new hook
function I check and where needed call the out_of_scope method.
With this fix in place I now see the
gdb.python/py-finish-breakpoint.exp test fully passing again.
Testing
-------
Tested on x86-64/Linux with unix, native-gdbserver, and
native-extended-gdbserver boards.
New tests added to covers all the cases I've discussed above.
Approved-By: Pedro Alves <pedro@palves.net>
I currently see this failure when running the gdb.mi/mi-pending.exp
test using the native-extended-remote board:
-break-insert -f -c x==4 mi-pendshr.c:pendfunc2
&"No source file named mi-pendshr.c.\n"
^done,bkpt={number="2",type="breakpoint",disp="keep",enabled="y",addr="<PENDING>",pending="mi-pendshr.c:pendfunc2",cond="x==4",evaluated-by="host",times="0",original-location="mi-pendshr.c:pendfunc2"}
(gdb)
FAIL: gdb.mi/mi-pending.exp: MI pending breakpoint on mi-pendshr.c:pendfunc2 if x==4 (unexpected output)
The failure is caused by the 'evaluated-by="host"' string, which only
appears in the output when the test is run using the
native-extended-remote board.
I could fix this by just updating the pattern in
gdb.mi/mi-pending.exp, but I have instead updated mi-pending.exp to
make more use of the support procs in mi-support.exp. This did
require making a couple of adjustments to mi-support.exp, but I think
the result is that mi-pending.exp is now easier to read, and I see no
failures with native-extended-remote anymore.
One of the test names has changed after this work, I think the old
test name was wrong - it described a breakpoint as pending when the
breakpoint was not pending, I suspect a copy & paste error.
But there's no changes to what is actually being tested after this
patch.
Approved-By: Pedro Alves <pedro@palves.net>
I noticed that several tests included copy & pasted code to run the
'maint show target-non-stop' command, and then switch based on the
result.
In this commit I factor this code out into a helper proc in
lib/gdb.exp, and update all the places I could find that used this
pattern to make use of the helper proc.
There should be no change in what is tested after this commit.
Reviewed-By: Pedro Alves <pedro@palves.net>
Introduce foreach_mi_ui_mode, a helper proc which can be used when
tests are going to be repeated once with the MI in the main UI, and
once with the MI on a separate UI.
The proc is used like this:
foreach_mi_ui_mode VAR {
# BODY
}
The BODY will be run twice, once with VAR set to 'main' and once with
VAR set to 'separate', inside BODY we can then change the behaviour
based on the current UI mode.
The point of this proc is that we sometimes shouldn't run the separate
UI tests (when gdb_debug_enabled is true), and this proc hides all
this logic. If the separate UI mode should not be used then BODY will
be run just once with VAR set to 'main'.
I've updated two tests that can make use of this helper proc. I'm
going to add another similar test in a later commit.
There should be no change to what is tested with this commit.
Approved-By: Pedro Alves <pedro@palves.net>
The mi_clean_restart proc calls the mi_gdb_start proc passing no
arguments.
In this commit I add an extra (optional) argument to the
mi_clean_restart proc, and pass this through to mi_gdb_start.
The benefit of this is that we can now use mi_clean_restart when we
also want to pass the 'separate-mi-tty' or 'separate-inferior-tty'
flags to mi_gdb_start, and avoids having to otherwise duplicate the
contents of mi_clean_restart in different tests.
I've updated the obvious places where this new functionality can be
used, and I'm seeing no test regressions.
Reviewed-By: Pedro Alves <pedro@palves.net>
Building on the previous commit, now that the breakpoint related
support functions in lib/mi-support.exp can now help creating the
patterns for thread specific breakpoints, make use of this
functionality for gdb.mi/mi-nsmoribund.exp and gdb.mi/mi-pending.exp.
There should be no changes in what is tested after this commit.
Reviewed-By: Pedro Alves <pedro@palves.net>
When creating a thread-specific breakpoint with a single location, the
'thread' field would be repeated in the MI output. This can be seen
in two existing tests gdb.mi/mi-nsmoribund.exp and
gdb.mi/mi-pending.exp, e.g.:
(gdb)
-break-insert -p 1 bar
^done,bkpt={number="1",type="breakpoint",disp="keep",
enabled="y",
addr="0x000000000040110a",func="bar",
file="/tmp/mi-thread-specific-bp.c",
fullname="/tmp/mi-thread-specific-bp.c",
line="32",thread-groups=["i1"],
thread="1",thread="1", <================ DUPLICATION!
times="0",original-location="bar"}
I know we need to be careful when adjusting MI output, but I'm hopeful
in this case, as the field is duplicated, and the field contents are
always identical, that we might get away with removing one of the
duplicates.
The change in GDB is a fairly trivial condition change.
We did have a couple of tests that contained the duplicate fields in
their expected output, but given there was no comment pointing out
this oddity either in the GDB code, or in the test, I suspect this was
more a case of copying whatever output GDB produced and using that as
the expected results. I've updated these tests to remove the
duplication.
I've update lib/mi-support.exp to provide support for building
breakpoint patterns that contain the thread field, and I've made use
of this in a new test I've added that is just about creating
thread-specific breakpoints and checking the results. The two tests I
mentioned above as being updated could also use the new
lib/mi-support.exp functionality, but I'm going to do that in a later
patch, this way it is clear what changes I'm actually proposing to
make to the expected output.
As I said, I hope that frontends will be able to handle this change,
but I still think its worth adding a NEWS entry, that way, if someone
runs into problems, there's a chance they can figure out what's going
on.
This should not impact CLI output at all.
Reviewed-By: Eli Zaretskii <eliz@gnu.org>
Approved-By: Pedro Alves <pedro@palves.net>
Delete an out of date comment about disp_del_at_next_stop, the comment
says:
/* NOTE drow/2003-09-08: This state only exists for removing
watchpoints. It's not clear that it's necessary... */
I'm sure this was true when the comment was added, but today the
disp_del_at_next_stop state is not just used for deleting watchpoints,
which leaves us with "It's not clear that it's necessary...", which
doesn't really help at all.
And then this comment is located on one random place where
disp_del_at_next_stop is used, rather than at its definition site.
Lets just delete the comment.
No user visible changes after this commit.
Reviewed-By: Pedro Alves <pedro@palves.net>
Seen when building binutils with gcc -m32 on x86_64-linux.
chew.c: In function ‘print’:
chew.c:1434:59: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 3 has type ‘intptr_t’ {aka ‘int’} [-Wformat=]
1434 | fprintf (stderr, "print: illegal print destination `%ld'\n", *isp);
| ~~^ ~~~~
| | |
| | intptr_t {aka int}
| long int
| %d
* chew.c: Include inttypes.h.
(print): Use PRIdPTR for *isp.
Microsoft's DIA library, and thus also MSVC and WinDbg, expects section
contributions to be ordered by section number and offset, otherwise it's
unable to resolve line numbers.
This frees memory associated with the mips ecoff find_nearest_line.
* elfxx-mips.x (free_ecoff_debug): New function, extracted from..
(_bfd_mips_elf_read_ecoff_info): ..here. Free ext_hdr earlier.
Don't clear already NULL fdr.
(struct mips_elf_find_line): Move earlier.
(_bfd_mips_elf_close_and_cleanup): Call free_ecoff_debug.
(_bfd_mips_elf_find_nearest_line): Likewise on error paths,
and to clean up input_debug when done.
More anti-fuzzer bounds checking for the ECOFF support. A lot of this
is in ancient code using "long" for counts and sizes, which is why the
patch uses "(long) ((unsigned long) x + 1) > 0" in a few places. The
unsigned long cast is so that "x + 1" doesn't trigger ubsan warnings
about signed integer overflow. It would be a good idea to replace
most of the longs used in binutils with size_t, but that's more than I
care to do for COFF/ECOFF.
* ecofflink.c (mk_fdrtab): Sanity check string offsets.
(lookup_line): Likewise, and symbol indices.
I'd skipped this one before, thinking "obfd, that's the linker output
bfd so no need to test". Wrong, this is objcopy output.
* peXXigen.c (_bfd_XX_bfd_copy_private_bfd_data_common): Test
SEC_HAS_CONTENTS before reading section.
During my audit of the use of gdb_exception with regard to QUIT
processing, I found a try/catch in the scoped_switch_fork_info
destructor.
Static analysis found this call path from the destructor to
maybe_quit():
scoped_switch_fork_info::~scoped_switch_fork_info()
-> remove_breakpoints()
-> remove_breakpoint(bp_location*)
-> remove_breakpoint_1(bp_location*, remove_bp_reason)
-> memory_validate_breakpoint(gdbarch*, bp_target_info*)
-> target_read_memory(unsigned long, unsigned char*, long)
-> target_read(target_ops*, target_object, char const*, unsigned char*, unsigned long, long)
-> maybe_quit()
Since it's not safe to do a 'throw' from a destructor, we simply
call set_quit_flag and, for gdb_exception_forced_quit, also
set sync_quit_force_run. This will cause the appropriate
exception to be rethrown at the next QUIT check.
Another case is the try / catch in tui_getc() in tui-io.c. The
existing catch swallows the exception. I've added a catch for
'gdb_exception_forced_quit', which also swallows the exception,
but also sets sync_quit_force_run and calls set_quit_flag in
order to restart forced quit processing at the next QUIT check.
This is required because it isn't safe to throw into/through
readline.
Thanks to Pedro Alves for suggesting this idea.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-By: Pedro Alves <pedro@palves.net>
At the moment, handle_sigterm() in event-top.c does the following:
sync_quit_force_run = 1;
set_quit_flag ();
This was used several more times in a later patch in this series, so
I'm introducing (at Pedro's suggestion) a new function named
'set_force_quit_flag'. It simply sets sync_quit_force_run and also
calls set_quit_flag(). I've revised the later patch to call
set_force_quit_flag instead.
I noticed that sync_quit_force_run is declared as an int but is being
used as a bool, so I also changed its type to bool in this commit.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Approved-By: Pedro Alves <pedro@palves.net>
This commit contains changes which have an explicit throw for
gdb_exception_forced_quit, or, in a couple of cases for gdb_exception,
but with a throw following a check to see if 'reason' is
RETURN_FORCED_QUIT.
Most of these are straightforward - it made sense to continue to allow
an existing catch of gdb_exception to also catch gdb_exception_quit;
in these cases, a catch/throw for gdb_exception_forced_quit was added.
There are two cases, however, which deserve a more detailed
explanation.
1) remote_fileio_request in gdb/remote-fileio.c:
The try block calls do_remote_fileio_request which can (in turn)
call one of the functions in remote_fio_func_map[]. Taking the
first one, remote_fileio_func_open(), we have the following call
path to maybe_quit():
remote_fileio_func_open(remote_target*, char*)
-> target_read_memory(unsigned long, unsigned char*, long)
-> target_read(target_ops*, target_object, char const*, unsigned char*, unsigned long, long)
-> maybe_quit()
Since there is a path to maybe_quit(), we must ensure that the
catch block is not permitted to swallow a QUIT representing a
SIGTERM.
However, for this case, we must take care not to change the way that
Ctrl-C / SIGINT is handled; we want to send a suitable EINTR reply to
the remote target should that happen. That being the case, I added a
catch/throw for gdb_exception_forced_quit. I also did a bit of
rewriting here, adding a catch for gdb_exception_quit in favor of
checking the 'reason' code in the catch block for gdb_exception.
2) mi_execute_command in gdb/mi/mi-main.c:
The try block calls captured_mi_execute_command(); there exists
a call path to maybe_quit():
captured_mi_execute_command(ui_out*, mi_parse*)
-> mi_cmd_execute(mi_parse*)
-> get_current_frame()
-> get_prev_frame_always_1(frame_info*)
-> frame_register_unwind_location(frame_info*, int, int*, lval_type*, unsigned long*, int*)
-> frame_register_unwind(frame_info*, int, int*, int*, lval_type*, unsigned long*, int*, unsigned char*)
-> value_entirely_available(value*)
-> value_fetch_lazy(value*)
-> value_fetch_lazy_memory(value*)
-> read_value_memory(value*, long, int, unsigned long, unsigned char*, unsigned long)
-> maybe_quit()
That being the case, we can't allow the exception handler (catch block)
to swallow a gdb_exception_quit for SIGTERM. However, it does seem
reasonable to output the exception via the mi interface so that some
suitable message regarding SIGTERM might be printed; therefore, I
check the exception's 'reason' field for RETURN_FORCED_QUIT and
do a throw for this case.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-By: Pedro Alves <pedro@palves.net>
This commit contains QUIT processing updates for GDB's Guile support.
As with the Python updates, we don't want to permit this code to
swallow the exception, gdb_exception_forced_quit, which is associated
with GDB receiving a SIGTERM.
I've adopted the same solution that I used for Python; whereever
a gdb_exception is caught in try/catch code in the Guile extension
language support, a catch for gdb_exception_forced_quit has been
added; this catch block will simply call quit_force(), which will
cause the necessary cleanups to occur followed by GDB exiting.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-By: Pedro Alves <pedro@palves.net>
See the previous patches in this series for the motivation behind
these changes.
This commit contains updates to Python's QUIT handling. Ideally, we'd
like to throw gdb_exception_forced_quit through the extension
language; I made an attempt to do this for gdb_exception_quit in an
earlier version of this patch, but Pedro pointed out that it is
(almost certainly) not safe to do so.
Still, we definitely don't want to swallow the exception representing
a SIGTERM for GDB, nor do we want to force modules written in the
extension language to have to explicitly handle this case. Since the
idea is for GDB to cleanup and quit for this exception, we'll simply
call quit_force() just as if the gdb_exception_forced_quit propagation
had managed to make it back to the top level.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-By: Pedro Alves <pedro@palves.net>
As described in the previous commit for this series, I became
concerned that there might be instances in which a QUIT (due to either
a SIGINT or SIGTERM) might not cause execution to return to the top
level. In some (though very few) instances, it is okay to not
propagate the exception for a Ctrl-C / SIGINT, but I don't think that
it is ever okay to swallow the exception caused by a SIGTERM.
Allowing that to happen would definitely be a deviation from the
current behavior in which GDB exits upon receipt of a SIGTERM.
I looked at all cases where an exception handler catches a
gdb_exception. Handlers which did NOT need modification were those
which satisifed one or more of the following conditions:
1) There is no call path to maybe_quit() in the try block. I used a
static analysis tool to help make this determination. In
instances where the tool didn't provide an answer of "yes, this
call path can result in maybe_quit() being called", I reviewed it
by hand.
2) The catch block contains a throw for conditions that it
doesn't want to handle; these "not handled" conditions
must include the quit exception and the new "forced quit" exception.
3) There was (also) a catch for gdb_exception_quit.
Any try/catch blocks not meeting the above conditions could
potentially swallow a QUIT exception.
My first thought was to add catch blocks for gdb_exception_quit and
then rethrow the exception. But Pedro pointed out that this can be
handled without adding additional code by simply catching
gdb_exception_error instead. That's what this patch series does.
There are some oddball cases which needed to be handled differently,
plus the extension languages, but those are handled in later patches.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-by: Pedro Alves <pedro@palves.net>
When a GDB process receives the SIGTERM signal, handle_sigterm() in
event-top.c is called. The global variable 'sync_quit_force_run' is
set by this signal handler. It does some other things too, but the
setting of this global is the important bit for the SIGTERM part of
this discussion.
GDB will periodically check to see whether a Ctrl-C or SIGTERM has
been received. This is performed via use of the QUIT macro in
GDB's code. QUIT is defined to invoke maybe_quit(), which will be
periodically called during any lengthy operation. This is supposed to
ensure that the user won't have to wait too long for a Ctrl-C or
SIGTERM to be acted upon.
When a Ctrl-C / SIGINT is received, quit_handler() will decide whether
to pass the SIGINT onto the inferior or to call quit() which causes
gdb_exception_quit to be thrown. This exception (usually) propagates
to the top level. Control is then returned to the top level event
loop.
At the moment, SIGTERM is handled very differently. Instead of
throwing an exception, quit_force() is called. This does eventually
cause GDB to exit(), but prior to that happening, the inferiors
are killed or detached and other target related cleanup occurs.
As shown in this discussion between Pedro Alves and myself...
https://sourceware.org/pipermail/gdb-patches/2021-July/180802.htmlhttps://sourceware.org/pipermail/gdb-patches/2021-July/180902.htmlhttps://sourceware.org/pipermail/gdb-patches/2021-July/180903.html
...we found that it is possible for inferior_ptid and current_thread_
to get out of sync. When that happens, the "current_thread_ != nullptr"
assertion in inferior_thread() can fail resulting in a GDB internal
error.
Pedro recommended that we "let the normal quit exception propagate all
the way to the top level, and then have the top level call quit_force
if sync_quit_force_run is set." However, after the v2 series for this
patch set, we tweaked that idea by introducing a new exception for
handling SIGTERM.
This commit implements the obvious part of Pedro's suggestion:
Instead of calling quit_force from quit(), throw_forced_quit() is now
called instead. This causes the new exception 'gdb_exception_forced_quit'
to be thrown.
At the top level, I changed catch_command_errors() and captured_main()
to catch gdb_exception_forced_quit and then call quit_force() from the
catch block. I also changed start_event_loop() to also catch
gdb_exception_forced_quit; while we could also call quit_force() from
that catch block, it's sufficient to simply rethrow the exception
since it'll be caught by the newly added code in captured_main().
Making these changes fixed the failure / regression that I was seeing
for gdb.base/gdb-sigterm.exp when run on a machine with glibc-2.34.
However, there are many other paths back to the top level which this
test case does not test. I did an audit of all of the try / catch
code in GDB in which calls in the try-block might (eventually) call
QUIT. I found many cases where gdb_exception_quit and the new
gdb_exception_forced_quit will be swallowed. (When using GDB, have
you ever hit Ctrl-C and not have it do anything; if so, it could be
due to a swallowed gdb_exception_quit in one of the cases I've
identified.) The rest of the patches in this series deal with this
concern.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-by: Pedro Alves <pedro@palves.net>
This commit adds a new exception 'gdb_exception_forced_quit', reason
code 'REASON_FORCED_QUIT', return mask 'RETURN_MASK_FORCED_QUIT', and
a wrapper for throwing the exception, throw_forced_quit().
The addition of this exception plus supporting code will allow us to
recognize that a SIGTERM has been received by GDB and then propagate
recognition of that fact to the upper levels of GDB where it can be
correctly handled. At the moment, when GDB receives a SIGTERM, it
will attempt to exit via a series of calls from the QUIT checking
code. However, before it can exit, it must do various cleanups, such
as killing or detaching all inferiors. Should these cleanups be
attempted while GDB is executing very low level code, such as reading
target memory from within ps_xfer_memory(), it can happen that some of
GDB's state is out of sync with regard to the cleanup code's
expectations. In the case just mentioned, it's been observed that
inferior_ptid and the current_thread_ are not in sync; this triggers
an assert / internal error.
This commit only introduces the exception plus supporting machinery;
changes which use this new exception are in later commits in this
series.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=26761
Tested-by: Tom de Vries <tdevries@suse.de>
Approved-by: Pedro Alves <pedro@palves.net>
Hannes filed a bug showing a crash, where a pretty-printer written in
Python could cause a use-after-free. He sent a patch, but I thought a
different approach was needed.
In a much earlier patch (see bug #12533), we changed the Python code
to release new values from the value chain when constructing a
gdb.Value. The rationale for this is that if you write a command that
does a lot of computations in a loop, all the values will be kept live
by the value chain, resulting in gdb using a large amount of memory.
However, suppose a value is passed to Python from some code in gdb
that needs to use the value after the call into Python. In this
scenario, value_to_value_object will still release the value -- and
because gdb code doesn't generally keep strong references to values (a
consequence of the ancient decision to use the value chain to avoid
memory management), this will result in a use-after-free.
This scenario can happen, as it turns out, when a value is passed to
Python for pretty-printing. Now, normally this route boxes the value
via value_to_value_object_no_release, avoiding the problematic release
from the value chain. However, if you then call Value.cast, the
underlying value API might return the same value, when is then
released from the chain.
This patch fixes the problem by changing how value boxing is done.
value_to_value_object no longer removes a value from the chain.
Instead, every spot in gdb that might construct new values uses a
scoped_value_mark to ensure that the requirements of bug #12533 are
met. And, because incoming values aren't ever released from the chain
(the Value.cast one comes earlier on the chain than the
scoped_value_mark), the bug can no longer occur. (Note that many
spots in the Python layer already take this approach, so not many
places needed to be touched.)
In the future I think we should replace the use of raw "value *" with
value_ref_ptr pretty much everywhere. This will ensure lifetime
safety throughout gdb.
The test case in this patch comes from Hannes' original patch. I only
made a trivial ("require") change to it. However, while this fails
for him, I can't make it fail on this machine; nevertheless, he tried
my patch and reported the bug as being fixed.
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=30044
After the previous patches, I believe this observer isn't necessary
anymore for anything. Remove it.
Change-Id: Idb33fb6b6f55589c8c523a92169b3ca95a23d0b9
With:
- catch a fork in thread 1
- select thread 2
- set follow-fork child
- next
... follow_fork notices that thread 1 had last stopped for a fork
which hasn't been followed yet, and because thread 1 is not the
current thread, GDB aborts the execution command, presenting the stop
in thread 1.
That makes sense, as only the forking thread (thread 1) survives in
the child, so better stop and let the user decide how to proceed.
However, with:
- catch a fork in thread 1
- select thread 2
- set follow-fork parent << note difference here
- next
... GDB does the same: follow_fork notices that thread 1 had last
stopped for a fork which hasn't been followed yet, and because thread
1 is not the current thread, GDB aborts the execution command,
presenting the stop in thread 1.
Aborting/stopping in this case doesn't make sense to me. As we're
following the parent, thread 2 will still continue to exist in the
parent. What the child does after we've followed the parent shouldn't
matter -- it can go on running free, be detached, etc., depending on
"set schedule-multiple", "set detach-on-fork", etc. That does not
influence the execution command that the user issued for the parent
thread.
So this patch changes GDB in that direction -- in follow_fork, if
following the parent, and we've switched threads meanwhile, switch
back to the unfollowed thread, follow it (stay with the parent), and
don't abort/stop. If we're following a fork (as opposed to vfork),
then switch back again to the thread that the user was trying to
resume. If following a vfork, however, stay with the vforking-thread
selected, as we will need to see a vfork_done event first, before we
can resume any other thread.
As I was working on this, I managed to end up calling target_resume
for a solo-thread resume (to collect the vfork_done event), with
scope_ptid pointing at the vfork parent thread, and inferior_ptid
pointing to the vfork child. For a solo-thread resume, the scope_ptid
argument to target_resume must the same as inferior_ptid. The mistake
was caught by the assertion in target_resume, like so:
...
[infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [1722839.1722839.0] at 0x5555555553c3
[infrun] do_target_resume: resume_ptid=1722839.1722939.0, step=0, sig=GDB_SIGNAL_0
../../src/gdb/target.c:2661: internal-error: target_resume: Assertion `inferior_ptid.matches (scope_ptid)' failed.
...
but I think it doesn't hurt to catch such a mistake earlier, hence the
change in internal_resume_ptid.
Change-Id: I896705506a16d2488b1bfb4736315dd966f4e412