binutils-gdb/gdb/testsuite/gdb.dwarf2/symbol_needs_eval.c

26 lines
856 B
C
Raw Normal View History

Replace the symbol needs evaluator with a parser This patch addresses a design problem with the symbol_needs_eval_context class. It exposes the problem by introducing two new testsuite test cases. To explain the issue, I first need to explain the dwarf_expr_context class that the symbol_needs_eval_context class derives from. The intention behind the dwarf_expr_context class is to commonize the DWARF expression evaluation mechanism for different evaluation contexts. Currently in gdb, the evaluation context can contain some or all of the following information: architecture, object file, frame and compilation unit. Depending on the information needed to evaluate a given expression, there are currently three distinct DWARF expression evaluators:  - Frame: designed to evaluate an expression in the context of a call    frame information (dwarf_expr_executor class). This evaluator doesn't    need a compilation unit information.  - Location description: designed to evaluate an expression in the    context of a source level information (dwarf_evaluate_loc_desc    class). This evaluator expects all information needed for the    evaluation of the given expression to be present.  - Symbol needs: designed to answer a question about the parts of the    context information required to evaluate a DWARF expression behind a    given symbol (symbol_needs_eval_context class). This evaluator    doesn't need a frame information. The functional difference between the symbol needs evaluator and the others is that this evaluator is not meant to interact with the actual target. Instead, it is supposed to check which parts of the context information are needed for the given DWARF expression to be evaluated by the location description evaluator. The idea is to take advantage of the existing dwarf_expr_context evaluation mechanism and to fake all required interactions with the actual target, by returning back dummy values. The evaluation result is returned as one of three possible values, based on operations found in a given expression: - SYMBOL_NEEDS_NONE, - SYMBOL_NEEDS_REGISTERS and - SYMBOL_NEEDS_FRAME. The problem here is that faking results of target interactions can yield an incorrect evaluation result. For example, if we have a conditional DWARF expression, where the condition depends on a value read from an actual target, and the true branch of the condition requires a frame information to be evaluated, while the false branch doesn't, fake target reads could conclude that a frame information is not needed, where in fact it is. This wrong information would then cause the expression to be actually evaluated (by the location description evaluator) with a missing frame information. This would then crash the debugger. The gdb.dwarf2/symbol_needs_eval_fail.exp test introduces this scenario, with the following DWARF expression:                    DW_OP_addr $some_variable                    DW_OP_deref                    # conditional jump to DW_OP_bregx                    DW_OP_bra 4                    DW_OP_lit0                    # jump to DW_OP_stack_value                    DW_OP_skip 3                    DW_OP_bregx $dwarf_regnum 0                    DW_OP_stack_value This expression describes a case where some variable dictates the location of another variable. Depending on a value of some_variable, the variable whose location is described by this expression is either read from a register or it is defined as a constant value 0. In both cases, the value will be returned as an implicit location description on the DWARF stack. Currently, when the symbol needs evaluator fakes a memory read from the address behind the some_variable variable, the constant value 0 is used as the value of the variable A, and the check returns the SYMBOL_NEEDS_NONE result. This is clearly a wrong result and it causes the debugger to crash. The scenario might sound strange to some people, but it comes from a SIMD/SIMT architecture where $some_variable is an execution mask.  In any case, it is a valid DWARF expression, and GDB shouldn't crash while evaluating it. Also, a similar example could be made based on a condition of the frame base value, where if that value is concluded to be 0, the variable location could be defaulted to a TLS based memory address. The gdb.dwarf2/symbol_needs_eval_timeout.exp test introduces a second scenario. This scenario is a bit more abstract due to the DWARF assembler lacking the CFI support, but it exposes a different manifestation of the same problem. Like in the previous scenario, the DWARF expression used in the test is valid:                        DW_OP_lit1                        DW_OP_addr $some_variable                        DW_OP_deref                        # jump to DW_OP_fbreg                        DW_OP_skip 4                        DW_OP_drop                        DW_OP_fbreg 0                        DW_OP_dup                        DW_OP_lit0                        DW_OP_eq                        # conditional jump to DW_OP_drop                        DW_OP_bra -9                        DW_OP_stack_value Similarly to the previous scenario, the location of a variable A is an implicit location description with a constant value that depends on a value held by a global variable. The difference from the previous case is that DWARF expression contains a loop instead of just one branch. The end condition of that loop depends on the expectation that a frame base value is never zero. Currently, the act of faking the target reads will cause the symbol needs evaluator to get stuck in an infinite loop. Somebody could argue that we could change the fake reads to return something else, but that would only hide the real problem. The general impression seems to be that the desired design is to have one class that deals with parsing of the DWARF expression, while there are virtual methods that deal with specifics of some operations. Using an evaluator mechanism here doesn't seem to be correct, because the act of evaluation relies on accessing the data from the actual target with the possibility of skipping the evaluation of some parts of the expression. To better explain the proposed solution for the issue, I first need to explain a couple more details behind the current design: There are multiple places in gdb that handle DWARF expression parsing for different purposes. Some are in charge of converting the expression to some other internal representation (decode_location_expression, disassemble_dwarf_expression and dwarf2_compile_expr_to_ax), some are analysing the expression for specific information (compute_stack_depth_worker) and some are in charge of evaluating the expression in a given context (dwarf_expr_context::execute_stack_op and decode_locdesc). The problem is that all those functions have a similar (large) switch statement that handles each DWARF expression operation. The result of this is a code duplication and harder maintenance. As a step into the right direction to solve this problem (at least for the purpose of a DWARF expression evaluation) the expression parsing was commonized inside of an evaluator base class (dwarf_expr_context). This makes sense for all derived classes, except for the symbol needs evaluator (symbol_needs_eval_context) class. As described previously the problem with this evaluator is that if the evaluator is not allowed to access the actual target, it is not really evaluating. Instead, the desired function of a symbol needs evaluator seems to fall more into expression analysis category. This means that a more natural fit for this evaluator is to be a symbol needs analysis, similar to the existing compute_stack_depth_worker analysis. Another problem is that using a heavyweight mechanism of an evaluator to do an expression analysis seems to be an unneeded overhead. It also requires a more complicated design of the parent class to support fake target reads. The reality is that the whole symbol_needs_eval_context class can be replaced with a lightweight recursive analysis function, that will give more correct result without compromising the design of the dwarf_expr_context class. The analysis treats the expression byte stream as a DWARF operation graph, where each graph node can be visited only once and each operation can decide if the frame context is needed for their evaluation. The downside of this approach is adding of one more similar switch statement, but at least this way the new symbol needs analysis will be a lightweight mechnism and it will provide a correct result for any given DWARF expression. A more desired long term design would be to have one class that deals with parsing of the DWARF expression, while there would be a virtual methods that deal with specifics of some DWARF operations. Then that class would be used as a base for all DWARF expression parsing mentioned at the beginning. This however, requires a far bigger changes that are out of the scope of this patch series. The new analysis requires the DWARF location description for the argc argument of the main function to change in the assembly file gdb.python/amd64-py-framefilter-invalidarg.S. Originally, expression ended with a 0 value byte, which was never reached by the symbol needs evaluator, because it was detecting a stack underflow when evaluating the operation before. The new approach does not simulate a DWARF stack anymore, so the 0 value byte needs to be removed because it makes the DWARF expression invalid. gdb/ChangeLog: * dwarf2/loc.c (class symbol_needs_eval_context): Remove. (dwarf2_get_symbol_read_needs): New function. (dwarf2_loc_desc_get_symbol_read_needs): Remove. (locexpr_get_symbol_read_needs): Use dwarf2_get_symbol_read_needs. gdb/testsuite/ChangeLog: * gdb.python/amd64-py-framefilter-invalidarg.S : Update argc DWARF location expression. * lib/dwarf.exp (_location): Handle DW_OP_fbreg. * gdb.dwarf2/symbol_needs_eval.c: New file. * gdb.dwarf2/symbol_needs_eval_fail.exp: New file. * gdb.dwarf2/symbol_needs_eval_timeout.exp: New file.
2020-08-14 18:28:13 +08:00
/* This testcase is part of GDB, the GNU debugger.
Copyright 2017-2022 Free Software Foundation, Inc.
Replace the symbol needs evaluator with a parser This patch addresses a design problem with the symbol_needs_eval_context class. It exposes the problem by introducing two new testsuite test cases. To explain the issue, I first need to explain the dwarf_expr_context class that the symbol_needs_eval_context class derives from. The intention behind the dwarf_expr_context class is to commonize the DWARF expression evaluation mechanism for different evaluation contexts. Currently in gdb, the evaluation context can contain some or all of the following information: architecture, object file, frame and compilation unit. Depending on the information needed to evaluate a given expression, there are currently three distinct DWARF expression evaluators:  - Frame: designed to evaluate an expression in the context of a call    frame information (dwarf_expr_executor class). This evaluator doesn't    need a compilation unit information.  - Location description: designed to evaluate an expression in the    context of a source level information (dwarf_evaluate_loc_desc    class). This evaluator expects all information needed for the    evaluation of the given expression to be present.  - Symbol needs: designed to answer a question about the parts of the    context information required to evaluate a DWARF expression behind a    given symbol (symbol_needs_eval_context class). This evaluator    doesn't need a frame information. The functional difference between the symbol needs evaluator and the others is that this evaluator is not meant to interact with the actual target. Instead, it is supposed to check which parts of the context information are needed for the given DWARF expression to be evaluated by the location description evaluator. The idea is to take advantage of the existing dwarf_expr_context evaluation mechanism and to fake all required interactions with the actual target, by returning back dummy values. The evaluation result is returned as one of three possible values, based on operations found in a given expression: - SYMBOL_NEEDS_NONE, - SYMBOL_NEEDS_REGISTERS and - SYMBOL_NEEDS_FRAME. The problem here is that faking results of target interactions can yield an incorrect evaluation result. For example, if we have a conditional DWARF expression, where the condition depends on a value read from an actual target, and the true branch of the condition requires a frame information to be evaluated, while the false branch doesn't, fake target reads could conclude that a frame information is not needed, where in fact it is. This wrong information would then cause the expression to be actually evaluated (by the location description evaluator) with a missing frame information. This would then crash the debugger. The gdb.dwarf2/symbol_needs_eval_fail.exp test introduces this scenario, with the following DWARF expression:                    DW_OP_addr $some_variable                    DW_OP_deref                    # conditional jump to DW_OP_bregx                    DW_OP_bra 4                    DW_OP_lit0                    # jump to DW_OP_stack_value                    DW_OP_skip 3                    DW_OP_bregx $dwarf_regnum 0                    DW_OP_stack_value This expression describes a case where some variable dictates the location of another variable. Depending on a value of some_variable, the variable whose location is described by this expression is either read from a register or it is defined as a constant value 0. In both cases, the value will be returned as an implicit location description on the DWARF stack. Currently, when the symbol needs evaluator fakes a memory read from the address behind the some_variable variable, the constant value 0 is used as the value of the variable A, and the check returns the SYMBOL_NEEDS_NONE result. This is clearly a wrong result and it causes the debugger to crash. The scenario might sound strange to some people, but it comes from a SIMD/SIMT architecture where $some_variable is an execution mask.  In any case, it is a valid DWARF expression, and GDB shouldn't crash while evaluating it. Also, a similar example could be made based on a condition of the frame base value, where if that value is concluded to be 0, the variable location could be defaulted to a TLS based memory address. The gdb.dwarf2/symbol_needs_eval_timeout.exp test introduces a second scenario. This scenario is a bit more abstract due to the DWARF assembler lacking the CFI support, but it exposes a different manifestation of the same problem. Like in the previous scenario, the DWARF expression used in the test is valid:                        DW_OP_lit1                        DW_OP_addr $some_variable                        DW_OP_deref                        # jump to DW_OP_fbreg                        DW_OP_skip 4                        DW_OP_drop                        DW_OP_fbreg 0                        DW_OP_dup                        DW_OP_lit0                        DW_OP_eq                        # conditional jump to DW_OP_drop                        DW_OP_bra -9                        DW_OP_stack_value Similarly to the previous scenario, the location of a variable A is an implicit location description with a constant value that depends on a value held by a global variable. The difference from the previous case is that DWARF expression contains a loop instead of just one branch. The end condition of that loop depends on the expectation that a frame base value is never zero. Currently, the act of faking the target reads will cause the symbol needs evaluator to get stuck in an infinite loop. Somebody could argue that we could change the fake reads to return something else, but that would only hide the real problem. The general impression seems to be that the desired design is to have one class that deals with parsing of the DWARF expression, while there are virtual methods that deal with specifics of some operations. Using an evaluator mechanism here doesn't seem to be correct, because the act of evaluation relies on accessing the data from the actual target with the possibility of skipping the evaluation of some parts of the expression. To better explain the proposed solution for the issue, I first need to explain a couple more details behind the current design: There are multiple places in gdb that handle DWARF expression parsing for different purposes. Some are in charge of converting the expression to some other internal representation (decode_location_expression, disassemble_dwarf_expression and dwarf2_compile_expr_to_ax), some are analysing the expression for specific information (compute_stack_depth_worker) and some are in charge of evaluating the expression in a given context (dwarf_expr_context::execute_stack_op and decode_locdesc). The problem is that all those functions have a similar (large) switch statement that handles each DWARF expression operation. The result of this is a code duplication and harder maintenance. As a step into the right direction to solve this problem (at least for the purpose of a DWARF expression evaluation) the expression parsing was commonized inside of an evaluator base class (dwarf_expr_context). This makes sense for all derived classes, except for the symbol needs evaluator (symbol_needs_eval_context) class. As described previously the problem with this evaluator is that if the evaluator is not allowed to access the actual target, it is not really evaluating. Instead, the desired function of a symbol needs evaluator seems to fall more into expression analysis category. This means that a more natural fit for this evaluator is to be a symbol needs analysis, similar to the existing compute_stack_depth_worker analysis. Another problem is that using a heavyweight mechanism of an evaluator to do an expression analysis seems to be an unneeded overhead. It also requires a more complicated design of the parent class to support fake target reads. The reality is that the whole symbol_needs_eval_context class can be replaced with a lightweight recursive analysis function, that will give more correct result without compromising the design of the dwarf_expr_context class. The analysis treats the expression byte stream as a DWARF operation graph, where each graph node can be visited only once and each operation can decide if the frame context is needed for their evaluation. The downside of this approach is adding of one more similar switch statement, but at least this way the new symbol needs analysis will be a lightweight mechnism and it will provide a correct result for any given DWARF expression. A more desired long term design would be to have one class that deals with parsing of the DWARF expression, while there would be a virtual methods that deal with specifics of some DWARF operations. Then that class would be used as a base for all DWARF expression parsing mentioned at the beginning. This however, requires a far bigger changes that are out of the scope of this patch series. The new analysis requires the DWARF location description for the argc argument of the main function to change in the assembly file gdb.python/amd64-py-framefilter-invalidarg.S. Originally, expression ended with a 0 value byte, which was never reached by the symbol needs evaluator, because it was detecting a stack underflow when evaluating the operation before. The new approach does not simulate a DWARF stack anymore, so the 0 value byte needs to be removed because it makes the DWARF expression invalid. gdb/ChangeLog: * dwarf2/loc.c (class symbol_needs_eval_context): Remove. (dwarf2_get_symbol_read_needs): New function. (dwarf2_loc_desc_get_symbol_read_needs): Remove. (locexpr_get_symbol_read_needs): Use dwarf2_get_symbol_read_needs. gdb/testsuite/ChangeLog: * gdb.python/amd64-py-framefilter-invalidarg.S : Update argc DWARF location expression. * lib/dwarf.exp (_location): Handle DW_OP_fbreg. * gdb.dwarf2/symbol_needs_eval.c: New file. * gdb.dwarf2/symbol_needs_eval_fail.exp: New file. * gdb.dwarf2/symbol_needs_eval_timeout.exp: New file.
2020-08-14 18:28:13 +08:00
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>. */
int exec_mask = 1;
int
main (void)
{
asm volatile ("main_label: .globl main_label");
return 0;
}