read_a_source_file() would bump line numbers only when seeing a newline,
whereas is_end_of_line[] indicates further end-of-line characters, in
particular the nul character. s_linefile() attempts to compensate for
the bump, but was too aggressive with this so far: It should only adjust
when a newline ends the line. To facilitate such a check, the check for
nothing else on the line needs to move ahead, which luckily is easily
possible: The relevant two conditions match, and the function can
simply return from the body of that earlier instance of the conditional.
The more strict treatment in s_linefile() then requires an adjustment
to buffer_and_nest()'s invocation of the function: The line terminator
now needs to be a newline, not nul.
Move all the function local static state variables to file scope,
in order to tidy memory on exit and to reinit everything for that
annoying oss-fuzz. Also fix a couple memory leaks.
* read.h (read_begin, read_end): Declare.
* read.c (read_begin): Call stabs_begin.
(read_end): Call stabs_end.
* stabs.c (stabs_begin, stabs_end): New functions.
(in_dot_func_p): Delete, use current_function_label instead.
(cached_sec): Move from s_stab_generic.
(last_asm_file, file_label_count): Move from generate_asm_file.
(line_label_count, prev_lineno, prev_line_file): Move from
stabs_generate_asm_lineno.
(void_emitted_p): Move from stabs_generate_asm_func.
(endfunc_label_count): Move from stabs_generate_asm_endfunc.
(stabs_generate_asm_lineno): Simplify setting of
prev_line_file.
(stabs_generate_asm_func): Don't leak current_function_label.
(stabs_generate_asm_endfunc): Likewise.
This tidies memory allocated for entries in macro_hash. Freeing the
macro name requires a little restructuring of the define_macro
interface due to the name being used in the error message, and exposed
the fact that the name and other fields were not initialised by the
iq2000 backend.
There is also a fix for
.macro .macro
.endm
.macro .macro
.endm
which prior to this patch reported
mac.s:1: Warning: attempt to redefine pseudo-op `.macro' ignored
mac.s:3: Error: Macro `.macro' was already defined
rather than reporting the attempt to redefine twice.
* macro.c (macro_del_f): New function.
(macro_init): Use it when creating macro_hash.
(free_macro): Free macro name too.
(define_macro): Return the macro_entry, remove idx, file, line and
namep params. Call as_where. Report errors here. Delete macro
from macro_hash on attempt to redefined pseudo-op.
(delete_macro): Don't call free_macro.
* macro.h (define_macro): Update prototype.
* read.c (s_macro): Adjust to suit.
* config/tc-iq2000.c (iq2000_add_macro): Init all fields of
macro_entry.
The newer update-copyright.py fixes file encoding too, removing cr/lf
on binutils/bfdtest2.c and ld/testsuite/ld-cygwin/exe-export.exp, and
embedded cr in binutils/testsuite/binutils-all/ar.exp string match.
Not so long ago we started to insert these artificially when expanding
certain macro-like constructs; zap them as cluttering what actually
results from user input.
These two macros print either a 16 digit hex number or an 8 digit
hex number. Unfortunately they depend on both target and host, which
means that the output for 32-bit targets may be either 8 or 16 hex
digits.
Replace them in most cases with code that prints a bfd_vma using
PRIx64. In some cases, deliberately lose the leading zeros.
This change some output, notably in base/offset fields of m68k
disassembly which I think looks better that way, and in error
messages. I've kept leading zeros in symbol dumps (objdump -t)
and in PE header dumps.
bfd/
* bfd-in.h (fprintf_vma, sprintf_vma, printf_vma): Delete.
* bfd-in2.h: Regenerate.
* bfd.c (bfd_sprintf_vma): Don't use sprintf_vma.
(bfd_fprintf_vma): Don't use fprintf_vma.
* coff-rs6000.c (xcoff_reloc_type_tls): Don't use sprintf_vma.
Instead use PRIx64 to print bfd_vma values.
(xcoff_ppc_relocate_section): Likewise.
* cofflink.c (_bfd_coff_write_global_sym): Likewise.
* mmo.c (mmo_write_symbols_and_terminator): Likewise.
* srec.c (srec_write_symbols): Likewise.
* elf32-xtensa.c (print_r_reloc): Similarly for fprintf_vma.
* pei-x86_64.c (pex64_dump_xdata): Likewise.
(pex64_bfd_print_pdata_section): Likewise.
* som.c (som_print_symbol): Likewise.
* ecoff.c (_bfd_ecoff_print_symbol): Use bfd_fprintf_vma.
opcodes/
* dis-buf.c (perror_memory, generic_print_address): Don't use
sprintf_vma. Instead use PRIx64 to print bfd_vma values.
* i386-dis.c (print_operand_value, print_displacement): Likewise.
* m68k-dis.c (print_base, print_indexed): Likewise.
* ns32k-dis.c (print_insn_arg): Likewise.
* ia64-gen.c (_opcode_int64_low, _opcode_int64_high): Delete.
(opcode_fprintf_vma): Delete.
(print_main_table): Use PRIx64 to print opcode.
binutils/
* od-macho.c: Replace all uses of printf_vma with bfd_printf_vma.
* objcopy.c (copy_object): Don't use sprintf_vma. Instead use
PRIx64 to print bfd_vma values.
(copy_main): Likewise.
* readelf.c (CHECK_ENTSIZE_VALUES): Likewise.
(dynamic_section_mips_val): Likewise.
(print_vma): Don't use printf_vma. Instead use PRIx64 to print
bfd_vma values.
(dump_ia64_vms_dynamic_fixups): Likewise.
(process_version_sections): Likewise.
* rddbg.c (stab_context): Likewise.
gas/
* config/tc-i386.c (offset_in_range): Don't use sprintf_vma.
Instead use PRIx64 to print bfd_vma values.
(md_assemble): Likewise.
* config/tc-mips.c (load_register, macro): Likewise.
* messages.c (as_internal_value_out_of_range): Likewise.
* read.c (emit_expr_with_reloc): Likewise.
* config/tc-ia64.c (note_register_values): Don't use fprintf_vma.
Instead use PRIx64 to print bfd_vma values.
(print_dependency): Likewise.
* listing.c (list_symbol_table): Use bfd_sprintf_vma.
* symbols.c (print_symbol_value_1): Use %p to print pointers.
(print_binary): Likewise.
(print_expr_1): Use PRIx64 to print bfd_vma values.
* write.c (print_fixup): Use %p to print pointers. Don't use
fprintf_vma.
* testsuite/gas/all/overflow.l: Update expected output.
* testsuite/gas/m68k/mcf-mov3q.d: Likewise.
* testsuite/gas/m68k/operands.d: Likewise.
* testsuite/gas/s12z/truncated.d: Likewise.
ld/
* deffilep.y (def_file_print): Don't use fprintf_vma. Instead
use PRIx64 to print bfd_vma values.
* emultempl/armelf.em (gld${EMULATION_NAME}_finish): Don't use
sprintf_vma. Instead use PRIx64 to print bfd_vma values.
* emultempl/pe.em (gld${EMULATION_NAME}_finish): Likewise.
* ldlang.c (lang_map): Use %V to print region origin.
(lang_one_common): Don't use sprintf_vma.
* ldmisc.c (vfinfo): Don't use fprintf_vma or sprintf_vma.
* pe-dll.c (pe_dll_generate_def_file): Likewise.
gdb/
* remote.c (remote_target::trace_set_readonly_regions): Replace
uses of sprintf_vma with bfd_sprintf_vma.
Allows register names to appear in symbol assignments, so for example
tocp = %r2
mr %r3,tocp
now assembles.
* gas/config/tc-ppc.c (REG_NAME_CNT): Delete, replace uses with
ARRAY_SIZE.
(register_name): Rename to..
(md_operand): ..this. Only handle %reg.
(cr_names): Rename to..
(cr_cond): ..this. Just keep conditions.
(ppc_parse_name): Add mode param. Search both cr_cond and
pre_defined_registers. Handle absolute and register symbol
values here rather than in expr.c:operand().
(md_assemble): Don't special case register name matching in
operands, except to set cr_operand as appropriate.
* gas/config/tc-ppc.h (md_operand): Don't define.
(md_parse_name, ppc_parse_name): Update.
* read.c (pseudo_set): Copy over entire O_register value.
* testsuite/gas/ppc/regsyms.d.
* testsuite/gas/ppc/regsyms.s: New test.
* testsuite/gas/ppc/ppc.exp: Run it.
So that the notes obstack can be used for persistent storage in
parse_args.
* as.c (parse_args): Use notes_alloc and notes_strdup.
(free_notes): New function.
(main): Init notes obstack, and arrange to be freed on exit.
* read.c (read_begin): Don't init notes obstack.
(read_end): Free cond_obstack.
* subsegs.c (subsegs_end): Don't free cond_obstack or notes.
po_hash code duplicates the str_hash code in hash.h for no good reason.
* read.c (struct po_entry, po_entry_t): Delete.
(hash_po_entry, eq_po_entry, po_entry_alloc, po_entry_find): Delete.
(pop_insert): Use str_hash_insert.
(pobegin): Use str_htab_create.
(read_a_source_file, s_macro): Use str_hash_find.
read_symbol_name mallocs the string it returns. Free it when done.
* read.c (read_symbol_name): Free name on error path.
* config/tc-ppc.c (ppc_GNU_visibility): Free name returned from
read_symbol_name.
(ppc_extern, ppc_globl, ppc_weak): Likewise.
This fixes some horrible code using do_scrub_chars. What we had ran
text through do_scrub_chars twice, directly in read_a_source_file and
again via the input_scrub_include_sb call. That's silly, and since
do_scrub_chars is a state machine, possibly wrong. More silliness is
evident in the temporary malloc'd buffer for do_scrub_chars output,
which should have been written directly to sbuf.
So, get rid of the do_scrub_chars call and support functions, leaving
scrubbing to input_scrub_include_sb. I did wonder about #NO_APP
overlapping input_scrub_next_buffer buffers, but that should only
happen if the string starts in one file and finishes in another.
* read.c (scrub_string, scrub_string_end): Delete.
(scrub_from_string): Delete.
(read_a_source_file): Rewrite #APP processing.
Compilers would put decimal numbers there, so I think we should treat
finding octal numbers the same as finding bignums - ignore them as
actually being comments of some very specific form.
Any construct which to the scrubber looks like a C preprocessor
line/file "directive" is converted to .linefile, but the amount of
checking the scrubber does is minimal (albeit it does let through only
decimal digits for the line part of the contruct). Since the scrubber
conversion is further tied to # being a line comment character, anything
which upon closer inspection turns out not to be a line/file "directive"
is supposed to be treated as a comment, i.e. ignored. Therefore we
cannot use get_absolute_expression(), as this may raise errors. Open-
code the function instead, treating everything not resulting in
O_constant as a comment as well.
Furthermore also bounds-check the parsed value. This bounds check tries
to avoid implementation defined behavior (which may be the raising of an
implementation defined signal), but for now makes the assumption that
int has less than 64 bits. The way bfd_signed_vma (which is what offsetT
aliases) is defined in bfd.h for the BFD64 case I cannot really see a
clean way of avoiding this assumption. Omitting the #ifdef, otoh, would
risk "condition is always false" warnings by compilers.
Convert get_linefile_number() to return bool at this occasion as well.
do_repeat_with_expander() already deals with the "no expander" case
quite fine, so there's really little point having two functions. What it
lacks compared with do_repeat() is a call to sb_build(), which can
simply be moved (and the then redundant sb_new() be avoided). Along with
this moving also flip if the main if()'s condition such that the "no
expander" case is handled first.
These were used originally to represent "# <line> <file>" constructs
inserted by (typically) compilers when pre-processing. Quite some time
ago they were replaced by .linefile though. Since the original
directives were never documented, we ought to be able to remove support
for them. As a result in a number of case function parameter aren't used
anymore and can hence be dropped.
Commit 7992631e8c ("gas/Dwarf: improve debug info generation from .irp
and alike blocks"), while dealing okay with actual assembly source files
not using .file/.line and alike outside but not inside of .macro, has
undue effects when the logical file/line pair was already overridden:
Line numbers would continuously increment while processing the expanded
macro, while the goal of the PR gas/16908 workaround is to keep the
expansion associated with the line invoking the macro. However, as soon
as enough state was overridden _inside_ the macro to cause as_where() to
no longer fall back top as_where_physical(), honor this by resuming the
bumping of the logical line number.
Note that from_sb_is_expansion's initializer was 1 for an unknown
reason. While renaming the variable and changing its type, also change
the initializer to "expanding_none", which would have been "0" in the
original code. Originally the initializer value itself wasn't ever used
anyway (requiring sb_index != -1), as it necessarily had changed in
input_scrub_include_sb() alongside setting sb_index to other than -1.
Strictly speaking input_scrub_insert_line() perhaps shouldn't use
expanding_none, yet none of the other enumerators fit there either. And
then strictly speaking that function probably shouldn't exist in the
first place. It's used only by tic54x.
Commit 7992631e8c ("gas/Dwarf: improve debug info generation from .irp
and alike blocks"), while dealing okay with actual assembly source files
not using .file/.line and alike outside but not inside of .irp et al,
has undue effects when the logical file/line pair was already
overridden: Line numbers would continuously increment upon every
iteration, thus potentially getting far off. Furthermore it left it to
the user to actually insert .file/.line inside such constructs. Note
though that before aforementioned change things weren't pretty either:
Diagnostics (and debug info) would be associated with the directive
terminating the iteration construct, rather than with the actual lines.
Handle this automatically by simply latching the present line and then
re-instating coordinates first thing on every iteration; note that the
file can't change from what was previously pushed on the scrubber's
state stack, and hence can be taken from there by using a new flavor of
.linefile (which is far better memory-footprint-wise than recording the
full path in the inserted directive). (This then leaves undisturbed any
file/line control occurring in the body of the construct, as these will
only be seen and processed afterwards.)
The change in read_a_source_file prevents the particular testcase in
the PR from triggering the assertion in demand_empty_rest_of_line.
I've also removed the assertion. Nothing much goes wrong with gas if
something else triggers it, so it's not worthy of an abort.
I've also changed my previous patch to ignore_rest_of_line to allow
that function to increment input_line_pointer past buffer_limit, like
demand_empty_rest_of_line: The two functions ought to behave the
same in that respect. Finally, demand_empty_rest_of_line gets a
little hardening to prevent accesses past buffer_limit plus one.
PR 28979
* read.c (read_a_source_file): Calculate known size for sbuf
rather than calling strlen.
(demand_empty_rest_of_line): Remove "know" check. Expand comment.
Don't dereference input_line_pointer when past buffer_limit.
(ignore_rest_of_line): Allow input_line_pointer to increment to
buffer_limit plus one. Expand comment.
operand() is not a place that should be calling ignore_rest_of_line.
ignore_rest_of_line shouldn't increment input_line_pointer if already
at buffer limit.
* expr.c (operand): Don't call ignore_rest_of_line.
* read.c (s_mri_common): Likewise.
(ignore_rest_of_line): Don't increment input_line_pointer if
already at buffer_limit.
Much of the gas source and older BFD source use "long" for function
parameters and variables, when other types would be more appropriate.
This patch fixes one of those cases. Dollar labels and numeric local
labels do not need large numbers. Small positive itegers are usually
all that is required. Due to allowing longs, it was possible for
fb_label_name and dollar_label_name to overflow their buffers.
* symbols.c: Delete unnecessary forward declarations.
(dollar_labels, dollar_label_instances): Use unsigned int.
(dollar_label_defined, dollar_label_instance): Likewise.
(define_dollar_label): Likewise.
(fb_low_counter, fb_labels, fb_label_instances): Likewise.
(fb_label_instance_inc, fb_label_instance): Likewise.
(fb_label_count, fb_label_max): Make them size_t.
(dollar_label_name, fb_label_name): Rewrite using sprintf.
* symbols.h (dollar_label_defined): Update prototype.
(define_dollar_label, dollar_label_name): Likewise.
(fb_label_instance_inc, fb_label_name): Likewise.
* config/bfin-lex.l (yylex): Remove unnecessary casts.
* expr.c (integer_constant): Likewise.
* read.c (read_a_source_file): Limit numeric label range to int.
There are quite a few ubsan warnings in gas. This one disappears with
a code tidy.
* read.c (s_app_line): Rename 'l' to 'linenum'. Avoid ubsan
warning.
The result of running etc/update-copyright.py --this-year, fixing all
the files whose mode is changed by the script, plus a build with
--enable-maintainer-mode --enable-cgen-maint=yes, then checking
out */po/*.pot which we don't update frequently.
The copy of cgen was with commit d1dd5fcc38ead reverted as that commit
breaks building of bfp opcodes files.
Unlike the forms consuming/producing integer data, the floating point
ones so far required the 2nd argument to be present, contrary to
documentation. To avoid code duplication, split float_length() out of
hex_float() (taking the opportunity to adjust error message wording).
This is to be able to generate data acted upon by AVX512-BF16 and
AMX-BF16 insns. While not part of the IEEE standard, the format is
sufficiently standardized to warrant handling in config/atof-ieee.c.
Arm, where custom handling was implemented, may want to leverage this as
well. To be able to also use the hex forms supported for other floating
point formats, a small addition to the generic hex_float() is needed.
Extend existing x86 testcases.
This is to be able to generate data passed to {,V}CVTPH2PS and acted
upon by AVX512-FP16 insns. To be able to also use the hex forms
supported for other floating point formats, a small addition to the
generic hex_float() is needed.
Extend existing x86 testcases.
The ELF psABI-s are quite clear here: On 32-bit the data type is 12
bytes long (with 2 bytes of trailing padding), while on 64-bit it is 16
bytes long (with 6 bytes of padding). Make hex_float() capable of
handling such padding.
Note that this brings the emitted data size of .dc.x / .dcb.x in line
also for non-ELF targets; so far they were different depending on input
format (dec vs hex).
Extend the existing x86 testcases.
The ELF psABI-s are quite clear here: On 32-bit the underlying data type
is 12 bytes long (with 2 bytes of trailing padding), while on 64-bit it
is 16 bytes long (with 6 bytes of padding). Make s_space() capable of
handling 'x' (and 'p') type floating point being other than 12 bytes
wide (also adjusting documentation). This requires duplicating the
definition of X_PRECISION in the target speciifc header; the compiler
would complain if this was out of sync with config/atof-ieee.c.
Note that for now padding space doesn't get separated from actual
storage, which means that things will work correctly only for little-
endian cases, and which also means that by specifying large enough
numbers padding space can be set to non-zero. Since the logic is needed
for a single little-endian architecture only for now, I'm hoping that
this might be acceptable for the time being; otherwise the change will
become more intrusive.
Note also that this brings the emitted data size of .ds.x vs .tfloat in
line for non-ELF targets as well; the issue will be even more obvious
when further taking into account a subsequent patch fixing .dc.x/.dcb.x
(where output sizes currently differ depending on input format).
Extend existing x86 testcases.
Unlike for .dc.? the parsing here failed to skip the colon before
calling hex_float(). To avoid both variants of parsing going out of sync
again, introduce a helper used by both.
While the logic in fixup_segment() so far was off by 1 for fixups
dealing with quantities without known signedness, thus failing to report
an overflow when e.g. a byte-sized resulting value is -0x100, the logic
in emit_expr_with_reloc() reported an overflow even on large negative
values when the respective positive ones would not be warned
about, and when fixup_segment() wouldn't do so either. Such diagnostics
all ought to follow a similar pattern of what value range is acceptable.
(If expressions' X_unsigned was reliably set, emit_expr_with_reloc()'s
checking might make sense to tighten based on that flag.)
Note that with commit 80aab57939 ("Changes to let cons handle bignums
like general expressions") having added handling of nbytes >
sizeof(valueT) in the O_constant case, converting to O_big, the setting
to zero of "hibit" had become dead. With "hibit" no longer used, this
code now gets dropped altogether. But additionally a respective know()
gets inserted.
I've been repeatedly confused by, in particular, the .dc.a potable[]
entry being conditional. Grepping in gas/config/ reveals only very few
targets actually #define-ing it. But as of 7be1c4891a the symbol is
always defined, so #ifdef-s are pointless (and, as said, potentially
confusing).
Also adjust documentation to reflect this.
PR 27381
* read.c (s_incbin): Check that the file to be included is a
regular, non-directory file.
* testsuite/gas/all/pr27381.s: New test source file.
* testsuite/gas/all/pr27381.d: New test control file.
* testsuite/gas/all/pr27381.err: Expected error output for the new test.
* testsuite/gas/all/gas.exp: Run the new test.
This allows alignments up to 2**TC_ALIGN_LIMIT, which might be larger
than an unsigned int can hold.
PR 27101
* read.c (s_align): Use a large enough type for "align" to hold
the result of get_absolute_expression.
* read.c (stringer): Treat space separated, quote enclosed strings
as a single string.
* doc/as.texi (asciz): Mention this behaviour in the description
of the asciz directive.
* testsuite/gas/all/asciz.s: New test.
* testsuite/gas/all/asciz.d: New test driver.
* testsuite/gas/all/gas.exp: Run the new test.