Freeing ecoff_debug_info "pointers to the unswapped symbolic info"
isn't a simple matter, due to differing allocation strategies. In
_bfd_ecoff_slurp_symbolic_info the pointers are to objalloc memory.
In the ecoff linker they are to separately malloc'd memory. In gas we
have most (obj-elf) or all (obj-ecoff) into a single malloc'd buffer.
This patch fixes the leaks for binutils and ld, leaving the gas leaks
for another day. The mips elf backend already had this covered, and
the ecoff backend had a pointer, raw_syments used as a flag, so most
of the patch is moving these around a little so they are accessible
for both ecoff and elf.
include/
* coff/ecoff.h (struct ecoff_debug_info): Add alloc_syments.
bfd/
* libecoff.h (struct ecoff_tdata): Delete raw_syments.
* elfxx-mips.c (free_ecoff_debug): Delete. Replace uses with
_bfd_ecoff_free_ecoff_debug_info.
(_bfd_mips_elf_final_link): Init debug.alloc_syments.
* ecofflink.c (_bfd_ecoff_free_ecoff_debug_info): New function.
* ecoff.c (_bfd_ecoff_bfd_free_cached_info): Call
_bfd_ecoff_free_ecoff_debug_info.
(_bfd_ecoff_slurp_symbolic_info): Replace uses of raw_syments
with alloc_syments.
(ecoff_final_link_debug_accumulate): Likewise. Use
_bfd_ecoff_free_ecoff_debug_info.
(_bfd_ecoff_bfd_copy_private_bfd_data): Set alloc_syments for
copied output.
* elf64-alpha.c (elf64_alpha_read_ecoff_info): Use
_bfd_ecoff_free_ecoff_debug_info.
* libbfd-in.h (_bfd_ecoff_free_ecoff_debug_info): Declare.
* libbfd.h: Regenerate.
gas/
* config/obj-ecoff.c (ecoff_frob_file): Set alloc_syments.
* config/obj-elf.c (elf_frob_file_after_relocs): Likewise.
When configure with triples mipsisa[32,64]rN[el,], the march value
is pinned to a fix value if not given explicitly. for example
1) mipsisa32r6-linux-gnu -n32 xx.s will complains that:
-march=mips32r6 is not compatible with the selected ABI
2) mipsisa64r2el-linux-gnu -o32 generates objects with 64bit CPU:
ELF 32-bit LSB relocatable, MIPS, MIPS64 rel2 version 1 (SYSV)
They are not good default behaviors: Let's alter the CPU info
Since we are using these triples as a regular linux distributions,
let's alter march according to ABI.
Instead they're separators for pseudo-prefixes. Don't insert them in
mnemonic_chars[], handling them explicitly in parse_insn() instead. Note
that this eliminates the need for another separator after a pseudo-
prefix. While maybe not overly interesting for a following real
mnemonic, I view this as quite desirable between multiple successive
pseudo-prefixes (bringing things in line with the other use of figure
braces in AVX512's zeroing-masking).
Drop the unused is_mnemonic_char() at this occasion.
Having to add characters to both arrays can easily lead to oversights.
Consuming extra_symbol_chars[] when populating operand_chars[] also
allows to drop two special cases in md_begin().
Constify operand_special_chars[] at this occasion.
If in a "word ptr <address>" or alike construct the "ptr" part is
double-quoted, it shouldn't be recognized as the specific keyword we're
looking for (just like we don't recognize double-quoted operator or
register names anymore). Be careful though to tell closing from opening
double-quotes, as a quoted symbol may follow right afterwards.
The concept of quoted symbols names was introduced pretty late. Utilize
it to allow access to symbols with names matching that of a register (or,
in Intel syntax, also an identifier-like operator).
This is primarily to aid gcc when generating Intel syntax output; see
their bug target/53929.
96d6e190e9
There are some known limitations for now,
* Do not shrink the length of the uleb128 value, even if the value is reduced
after relaxations. Also reports error if the length grows up.
* The R_RISCV_SET_ULEB128 needs to be paired with and be placed before the
R_RISCV_SUB_ULEB128.
bfd/
* bfd-in2.h: Regenerated.
* elfnn-riscv.c (perform_relocation): Perform R_RISCV_SUB_ULEB128 and
R_RISCV_SET_ULEB128 relocations. Do not shrink the length of the
uleb128 value, and report error if the length grows up. Called the
generic functions, _bfd_read_unsigned_leb128 and _bfd_write_unsigned_leb128,
to encode the uleb128 into the section contents.
(riscv_elf_relocate_section): Make sure that the R_RISCV_SET_ULEB128
must be paired with and be placed before the R_RISCV_SUB_ULEB128.
* elfxx-riscv.c (howto_table): Added R_RISCV_SUB_ULEB128 and
R_RISCV_SET_ULEB128.
(riscv_reloc_map): Likewise.
(riscv_elf_ignore_reloc): New function.
* libbfd.h: Regenerated.
* reloc.c (BFD_RELOC_RISCV_SET_ULEB128, BFD_RELOC_RISCV_SUB_ULEB128):
New relocations to support .uleb128 subtraction.
gas/
* config/tc-riscv.c (md_apply_fix): Added BFD_RELOC_RISCV_SET_ULEB128
and BFD_RELOC_RISCV_SUB_ULEB128.
(s_riscv_leb128): Updated to allow uleb128 subtraction.
(riscv_insert_uleb128_fixes): New function, scan uleb128 subtraction
expressions and insert fixups for them.
(riscv_md_finish): Called riscv_insert_uleb128_fixes for all sections.
include/
* elf/riscv.h ((R_RISCV_SET_ULEB128, (R_RISCV_SUB_ULEB128): Defined.
ld/
* testsuite/ld-riscv-elf/ld-riscv-elf.exp: Updated.
* testsuite/ld-riscv-elf/uleb128*: New testcase for uleb128 subtraction.
binutils/
* testsuite/binutils-all/nm.exp: Updated since RISCV supports .uleb128.
While a442cac508 ("ix86: wrap constants") helped address a number of
inconsistencies between BFD64 and !BFD64 builds, it has also resulted in
certain bogus uses of constants to no longer be warned about. Leverage
the md_optimize_expr() hook to adjust when to actually truncate
expressions to 32 bits - any involvement of binary expressions (which
would be evaluated in 32 bits only when !BFD64) signals the need for
doing so. Plain constants (or ones merely subject to unary operators)
should remain un-truncated - they would be handled as bignums when
!BFD64, and hence are okay to permit.
To compensate
- slightly extend optimize_imm() (to be honest I never understood why
the code being added - or something similar - wasn't there in the
first place),
- adjust expectations of the disp-imm-32 testcase (there are now
warnings, as there should be for any code which won't build [warning-
free] when !BFD64, and Disp8/Imm8 are no longer used in the warned
about cases).
Give backends a chance to see these, just as they can see binary ones.
Most of those which use this hook already cope with NULL being passed
for the left operand (typically because of checking the operator first).
Adjust the two which don't.
Take the opportunity and also document the hook.
In a442cac508 ("ix86: wrap constants") I made the truncation condition
too relaxed: Any indication of a mode that's possible with BFD64 only
should avoid the truncation. Therefore, like in the other two cases of
calls to extend_to_32bit_address(), also check whether we're generating
a 64-bit object.
Well, it doesn't work on x86 or ppc, which both have # starting
comments anywhere on a line. I think it is therefore only useful on
sparc.
PR 11601
* config/obj-elf.c (obj_elf_section_word): Only compile for sparc.
(obj_elf_section): Only support solaris .section directive on
sparc.
* doc/as.texi (Section): Mention that solaris .section
directive is only supported for sparc.
Testing for NULL in pic_need_relax fixes the other call to this
function in md_estimate_size_before_relax.
PR 28955
* config/tc-mips.c (mips_frob_file): Move NULL sym test to..
(pic_need_relax): ..here.
There are two problems: symbol_equated_p() doesn't recognize equates of
registers, and S_CAN_BE_REDEFINED() goes by section rather than by
expression type. Both together undermine .eqv and .equiv clearly meaning
to guard the involved symbols against re-definition (both ways).
To compensate pseudo_set() now using O_symbol and S_CAN_BE_REDEFINED()
now checking for O_register,
- for targets creating register symbols through symbol_{new,create}() ->
symbol_init() -> S_SET_VALUE() (alpha, arc, dlx, ia64, m68k, mips,
mmix, tic4x, tic54x, plus anything using cgen or itbl-ops), have
symbol_init() set their expressions to O_register,
- x86'es parse_register() also can't go by section anymore when
trying to "look through" equates; probably symbol_equated_p() should
have been used there from the beginning, if only that had worked for
equates of registers,
- various targets need to "look through" equates when parsing insn
operands (which also helps transitive forward equates); perhaps even
more ought to, but many don't look to consider the possibility of
register equates in the first place.
This was uncovered by code reported in PR gas/30274 (duplicating
PR gas/30272), except that there .eqv was used when really .equ was
meant. Therefore that bug report is addressed here only in so far as
gas wouldn't crash anymore; the code there still won't assemble
successfully, just that now the issues there are properly diagnosed.
As per the spec merely a blank isn't okay as a separator, the operand
to the relocation function ought to be parenthesized. Enforcing this
then also eliminates an inconsistency in that
lui t0, %hi sym
lui t0, %hi 0x1000
were accepted, but
lui t0, %hi +sym
lui t0, %hi -0x1000
were not.
char is unsigned on s390x, so there are a lot of warnings like:
gas/config/tc-bpf.c: In function 'get_token':
gas/config/tc-bpf.c:900:14: error: comparison is always false due to limited range of data type [-Werror=type-limits]
900 | if (ch == EOF || len > MAX_TOKEN_SZ)
| ^~
Change its type to int, like in the other similar code.
There is also:
gas/config/tc-bpf.c:735:30: error: 'bpf_endianness' may be used uninitialized in this function [-Werror=maybe-uninitialized]
735 | dst, be ? size[endianness - BPF_BE16] : size[endianness - BPF_LE16]);
| ~~~~~~~~~~~^~~~~~~~~~
-Wmaybe-uninitialized doesn't seem to understand the FSM; just
initialize bpf_endianness to silence it. Add an assertion to
build_bpf_endianness() in order to catch potential bugs.
This patch adds support to the GNU assembler for an alternative
assembly syntax used in BPF. This syntax is C-like and very
unconventional for an assembly language, but it is generated by
clang/llvm and is also used in inline asm templates in kernel code, so
we ought to support it.
After this patch, the assembler is able to parse instructions in both
supported syntax: the normal assembly-like syntax and the pseudo-C
syntax. Instruction formats can be mixed in the source program: the
assembler recognizes the right syntax to use.
gas/ChangeLog:
2023-04-20 Guillermo E. Martinez <guillermo.e.martinez@oracle.com>
PR gas/29728
* config/tc-bpf.h (TC_EQUAL_IN_INSN): Define.
* config/tc-bpf.c (LEX_IS_SYMBOL_COMPONENT): Define.
(LEX_IS_WHITESPACE): Likewise.
(LEX_IS_NEWLINE): Likewise.
(LEX_IS_ARITHM_OP): Likewise.
(LEX_IS_STAR): Likewise.
(LEX_IS_CLSE_BR): Likewise.
(LEX_IS_OPEN_BR): Likewise.
(LEX_IS_EQUAL): Likewise.
(LEX_IS_EXCLA): Likewise.
(ST_EOI): Likewise.
(MAX_TOKEN_SZ): Likewise.
(init_pseudoc_lex): New function.
(md_begin): Call init_pseudoc_lex.
(valid_expr): New function.
(build_bpf_non_generic_load): Likewise.
(build_bpf_atomic_insn): Likewise.
(build_bpf_jmp_insn): Likewise.
(build_bpf_arithm_insn): Likewise.
(build_bpf_endianness): Likewise.
(build_bpf_load_store_insn): Likewise.
(look_for_reserved_word): Likewise.
(is_register): Likewise.
(is_cast): Likewise.
(get_token): Likewise.
(bpf_pseudoc_to_normal_syntax): Likewise.
(md_assemble): Try pseudo-C syntax if an instruction cannot be
parsed.
Special casing GPR names in my_getSmallExpression() leads to a number of
inconsistencies. Generalize this by utilizing the md_parse_name() hook,
limited to when instruction operands are being parsed (really: probed).
Then both the GPR lookup there and the yet more ad hoc workaround for
PR/gas 29940 can be removed (including its extension needed for making
the compressed form JAL work again).
With my_getSmallExpression() consistently and silently failing on
relocation operators not fitting an insn, it is no longer necessary to
hand it percent_op_itype[] "just in case" (i.e. to avoid errors when a
subsequent parsing attempt for another operand combination might
succeed). This also eliminates the latent problem of percent_op_itype[]
and percent_op_stype[] growing a non-identical set of recognized
relocation operators.
The use of a wrong (for the insn) relocation operator (or a future one
which simply isn't recognized by older gas yet) doesn't render the (rest
of the) expression "bad". Furthermore alongside the error from
expression() in most cases the parser would emit another error then
anyway. Suppress the call to my_getExpression() in such a case,
arranging for a guaranteed subsequent error message by marking the
expression "illegal".
Both callers check for no relocations, so there's no point parsing for
some. Have the function pass percent_op_null into
my_getSmallExpression(). Note that there's no point passing
percent_op_itype: Elsewhere, especially when processing compressed alias
insns ahead of non-alias ones, this has the effect of avoiding "bad
expression" errors when another parsing pass may follow (and succeed).
Here, however, all alternative forms of an insn type will again start
with the same O4 or O2, so avoiding errors earlier on doesn't really
help. Plus constructs with a relocation specifier (as percent_op_itype
would permit) can't be specified anyway, as the scrubber eats the
whitespace between .insn's type and the O4 or O2 expression when that
starts with % or ( - i.e. these will be seen as e.g. "i%lo(x)", and
riscv_ip() looks only for whitespace when finding the end of a mnemonic.
The sole caller of parse_relocation() has already checked for the %
prefix, so there's no need to check for it again in the strncasecmp()
and there's also no reason to make the involved string literals longer
than necessary.
-mfix-looongson3-llsc may add sync instructions not needed on some
asm code with lots of debug info.
PR: 30153
* gas/config/tc-mips.c(fix_loongson3_llsc): clear logistic.
This reverts the code change done by 100f993c53 ("x86: Check
unbalanced braces in memory reference"), which wrongly identified
e87fb6a6d0 ("x86/gas: support quoted address scale factor in AT&T
syntax") as the root cause of PR gas/30248. (The testcase is left in
place, no matter that it's at best marginally useful in that shape.)
The problem instead is that parse_register() alters the string handed to
it, thus breaking valid assumptions in subsequent parsing code. Since
the function's behavior is a result of get_symbol_name()'s, make a copy
of the incoming string before invoking that function.
Like for parse_real_register() follow the model of strtol() et al: input
string is const-qualified to signal that the string isn't altered, but
the returned "end" pointer is not const-qualified, requiring const to be
cast away (which generally is a bad idea, but the alternative would
again be more convoluted code).
Follow the model of strtol() et al - input string is const-qualified to
signal that the string isn't altered, but the returned "end" pointer is
not const-qualified, requiring const to be cast away (which generally is
a bad idea, but the alternative would be more convoluted code).
tc-aarch64.c:1473:27: runtime error: left shift of 7 by 30 places
cannot be represented in type 'int'.
* config/tc-aarch64.c (parse_vector_reg_list): Avoid UB left
shift.
This commit intends to move operands that require very special handling or
operand types that are so minor (e.g. only useful on a few instructions)
under "W". I also intend this "W" to be "temporary" operand storage until
we can find good two character (or less) operand type.
In this commit, prefetch offset operand "f" for 'Zicbop' extension is moved
to "Wif" because of its special handling (and allocating single character
"f" for this operand type seemed too much).
Current expected allocation guideline is as follows:
1. 'W'
2. The most closely related single-letter extension in lowercase
(strongly recommended but not mandatory)
3. Identify operand type
The author currently plans to allocate following three-character operand
types (for operands including instructions from unratified extensions).
1. "Wif" ('Zicbop': fetch offset)
2. "Wfv" (unratified 'Zfa': value operand from FLI.[HSDQ] instructions)
3. "Wfm" / "WfM"
'Zfh', 'F', 'D', 'Q': rounding modes "m" with special handling
solely for widening conversion instructions.
gas/ChangeLog:
* config/tc-riscv.c (validate_riscv_insn, riscv_ip): Move from
"f" to "Wif".
opcodes/ChangeLog:
* riscv-dis.c (print_insn_args): Move from "f" to "Wif".
* riscv-opc.c (riscv_opcodes): Reflect new operand type.
Since we have no insn suffix and it's also not realistic to infer
immediate size from the size of other (typically register) operands
(like optimize_imm() does), and since we also don't have a template
telling us permitted size(s), a new syntax construct is introduced to
allow size (and signedness) specification. In the absence of such, the
size is inferred from significant bits (which obviously may yield
inconsistent results at least for effectively negative values, depending
on whether BFD64 is enabled), and only if supplied expressions can be
evaluated at parsing time. Being explicit is generally recommended to
users.
Size specification is permitted at bit granularity, but of course the
eventually emitted immediate values will be padded up to 8-, 16-, 32-,
or 64-bit fields.
In particular the scaling factor cannot always be determined from pre-
existing operand attributes. Introduce a new {:d<N>} vector operand
syntax extension, restricted to .insn only, to allow specifying this in
(at least) otherwise ambiguous cases.
Deal with register and memory operands; immediate operands will follow
later, as will the handling of EVEX embedded broadcast and EVEX Disp8
scaling.
Note that because we can't really know how to encode their use, %cr8 and
up cannot be used with .insn outside of 64-bit mode. Users would need to
specify an explicit LOCK prefix in combination with %cr0 etc.
So called "short form" encoding is specified by a trailing "+r", whereas
a possible extension opcode is specified by the usual "/<digit>". Take
these off the expression before handing it to get_absolute_expression().
Note that on targets where / starts a comment, --divide needs passing to
gas in order to make use of the extension opcode functionality.
All encoding spaces can be used this way; there's a certain risk that
the bits presently reserved could be used for other purposes down the
road, but people using .insn are expected to know what they're doing
anyway. Plus this way there's at least _some_ way to have those bits
set.
For now this will only allow operand-less insns to be encoded this way.
This patch adds the RPRFM (range prefetch) instruction.
It was introduced as part of SME2, but it belongs to the
prefetch hint space and so doesn't require any specific
ISA flags.
The aarch64_rprfmop_array initialiser (deliberately) only
fills in the leading non-null elements.