More of a DWARF-generation non-regression test; fixed on the GCC side
with 2024-06-03 "Implement wrap-around arithmetics in DWARF
expressions" (f3d6d60d2ae).
Approved-By: Tom Tromey <tom@tromey.com>
RISC-V Profiles document defines number of "extensions" that indicate
certain platform properties/capabilities just like 'Zkt' extension from the
RISC-V cryptography extensions.
This commit defines 20 platform property/capability extensions as defined
in the RISC-V Profiles documentation.
The only exception: 'Ssstateen' extension is defined separately because it
defines a subset (supervisor/hypervisor view) of the 'Smstateen' extension.
This is based on the ratified version of RISC-V Profiles:
<https://github.com/riscv/riscv-profiles/releases/tag/v1.0>
[Definition]
"Main memory regions":
Main memory regions (in contrast to I/O or vacant memory regions) with
both the cacheability and coherence PMAs.
[New Unprivileged Extensions]
1. 'Ziccif'
"Main memory regions" support instruction fetch and any instruction
fetches of naturally aligned power-of-2 sizes up to min(ILEN, XLEN)
are atomic.
2. 'Ziccrse'
"Main memory regions" provide the eventual success guarantee for
LR/SC sequence (RsrvEventual).
3. 'Ziccamoa'
"Main memory regions" support all currently-defined AMO operations
including swap, logical and arithmetic operations (AMOArithmetic).
4. 'Za64rs'
For LR/SC instructions, reservation sets are contiguous, naturally
aligned and at most 64-bytes in size.
5. 'Za128rs'
Likewise, but reservation sets are at most 128-bytes in size.
6. 'Zicclsm'
Misaligned loads / stores to "main memory regions" are supported.
Those include both regular scalar and vector accesses but does not
include AMOs and other specialized forms of memory accesses.
7. 'Zic64b'
Cache blocks are (exactly) 64-bytes in size and naturally aligned.
[New Privileged Extensions]
1. 'Svbare'
"satp" mode Bare is supported.
2. 'Svade'
Page-fault exceptions are raised when a page is accessed when A bit is
clear, or written when D bit is clear.
3. 'Ssccptr'
"Main memory regions" support hardware page-table reads.
4. 'Sstvecd'
"stvec" mode Direct is supported. When "stvec" mode is Direct,
"stvec.BASE" is capable of holding any valid 4-byte aligned address.
5. 'Sstvala'
"stval" is always written with a nonzero value whenever possible as
specified in the Privileged Architecture documentation
(version 20211203: see section 4.1.9).
6. 'Sscounterenw'
For any "hpmcounter" that is not read-only zero, the corresponding bit
in "scounteren" is writable.
7. 'Ssu64xl'
"sstatus.UXL" is capable of holding the value 0b10
(UXLEN==64 is supported).
8. 'Shcounterenw'
Similar to 'Sscounterenw' but the same rule applies to "hcounteren".
9. 'Shvstvala'
Similar to 'Sstvala' but the same rule applies to "vstval".
10. 'Shtvala'
"htval" is written with the faulting guest physical address as long as
permitted by the ISA (a bit similar to 'Sstvala' and 'Shvstvala').
11. 'Shvstvecd'
Similar to 'Sstvecd' but the same rule applies to "vstvec".
12. 'Shvsatpa'
All translation modes supported in "satp" are also supported in "vsatp".
13. 'Shgatpa'
For each supported virtual memory scheme SvNN supported in "satp", the
corresponding "hgatp" SvNNx4 mode is supported. The "hgatp" mode Bare
is also supported.
[Implications]
(Due to reservation set size constraints)
- 'Za64rs' -> 'Za128rs'
(Due to the fact that a privileged "extension" directly refers a CSR)
- 'Svbare' -> 'Zicsr'
- 'Sstvecd' -> 'Zicsr'
- 'Sstvala' -> 'Zicsr'
- 'Sscounterenw' -> 'Zicsr'
- 'Ssu64xl' -> 'Zicsr'
(Due to the fact that a privileged "extension" indirectly depends on CSRs)
- 'Svade' -> 'Zicsr'
(Due to the fact that a privileged "extension" is a hypervisor property)
- 'Shcounterenw' -> 'H'
- 'Shvstvala' -> 'H'
- 'Shtvala' -> 'H'
- 'Shvstvecd' -> 'H'
- 'Shvsatpa' -> 'H'
- 'Shgatpa' -> 'H'
bfd/
* elfxx-riscv.c (riscv_implicit_subsets): Updated for property
and capability extensions.
(riscv_supported_std_z_ext): Added zic64b, ziccamoa, ziccif, zicclsm,
ziccrse, za64rs and za128rs extensions.
(riscv_supported_std_s_ext): Added shcounterenw, shgatpa, shtvala,
shvsatpa, shvstvala, shvstvecd, ssccptr, sscounterenw, sstvala,
sstvecd, ssu64xlm svade and svbare extensions.
gas/
* testsuite/gas/riscv/imply.d: Updated for property and capability
extensions.
* testsuite/gas/riscv/imply.s: Likewise.
* testsuite/gas/riscv/march-help.l: Likewse.
Fixes a failure on rx-elf where the standard data section isn't .data.
run_dump_test has machinery to translate .data in both options and
expected results for objdump, but not for readelf -x.
PR 31964
* testsuite/gas/all/base64.d: Dump .data with objdump. Run on
all targets.
Skip -z mark-plt tests, which are specific to glibc, on MUSL.
PR ld/31970
* ld/testsuite/ld-x86-64/x86-64.exp: Skip -z mark-plt tests on
MUSL.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Libsframe was missing AC_CANONICAL_TARGET, meaning that --target was
ignored. This could prevent libsframe.a to be installed in some cases,
the host fetching its canonical value while the target isn't. Both
having a different value, INSTALL_LIBBFD would be false.
There is no need to add a needed glibc version if the glibc base version
includes the needed glibc version.
PR ld/31966
* elflink.c (elf_link_add_glibc_verneed): Add glibc_minor_base.
Skip if the glibc base version includes the needed glibc version.
(_bfd_elf_link_add_glibc_version_dependency): Initialize
glibc_minor_base to INT_MAX and pass it to
elf_link_add_glibc_verneed.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
gprofng/ChangeLog
2024-07-07 Vladimir Mezentsev <vladimir.mezentsev@oracle.com>.
* common/hwc_cpus.h: New constant for Intel Ice Lake processor.
* common/hwcdrv.c: Add a new argument to hwcfuncs_get_x86_eventsel.
Set config1 in perf_event_attr. Remove the use of memset.
* common/core_pcbe.c (core_pcbe_get_eventnum): Return 0.
* common/hwcentry.h: Add config1.
* src/collctrl.cc (Coll_Ctrl::build_data_desc):Set config1.
* common/hwcfuncs.c (process_data_descriptor): Set config1.
* common/hwctable.c: Add the hwc table for Intel Ice Lake processor.
* src/hwc_intel_icelake.h: New file.
Add an appendix to provide a rough outline to show how to generate stack
traces using the SFrame format. Such content should hopefully aid the
reader assimmilate the information in the specification.
libsframe/
* doc/sframe-spec.texi: Add new appendix.
This also amends the incorrect comment:
offset3 (intrepreted as FP = CFA + offset2)
If RA tracking is enabled, the offset to recover FP is at the third
index. The SFrame format (V2) has assumption that if FP is saved on
stack, RA must have been saved as well. This is true for the currently
supported arch Aarch64. For AMD64, RA tracking per SFrame FRE is not
necessary.
In future, when extending support for more architectures, this will
likely need to be revisited.
include/
* sframe.h: Make the comments clearer by enumerating what
happens per-ABI.
The recipe to interpret the SFrame FRE stack offsets is
ABI/arch-specific.
Although, there is other information in the specification that is
ABI-specific (like pauth_key usage in AArch64), those pieces of
information are now assimmilated in the SFrame specification in a way
that it is fairly difficult to carve then out into a ABI/arch-specific
section without confusing the readers.
For future though, the specification must strive to keep the generic
parts and ABI/arch-specific parts clearly laid out in separate sections.
libsframe/
* doc/sframe-spec.texi: Reorder and adapt the contents.
Add wrapper_symbol to bfd_link_hash_entry and set it to true for wrapper
symbol. Set wrap_status to wrapper if wrapper_symbol is true in LTO.
Note: Calling unwrap_hash_lookup to check for the wrapper symbol works
only when there is a definition for the wrapped symbol since references
to the wrapped symbol have been redirected to the wrapper symbol.
bfd/
PR ld/31956
* linker.c (bfd_wrapped_link_hash_lookup): Set wrapper_symbol
for wrapper symbol.
include/
PR ld/31956
* bfdlink.h (bfd_link_hash_entry): Add wrapper_symbol.
ld/
PR ld/31956
* plugin.c (get_symbols): Set wrap_status to wrapper if
wrapper_symbol is set.
* testsuite/ld-plugin/lto.exp: Run PR ld/31956 tests.
* testsuite/ld-plugin/pr31956a.c: New file.
* testsuite/ld-plugin/pr31956b.c: Likewise.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
This patch adds support for followign SVE2p1 instruction, spec is available here [1].
1. PMOV (to vector)
2. PMOV (to predicate)
Both pmov (to vector) and pmov (to predicate) have destination scalable vector
register and source scalable vector register respectively as an operand with no
suffix and optional index. To handle this case we have added 8 new operands in
this patch.
AARCH64_OPND_SVE_Zn0_INDEX, /* Zn[index], bits [9:5]. */
AARCH64_OPND_SVE_Zn1_17_INDEX, /* Zn[index], bits [9:5,17]. */
AARCH64_OPND_SVE_Zn2_18_INDEX, /* Zn[index], bits [9:5,18:17]. */
AARCH64_OPND_SVE_Zn3_22_INDEX, /* Zn[index], bits [9:5,18:17,22]. */
AARCH64_OPND_SVE_Zd0_INDEX, /* Zn[index], bits [4:0]. */
AARCH64_OPND_SVE_Zd1_17_INDEX, /* Zn[index], bits [4:0,17]. */
AARCH64_OPND_SVE_Zd2_18_INDEX, /* Zn[index], bits [4:0,18:17]. */
AARCH64_OPND_SVE_Zd3_22_INDEX, /* Zn[index], bits [4:0,18:17,22]. */
Since the index of the <Zd> operand is optional, the index part is
dropped in disassembly in both the cases of "no index" or "zero index".
As per spec: PMOV <Zd>{[<imm>]}, <Pn>.D
PMOV <Pn>.D, <Zd>{[<imm>]}
Example1:
Assembly: pmov z5[0], p6.d
Disassembly: pmov z5, p6.d
Assembly: pmov z5, p6.d
Disassembly: pmov z5, p6.d
Example2:
Assembly: pmov p4.b, z5[0]
Disassembly: pmov p4.b, z5
Assembly: pmov p4.b, z5
Disassembly: pmov p4.b, z5
[1]: https://developer.arm.com/documentation/ddi0602/2024-03/SVE-Instructions?lang=en
This patch started life as a relatively simple change to fix some
unimportant objcopy memory leaks, but expanded into a larger patch
when I was annoyed by the awkwardness of passing data when using
bfd_map_over_sections. A simple loop over sections is much more
convenient, and we really don't need the abstraction layer. Sections
in a list isn't going to disappear any time soon.
The patch also removes use of the global "status" variable by all but
the top-level functions called from main.
* objcopy.c (filter_symbols): Return success as a bool. Pass
symcount as a pointer, updated on return.
(merge_gnu_build_notes): Similarly return a bool and add newsize
param with updated smaller section size.
(setup_bfd_headers): Return bool success rather than setting
"status" on failure.
(setup_section): Likewise.
(copy_relocations_in_section, copy_section): Likewise, and adjust
params.
(mark_symbols_used_in_relocations): Likewise, and free memory
on failure path. Don't call bfd_fatal.
(get_sections): Delete function.
(copy_object): Don't use bfd_map_over_sections, instead use a
loop allowing easy detection of failure status. Free memory on
error paths.
(copy_archive): Return bool success rather than setting "status"
on failure.
(copy_file): Set "status" here.
* testsuite/binutils-all/strip-13.d: Adjust to suit.
AArch64 defines new registers for the feature step2 (Enhanced Software Step
Extension). step2 is an Armv9.5-A feature.
This patch also adds relevant tests. Regression tested on aarch64-none-elf,
and no regression found.
AArch64 defines new registers for the feature spmu2 (System Performance
Monitors Extension version 2). spmu2 is an Armv9.5-A feature.
This patch also adds relevant tests. Regression tested on aarch64-none-elf,
and no regression found.
AArch64 defines new registers for the feature e3dse (Delegated SError
exceptions for EL3): vdisr_el3 and vdisr_el3. e3dse is an Armv9.5-A
feature.
This patch also adds relevant tests. Regression tested on aarch64-none-elf,
and no regression found.
Added the restriction in assemble for APX TLS IE that the destination
can only be a register.
gas/
* config/tc-i386.c (md_assemble): Added stricter restrictions
for APX TLS IE.
As of 27b33966b1 ("RISC-V: disallow x0 with certain macro-insns") the
.match_func field may be NULL for entries used for assembly only, which
is the case for the entire table. With .match and .mask both zero the
function would only ever succeed anyway. Save almost a hundred base
relocations in the final executable by using NULL instead.
Most tests are ported from AArch64.
The relr-addend test is added to make sure the addend (link-time address)
is correctly written into the relocated section. Doing so is not
strictly needed for RELA, but strictly needed for RELR).
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
The logic is same as a71d876801 ("aarch64: Add DT_RELR support").
As LoongArch does not have -z dynamic-undefined-weak, we don't need to
consider UNDEFWEAK_NO_DYNAMIC_RELOC.
The linker relaxation adds another layer of complexity. When we delete
bytes in a section during relaxation, we need to fix up the offset in
the to-be-packed relative relocations against this section.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
On LoongArch there is no reason to treat STV_PROTECTED STT_FUNC symbols
as preemptible. See the comment above LARCH_REF_LOCAL for detailed
explanation.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
With a simple test case:
.globl ifunc
.globl ifunc_hidden
.hidden ifunc_hidden
.type ifunc, %gnu_indirect_function
.type ifunc_hidden, %gnu_indirect_function
.text
.align 2
ifunc: ret
ifunc_hidden: ret
test:
bl ifunc
bl ifunc_hidden
"ld -shared" produces a shared object with one R_LARCH_NONE (instead of
R_LARCH_JUMP_SLOT as we expect) to relocate the GOT entry of "ifunc".
It's because the indices in .plt and .rela.plt mismatches for
STV_DEFAULT STT_IFUNC symbols when another PLT entry exists for a
STV_HIDDEN STT_IFUNC symbol, and such a mismatch breaks the logic of
loongarch_elf_finish_dynamic_symbol. Fix the issue by reordering .plt
so the indices no longer mismatch.
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
We were converting R_LARCH_32 to R_LARCH_RELATIVE for ELFCLASS64:
$ cat t.s
.data
x:
.4byte x
.4byte 0xdeadbeef
$ as/as-new t.s -o t.o
$ ld/ld-new -shared t.o
$ objdump -R
a.out: file format elf64-loongarch
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
00000000000001a8 R_LARCH_RELATIVE *ABS*+0x00000000000001a8
But this is just wrong: at runtime the dynamic linker will run
*(uintptr *)&x += load_address, clobbering the next 4 bytes of data
("0xdeadbeef" in the example).
If we keep the R_LARCH_32 reloc as-is in ELFCLASS64, it'll be rejected
by the Glibc dynamic linker anyway. And it does not make too much sense
to modify Glibc to support it. So we can just reject it like x86_64:
relocation R_X86_64_32 against `.data' can not be used when making a
shared object; recompile with -fPIC
or RISC-V:
relocation R_RISCV_32 against non-absolute symbol `a local symbol'
can not be used in RV64 when making a shared object
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
In commit dff565fcca, the fixups
for PCREL_LO12_I and PCREL_LO12_S were mixed, so the "IMM"
field were applied to incorrect position, this caused incorrect
src registers to be encoded.
gas/
* config/tc-riscv.c (md_apply_fix): Fix PCREL_LO12_S issue.
* testsuite/gas/riscv/ixup-local.s: Updated for PCREL_LO12_S cases.
* testsuite/gas/riscv/fixup-local-relax.d: Likewise.
* testsuite/gas/riscv/fixup-local-norelax.d: Likewise.
Signed-off-by: Jianwei Sun <sunny.sun@corelabtech.com>
When the same address across different segments (sections) needs to be
recorded, it will overwrite the slot, leading to a memory leak. To ensure
uniqueness, the segment (section) ID needs to be included in the hash key
calculation.
gas/
* config/tc-riscv.c (riscv_pcrel_hi_fixup): New "const asection *sec".
(riscv_pcrel_fixup_hash): make sec->id and e->adrsess as the
hash key.
(riscv_pcrel_fixup_eq): Check sec->id at first.
(riscv_record_pcrel_fixup): New member "sec".
(md_apply_fix) <case BFD_RELOC_RISCV_PCREL_HI20>: Likewise.
(md_apply_fix) <case BFD_RELOC_RISCV_PCREL_LO12_I>: Likewise.
The encoding was previously not taking into account that the Quad vector
registers were being encoded using their Q-register numbers rather than their
D-register equivalent (multiply by 2).
gas/
* config/tc-arm.c (do_neon_cvttb_1): Use Q-register vector number
rather than their D-register equivalent.
gas/testsuite/
* gas/arm/mve-vcvt-3.d: Correct expected values in test.
Verify all architectures participating in SFrame generation do define
the mandatory SFrame return address (RA) tracking predicate function
sframe_ra_tracking_p. Do so by explicitly not testing for the macro
SFRAME_FRE_RA_TRACKING as otherwise required.
Verify that architectures not using SFrame RA tracking specify a valid
fixed RA offset.
gas/
* gen-sframe.c (output_sframe_internal): Validate SFrame
RA tracking and fixed RA offset.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>
The existence of the macro SFRAME_FRE_RA_TRACKING only ensures the
existence of the macro SFRAME_CFA_RA_REG and the predicate function
sframe_ra_tracking_p. It does not indicate whether SFrame RA tracking
is actually used.
Test the return value of the SFrame RA tracking predicate function
sframe_ra_tracking_p to determine whether RA tracking is used.
This aligns the logic in functions get_fre_num_offsets and
output_sframe_row_entry to the one used in all other places.
gas/
* gen-sframe.c (get_fre_num_offsets, output_sframe_row_entry):
Test predicate to determine whether SFrame RA tracking is used.
Signed-off-by: Jens Remus <jremus@linux.ibm.com>