glibc

mirror of git://sourceware.org/git/glibc.git synced 2024-11-21 01:12:26 +08:00

Author	SHA1	Message	Date
Vladislav Khmelevsky	691f70b84a	elf: Fix rtld-audit trampoline for aarch64 This patch fixes two problems with audit: 1. The DL_OFFSET_RV_VPCS offset was mixed up with DL_OFFSET_RG_VPCS, resulting in x2 register value nulling in RG structure. 2. We need to preserve the x8 register before function call, but don't have to save it's new value and restore it before return. Anyway the final restore was using OFFSET_RV instead of OFFSET_RG value which is wrong (althoug doesn't affect anything). Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `eb4181e9f4`)	2022-11-22 10:34:14 -03:00
Ben Woodard	b118bce87a	elf: Fix runtime linker auditing on aarch64 (BZ #26643 ) The rtld audit support show two problems on aarch64: 1. _dl_runtime_resolve does not preserve x8, the indirect result location register, which might generate wrong result calls depending of the function signature. 2. The NEON Q registers pushed onto the stack by _dl_runtime_resolve were twice the size of D registers extracted from the stack frame by _dl_runtime_profile. While 2. might result in wrong information passed on the PLT tracing, 1. generates wrong runtime behaviour. The aarch64 rtld audit support is changed to: * Both La_aarch64_regs and La_aarch64_retval are expanded to include both x8 and the full sized NEON V registers, as defined by the ABI. * dl_runtime_profile needed to extract registers saved by _dl_runtime_resolve and put them into the new correctly sized La_aarch64_regs structure. * The LAV_CURRENT check is change to only accept new audit modules to avoid the undefined behavior of not save/restore x8. * Different than other architectures, audit modules older than LAV_CURRENT are rejected (both La_aarch64_regs and La_aarch64_retval changed their layout and there are no requirements to support multiple audit interface with the inherent aarch64 issues). * A new field is also reserved on both La_aarch64_regs and La_aarch64_retval to support variant pcs symbols. Similar to x86, a new La_aarch64_vector type to represent the NEON register is added on the La_aarch64_regs (so each type can be accessed directly). Since LAV_CURRENT was already bumped to support bind-now, there is no need to increase it again. Checked on aarch64-linux-gnu. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com> (cherry picked from commit `ce9a68c57c`) Resolved conflicts: NEWS elf/rtld.c	2022-04-12 13:33:10 -04:00
Adhemerval Zanella	a8e211daea	elf: Add _dl_audit_pltexit It consolidates the code required to call la_pltexit audit callback. Checked on x86_64-linux-gnu, i686-linux-gnu, and aarch64-linux-gnu. Reviewed-by: Florian Weimer <fweimer@redhat.com> (cherry picked from commit `8c0664e2b8`) Resolved conflicts: sysdeps/hppa/dl-runtime.c	2022-04-08 14:18:12 -04:00
Adhemerval Zanella	b868b45f67	elf: Fix dynamic-link.h usage on rtld.c The `4af6982e4c` fix does not fully handle RTLD_BOOTSTRAP usage on rtld.c due two issues: 1. RTLD_BOOTSTRAP is also used on dl-machine.h on various architectures and it changes the semantics of various machine relocation functions. 2. The elf_get_dynamic_info() change was done sideways, previously to `490e6c62aa` get-dynamic-info.h was included by the first dynamic-link.h include without RTLD_BOOTSTRAP being defined. It means that the code within elf_get_dynamic_info() that uses RTLD_BOOTSTRAP is in fact unused. To fix 1. this patch now includes dynamic-link.h only once with RTLD_BOOTSTRAP defined. The ELF_DYNAMIC_RELOCATE call will now have the relocation fnctions with the expected semantics for the loader. And to fix 2. part of `4af6982e4c` is reverted (the check argument elf_get_dynamic_info() is not required) and the RTLD_BOOTSTRAP pieces are removed. To reorganize the includes the static TLS definition is moved to its own header to avoid a circular dependency (it is defined on dynamic-link.h and dl-machine.h requires it at same time other dynamic-link.h definition requires dl-machine.h defitions). Also ELF_MACHINE_NO_REL, ELF_MACHINE_NO_RELA, and ELF_MACHINE_PLT_REL are moved to its own header. Only ancient ABIs need special values (arm, i386, and mips), so a generic one is used as default. The powerpc Elf64_FuncDesc is also moved to its own header, since csu code required its definition (which would require either include elf/ folder or add a full path with elf/). Checked on x86_64, i686, aarch64, armhf, powerpc64, powerpc32, and powerpc64le. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com> (cherry picked from commit `d6d89608ac`) Resolved conflicts: elf/rtld.c	2022-04-08 14:18:11 -04:00
Fangrui Song	b19de59d62	elf: Avoid nested functions in the loader [BZ #27220 ] dynamic-link.h is included more than once in some elf/ files (rtld.c, dl-conflict.c, dl-reloc.c, dl-reloc-static-pie.c) and uses GCC nested functions. This harms readability and the nested functions usage is the biggest obstacle prevents Clang build (Clang doesn't support GCC nested functions). The key idea for unnesting is to add extra parameters (struct link_map and struct r_scope_elm []) to RESOLVE_MAP, ELF_MACHINE_BEFORE_RTLD_RELOC, ELF_DYNAMIC_RELOCATE, elf_machine_rel[a], elf_machine_lazy_rel, and elf_machine_runtime_setup. (This is inspired by Stan Shebs' ppc64/x86-64 implementation in the google/grte/v5-2.27/master which uses mixed extra parameters and static variables.) Future simplification: * If mips elf_machine_runtime_setup no longer needs RESOLVE_GOTSYM, elf_machine_runtime_setup can drop the `scope` parameter. * If TLSDESC no longer need to be in elf_machine_lazy_rel, elf_machine_lazy_rel can drop the `scope` parameter. Tested on aarch64, i386, x86-64, powerpc64le, powerpc64, powerpc32, sparc64, sparcv9, s390x, s390, hppa, ia64, armhf, alpha, and mips64. In addition, tested build-many-glibcs.py with {arc,csky,microblaze,nios2}-linux-gnu and riscv64-linux-gnu-rv64imafdc-lp64d. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> (cherry picked from commit `490e6c62aa`)	2022-04-08 14:18:11 -04:00
Wilco Dijkstra	217b84127b	AArch64: Check for SVE in ifuncs [BZ #28744 ] Add a check for SVE in the A64FX ifuncs for memcpy, memset and memmove. This fixes BZ #28744. (cherry picked from commit `e5fa62b8db`)	2022-01-07 10:10:27 +01:00
Siddhesh Poyarekar	b5bd5bfe88	glibc.malloc.check: Wean away from malloc hooks The malloc-check debugging feature is tightly integrated into glibc malloc, so thanks to an idea from Florian Weimer, much of the malloc implementation has been moved into libc_malloc_debug.so to support malloc-check. Due to this, glibc malloc and malloc-check can no longer work together; they use altogether different (but identical) structures for heap management. This should not make a difference though since the malloc check hook is not disabled anywhere. malloc_set_state does, but it does so early enough that it shouldn't cause any problems. The malloc check tunable is now in the debug DSO and has no effect when the DSO is not preloaded. Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com>	2021-07-22 18:38:08 +05:30
Wilco Dijkstra	6a34c928c2	AArch64: Add hp-timing.h Add hp-timing.h using the cntvct_el0 counter. Return timing in nanoseconds so it is fully compatible with generic hp-timing. Don't set HP_TIMING_INLINE in the dynamic linker since it adds unnecessary overheads and some ancient kernels may not handle emulating cntcvt correctly. Currently cntvct_el0 is only used for timing in the benchtests. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2021-07-01 15:42:05 +01:00
Wilco Dijkstra	252cad02d4	AArch64: Improve strnlen performance Optimize strnlen by avoiding UMINV which is slow on most cores. On Neoverse N1 large strings are 1.8x faster than the current version, and bench-strnlen is 50% faster overall. This version is MTE compatible. Reviewed-by: Szabolcs Nagy <szabolcs.nagy@arm.com>	2021-07-01 15:32:36 +01:00
H.J. Lu	3213ed770c	Update math: redirect roundeven function Redirect target specific roundeven functions for aarch64, ldbl-128ibm and riscv.	2021-06-27 07:56:57 -07:00
Wilco Dijkstra	6a86bc0992	AArch64: Add support for roundeven[f] Add inline assembler for the roundeven functions. Passes GLIBC regression. Note GCC does not inline the builtin (PR100966), so this cannot be used for now.	2021-06-08 13:33:09 +01:00
Naohiro Tamura	4f26956d5b	aarch64: Added optimized memset for A64FX This patch optimizes the performance of memset for A64FX [1] which implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB cache per NUMA node. The performance optimization makes use of Scalable Vector Register with several techniques such as loop unrolling, memory access alignment, cache zero fill and prefetch. SVE assembler code for memset is implemented as Vector Length Agnostic code so theoretically it can be run on any SOC which supports ARMv8-A SVE standard. We confirmed that all testcases have been passed by running 'make check' and 'make xcheck' not only on A64FX but also on ThunderX2. And also we confirmed that the SVE 512 bit vector register performance is roughly 4 times better than Advanced SIMD 128 bit register and 8 times better than scalar 64 bit register by running 'make bench'. [1] https://github.com/fujitsu/A64FX Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com> Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>	2021-05-27 09:47:53 +01:00
Naohiro Tamura	fa527f345c	aarch64: Added optimized memcpy and memmove for A64FX This patch optimizes the performance of memcpy/memmove for A64FX [1] which implements ARMv8-A SVE and has L1 64KB cache per core and L2 8MB cache per NUMA node. The performance optimization makes use of Scalable Vector Register with several techniques such as loop unrolling, memory access alignment, cache zero fill, and software pipelining. SVE assembler code for memcpy/memmove is implemented as Vector Length Agnostic code so theoretically it can be run on any SOC which supports ARMv8-A SVE standard. We confirmed that all testcases have been passed by running 'make check' and 'make xcheck' not only on A64FX but also on ThunderX2. And also we confirmed that the SVE 512 bit vector register performance is roughly 4 times better than Advanced SIMD 128 bit register and 8 times better than scalar 64 bit register by running 'make bench'. [1] https://github.com/fujitsu/A64FX Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com> Reviewed-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>	2021-05-27 09:47:53 +01:00
Naohiro Tamura	bd4317fbd6	aarch64: define BTI_C and BTI_J macros as NOP unless HAVE_AARCH64_BTI This patch defines BTI_C and BTI_J macros conditionally for performance. If HAVE_AARCH64_BTI is true, BTI_C and BTI_J are defined as HINT instruction for ARMv8.5 BTI (Branch Target Identification). If HAVE_AARCH64_BTI is false, both BTI_C and BTI_J are defined as NOP.	2021-05-26 12:01:06 +01:00
Naohiro Tamura	77d175e14e	config: Added HAVE_AARCH64_SVE_ASM for aarch64 This patch checks if assembler supports '-march=armv8.2-a+sve' to generate SVE code or not, and then define HAVE_AARCH64_SVE_ASM macro.	2021-05-26 12:01:06 +01:00
Szabolcs Nagy	2208066603	elf: Remove lazy tlsdesc relocation related code Remove generic tlsdesc code related to lazy tlsdesc processing since lazy tlsdesc relocation is no longer supported. This includes removing GL(dl_load_lock) from _dl_make_tlsdesc_dynamic which is only called at load time when that lock is already held. Added a documentation comment too. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-04-21 14:35:53 +01:00
Szabolcs Nagy	e06e6554c3	aarch64: update libm test ulps Update after commit `43576de04a`.	2021-04-08 08:24:30 +01:00
Szabolcs Nagy	69499bb6ee	aarch64: free tlsdesc data on dlclose [BZ #27403 ] DL_UNMAP_IS_SPECIAL and DL_UNMAP were not defined. The definitions are now copied from arm, since the same is needed on aarch64. The cleanup of tlsdesc data is handled by the custom _dl_unmap. Fixes bug 27403.	2021-04-06 14:35:05 +01:00
Paul Zimmermann	9acda61d94	Fix the inaccuracy of j0f/j1f/y0f/y1f [BZ #14469 , #14470 , #14471 , #14472 ] For j0f/j1f/y0f/y1f, the largest error for all binary32 inputs is reduced to at most 9 ulps for all rounding modes. The new code is enabled only when there is a cancellation at the very end of the j0f/j1f/y0f/y1f computation, or for very large inputs, thus should not give any visible slowdown on average. Two different algorithms are used: * around the first 64 zeros of j0/j1/y0/y1, approximation polynomials of degree 3 are used, computed using the Sollya tool (https://www.sollya.org/) * for large inputs, an asymptotic formula from [1] is used [1] Fast and Accurate Bessel Function Computation, John Harrison, Proceedings of Arith 19, 2009. Inputs yielding the new largest errors are added to auto-libm-test-in, and ulps are regenerated for various targets (thanks Adhemerval Zanella). Tested on x86_64 with --disable-multi-arch and on powerpc64le-linux-gnu. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-04-02 06:15:48 +02:00
Szabolcs Nagy	1dc17ea8f8	aarch64: Optimize __libc_mtag_tag_zero_region This is a target hook for memory tagging, the original was a naive implementation. Uses the same algorithm as __libc_mtag_tag_region, but with instructions that also zero the memory. This was not benchmarked on real cpu, but expected to be faster than the naive implementation.	2021-03-26 11:03:06 +00:00
Szabolcs Nagy	23fd760add	aarch64: Optimize __libc_mtag_tag_region This is a target hook for memory tagging, the original was a naive implementation. The optimized version relies on "dc gva" to tag 64 bytes at a time for large allocations and optimizes small cases without adding too many branches. This was not benchmarked on real cpu, but expected to be faster than the naive implementation.	2021-03-26 11:03:06 +00:00
Szabolcs Nagy	383bc24028	aarch64: inline __libc_mtag_new_tag This is a common operation when heap tagging is enabled, so inline the instructions instead of using an extern call.	2021-03-26 11:03:06 +00:00
Szabolcs Nagy	40dc773f92	aarch64: inline __libc_mtag_address_get_tag This is a common operation when heap tagging is enabled, so inline the instruction instead of using an extern call. The .inst directive is used instead of the name of the instruction (or acle intrinsics) because malloc.c is not compiled for armv8.5-a+memtag architecture, runtime cpu support detection is used. Prototypes are removed from the comments as they were not always correct.	2021-03-26 11:03:06 +00:00
Szabolcs Nagy	c076a0bc69	malloc: Only support zeroing and not arbitrary memset with mtag The memset api is suboptimal and does not provide much benefit. Memory tagging only needs a zeroing memset (and only for memory that's sized and aligned to multiples of the tag granule), so change the internal api and the target hooks accordingly. This is to simplify the implementation of the target hook. Reviewed-by: DJ Delorie <dj@redhat.com>	2021-03-26 11:03:06 +00:00
Wilco Dijkstra	db3f7bb558	math: Remove slow paths from asin and acos [BZ #15267 ] This patch series removes all remaining slow paths and related code. First asin/acos, tan, atan, atan2 implementations are updated, and the final patch removes the unused mpa files, headers and probes. Passes buildmanyglibc. Remove slow paths from asin/acos. Add ULP annotations based on previous slow path checks (which are approximate). Update AArch64 and x86_64 libm-test-ulps. Reviewed-By: Paul Zimmermann <Paul.Zimmermann@inria.fr>	2021-03-11 14:26:36 +00:00
Szabolcs Nagy	9fb07fd4e1	aarch64: update ulps. For new test cases in commit `5a051454a9`	2021-03-01 12:29:42 +00:00
Florian Weimer	035c012e32	Reduce the statically linked startup code [BZ #23323 ] It turns out the startup code in csu/elf-init.c has a perfect pair of ROP gadgets (see Marco-Gisbert and Ripoll-Ripoll, "return-to-csu: A New Method to Bypass 64-bit Linux ASLR"). These functions are not needed in dynamically-linked binaries because DT_INIT/DT_INIT_ARRAY are already processed by the dynamic linker. However, the dynamic linker skipped the main program for some reason. For maximum backwards compatibility, this is not changed, and instead, the main map is consulted from __libc_start_main if the init function argument is a NULL pointer. For statically linked binaries, the old approach based on linker symbols is still used because there is nothing else available. A new symbol version __libc_start_main@@GLIBC_2.34 is introduced because new binaries running on an old libc would not run their ELF constructors, leading to difficult-to-debug issues.	2021-02-25 12:13:02 +01:00
Szabolcs Nagy	04c6a8073d	aarch64: Fix the list of tested IFUNC variants [BZ #26818 ] Some IFUNC variants are not compatible with BTI and MTE so don't set them as usable for testing and benchmarking on a BTI or MTE enabled system. As far as IFUNC selectors are concerned a system is BTI enabled if the cpu supports it and glibc was built with BTI branch protection. Most IFUNC variants are BTI compatible, but thunderx2 memcpy and memmove use a jump table with indirect jump, without a BTI j. Fixes bug 26818.	2021-01-25 16:15:54 +00:00
Szabolcs Nagy	c3c4a25e65	aarch64: Move and update the definition of MTE_ENABLED The hwcap value is now in linux 5.10 and in glibc bits/hwcap.h, so use that definition. Move the definition to init-arch.h so all ifunc selectors can use it and expose an "mte" shorthand for mte enabled runtime. For now we allow user code to enable tag checks and use PROT_MTE mappings without libc involvment, this is not guaranteed ABI, but can be useful for testing and debugging with MTE.	2021-01-25 15:35:43 +00:00
Shuo Wang	28f2ce2772	aarch64: revert memcpy optimze for kunpeng to avoid performance degradation In commit `863d775c48`, kunpeng920 is added to default memcpy version, however, there is performance degradation when the copy size is some large bytes, eg: 100k. This is the result, tested in glibc-2.28: before backport after backport Performance improvement memcpy_1k 0.005 0.005 0.00% memcpy_10k 0.032 0.029 10.34% memcpy_100k 0.356 0.429 -17.02% memcpy_1m 7.470 11.153 -33.02% This is the demo #include "stdio.h" #include "string.h" #include "stdlib.h" char a[10241024] = {12}; char b[10241024] = {13}; int main(int argc, char argv[]) { int i = atoi(argv[1]); int j; int size = atoi(argv[2]); for (j = 0; j < i; j++) memcpy(b, a, size1024); return 0; } # gcc -g -O0 memcpy.c -o memcpy # time taskset -c 10 ./memcpy 100000 1024 Co-authored-by: liqingqing <liqingqing3@huawei.com>	2021-01-21 16:44:15 +00:00
Szabolcs Nagy	374cef32ac	configure: Check for static PIE support Add SUPPORT_STATIC_PIE that targets can define if they support static PIE. This requires PI_STATIC_AND_HIDDEN support and various linker features as described in commit `9d7a3741c9` Add --enable-static-pie configure option to build static PIE [BZ #19574] Currently defined on x86_64, i386 and aarch64 where static PIE is known to work. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-01-21 15:54:50 +00:00
Szabolcs Nagy	2f056e8a5d	aarch64: define PI_STATIC_AND_HIDDEN AArch64 always uses pc relative access to static and hidden object symbols, but the config setting was previously missing. This affects ld.so start up code.	2021-01-08 11:14:02 +00:00
Wilco Dijkstra	9e97f239ea	Remove dbl-64/wordsize-64 (part 2) Remove the wordsize-64 implementations by merging them into the main dbl-64 directory. The second patch just moves all wordsize-64 files and removes a few wordsize-64 uses in comments and Implies files. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2021-01-07 15:26:26 +00:00
Shuo Wang	f5082c7010	aarch64: push the set of rules before falling into slow path It is supposed to save the rules for the instructions before falling into slow path. Tested in glibc-2.28 before fixing: Thread 2 "xxxxxxx" hit Breakpoint 1, _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149 149 stp x1, x2, [sp, #-32]! Missing separate debuginfos, use: dnf debuginfo-install libgcc-7.3.0-20190804.h24.aarch64 (gdb) ni _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150 150 stp x3, x4, [sp, #16] (gdb) _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157 157 mrs x4, tpidr_el0 (gdb) 158 ldr PTR_REG (1), [x0,#TLSDESC_ARG] (gdb) 159 ldr PTR_REG (0), [x4,#TCBHEAD_DTV] (gdb) 160 ldr PTR_REG (3), [x1,#TLSDESC_GEN_COUNT] (gdb) 161 ldr PTR_REG (2), [x0,#DTV_COUNTER] (gdb) 162 cmp PTR_REG (3), PTR_REG (2) (gdb) 163 b.hi 2f (gdb) 165 ldp PTR_REG (2), PTR_REG (3), [x1,#TLSDESC_MODID] (gdb) 166 add PTR_REG (0), PTR_REG (0), PTR_REG (2), lsl #(PTR_LOG_SIZE + 1) (gdb) 167 ldr PTR_REG (0), [x0] /* Load val member of DTV entry. / (gdb) 168 cmp PTR_REG (0), #TLS_DTV_UNALLOCATED (gdb) 169 b.eq 2f (gdb) bt #0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:169 #1 0x0000ffffbe4fbb44 in OurFunction (threadId=4294967295) at /home/test/test_function.c:30 #2 0x0000000000400c08 in initaaa () at thread.c:58 #3 0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71 #4 0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486 #5 0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78 (gdb) ni _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:184 184 stp x29, x30, [sp,#-16NSAVEXREGPAIRS]! (gdb) bt #0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:184 #1 0x0000ffffbe4fbb44 in OurFunction (threadId=4294967295) at /home/test/test_function.c:30 #2 0x0000000000000000 in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) Co-authored-by: liqingqing <liqingqing3@huawei.com>	2021-01-05 09:25:19 +00:00
Shuo Wang	cd6274089f	aarch64: fix stack missing after sp is updated After sp is updated, the CFA offset should be set before next instruction. Tested in glibc-2.28: Thread 2 "xxxxxxx" hit Breakpoint 1, _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149 149 stp x1, x2, [sp, #-32]! Missing separate debuginfos, use: dnf debuginfo-install libgcc-7.3.0-20190804.h24.aarch64 (gdb) bt #0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:149 #1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184) at /home/test/test_function.c:30 #2 0x0000000000400c08 in initaaa () at thread.c:58 #3 0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71 #4 0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486 #5 0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78 (gdb) ni _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150 150 stp x3, x4, [sp, #16] (gdb) bt #0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:150 #1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184) at /home/test/test_function.c:30 #2 0x0000000000000000 in ?? () Backtrace stopped: previous frame identical to this frame (corrupt stack?) (gdb) ni _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157 157 mrs x4, tpidr_el0 (gdb) bt #0 _dl_tlsdesc_dynamic () at ../sysdeps/aarch64/dl-tlsdesc.S:157 #1 0x0000ffffbe4fbb44 in OurFunction (threadId=3194870184) at /home/test/test_function.c:30 #2 0x0000000000400c08 in initaaa () at thread.c:58 #3 0x0000000000400c50 in thread_proc (param=0x0) at thread.c:71 #4 0x0000ffffbf6918bc in start_thread (arg=0xfffffffff29f) at pthread_create.c:486 #5 0x0000ffffbf5669ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78 Signed-off-by: liqingqing <liqingqing3@huawei.com> Signed-off-by: Shuo Wang <wangshuo47@huawei.com>	2021-01-04 15:37:06 +00:00
Paul Eggert	2b778ceb40	Update copyright dates with scripts/update-copyrights I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: * pre-commit check failed ... remote: * error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master	2021-01-02 12:17:34 -08:00
Szabolcs Nagy	45b1e17e91	aarch64: use PTR_ARG and SIZE_ARG instead of DELOUSE DELOUSE was added to asm code to make them compatible with non-LP64 ABIs, but it is an unfortunate name and the code was not compatible with ABIs where pointer and size_t are different. Glibc currently only supports the LP64 ABI so these macros are not really needed or tested, but for now the name is changed to be more meaningful instead of removing them completely. Some DELOUSE macros were dropped: clone, strlen and strnlen used it unnecessarily. The out of tree ILP32 patches are currently not maintained and will likely need a rework to rebase them on top of the time64 changes.	2020-12-31 16:50:58 +00:00
Szabolcs Nagy	682cdd6e1a	aarch64: update ulps. For new test cases in commit `cad5ad81d2`	2020-12-21 16:40:34 +00:00
Richard Earnshaw	d27f0e5d88	aarch64: Add aarch64-specific files for memory tagging support This final patch provides the architecture-specific implementation of the memory-tagging support hooks for aarch64.	2020-12-21 15:25:25 +00:00
Szabolcs Nagy	4033f21eb2	aarch64: remove the strlen_asimd symbol This symbol is not in the implementation reserved namespace for static linking and it was never used: it seems it was mistakenly added in the orignal strlen_asimd commit `436e4d5b96`	2020-12-15 14:42:45 +00:00
Guillaume Gardet	d4136903a2	aarch64: fix static PIE start code for BTI [BZ #27068 ] A bti c was missing from rcrt1.o which made all -static-pie binaries fail at program startup on BTI enabled systems. Fixes bug 27068.	2020-12-15 13:48:45 +00:00
Szabolcs Nagy	cd543b5eb3	aarch64: Use mmap to add PROT_BTI instead of mprotect [BZ #26831 ] Re-mmap executable segments if possible instead of using mprotect to add PROT_BTI. This allows using BTI protection with security policies that prevent mprotect with PROT_EXEC. If the fd of the ELF module is not available because it was kernel mapped then mprotect is used and failures are ignored. To protect the main executable even when mprotect is filtered the linux kernel will have to be changed to add PROT_BTI to it. The delayed failure reporting is mainly needed because currently _dl_process_gnu_properties does not propagate failures such that the required cleanups happen. Using the link_map_machine struct for error propagation is not ideal, but this seemed to be the least intrusive solution. Fixes bug 26831. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-12-11 15:46:02 +00:00
Szabolcs Nagy	c00452d775	elf: Pass the fd to note processing To handle GNU property notes on aarch64 some segments need to be mmaped again, so the fd of the loaded ELF module is needed. When the fd is not available (kernel loaded modules), then -1 is passed. The fd is passed to both _dl_process_pt_gnu_property and _dl_process_pt_note for consistency. Target specific note processing functions are updated accordingly. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-12-11 15:45:37 +00:00
Szabolcs Nagy	8b8f616e6a	aarch64: align address for BTI protection [BZ #26988 ] Handle unaligned executable load segments (the bfd linker is not expected to produce such binaries, but other linkers may). Computing the mapping bounds follows _dl_map_object_from_fd more closely now. Fixes bug 26988. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-12-11 15:04:39 +00:00
Szabolcs Nagy	72739c79f6	aarch64: Fix missing BTI protection from dependencies [BZ #26926 ] The _dl_open_check and _rtld_main_check hooks are not called on the dependencies of a loaded module, so BTI protection was missed on every module other than the main executable and directly dlopened libraries. The fix just iterates over dependencies to enable BTI. Fixes bug 26926.	2020-12-11 14:52:13 +00:00
Florian Weimer	1daccf403b	nptl: Move stack list variables into _rtld_global Now __thread_gscope_wait (the function behind THREAD_GSCOPE_WAIT, formerly __wait_lookup_done) can be implemented directly in ld.so, eliminating the unprotected GL (dl_wait_lookup_done) function pointer. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-11-16 19:33:30 +01:00
Florian Weimer	5edf3d9fd6	aarch64: Add unwind information to _start (bug 26853) This adds CFI directives which communicate that the stack ends with this function. Fixes bug 26853.	2020-11-09 11:31:04 +01:00
Szabolcs Nagy	e156dabc76	aarch64: Add variant PCS lazy binding test [BZ #26798 ] This test fails without bug 26798 fixed because some integer registers likely get clobbered by lazy binding and variant PCS only allows x16 and x17 to be clobbered at call time. The test requires binutils 2.32.1 or newer for handling variant PCS symbols. SVE registers are not covered by this test, to avoid the complexity of handling multiple compile- and runtime feature support cases.	2020-11-02 09:39:24 +00:00
Szabolcs Nagy	558251bd87	aarch64: Fix DT_AARCH64_VARIANT_PCS handling [BZ #26798 ] The variant PCS support was ineffective because in the common case linkmap->l_mach.plt == 0 but then the symbol table flags were ignored and normal lazy binding was used instead of resolving the relocs early. (This was a misunderstanding about how GOT[1] is setup by the linker.) In practice this mainly affects SVE calls when the vector length is more than 128 bits, then the top bits of the argument registers get clobbered during lazy binding. Fixes bug 26798.	2020-11-02 09:39:24 +00:00
Wilco Dijkstra	e11ed9d2b4	AArch64: Use __memcpy_simd on Neoverse N2/V1 Add CPU detection of Neoverse N2 and Neoverse V1, and select __memcpy_simd as the memcpy/memmove ifunc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2020-10-14 14:27:50 +01:00

1 2 3 4 5 ...

326 Commits