glibc

mirror of git://sourceware.org/git/glibc.git synced 2025-04-06 14:10:30 +08:00

Author	SHA1	Message	Date
Szabolcs Nagy	27a873fa06	aarch64: use __alloc_gcs in makecontext	2024-10-14 13:16:03 +01:00
Szabolcs Nagy	9ecd8855cc	aarch64: Add GCS user-space allocation logic Allocate GCS based on the stack size, this can be used for coroutines (makecontext) and thread creation (if the kernel allows user allocated GCS).	2024-10-14 13:16:03 +01:00
Szabolcs Nagy	d21abb82be	aarch64: Process gnu properties in static exe Unlike for BTI, the kernel does not process GCS properties so update GL(dl_aarch64_gcs) before the GCS status is set.	2024-10-14 13:16:03 +01:00
Szabolcs Nagy	f5d24d883a	aarch64: Ignore GCS property of ld.so check_gcs is called for each dependency of a DSO, but the GNU property of the ld.so is not processed so ldso->l_mach.gcs may not be correct. Just assume ld.so is GCS compatible independently of the ELF marking.	2024-10-14 13:16:03 +01:00
Szabolcs Nagy	57d43b1fee	aarch64: Use l_searchlist.r_list for gcs Allows using the same function for static exe.	2024-10-14 13:16:03 +01:00
Szabolcs Nagy	0b33d30ae1	aarch64: Handle gcs marking	2024-10-14 13:16:03 +01:00
Szabolcs Nagy	a233eb0f70	aarch64: Use l_searchlist.r_list for bti Allows using the same function for static exe.	2024-10-14 13:16:03 +01:00
Szabolcs Nagy	a4dcf30215	aarch64: Add glibc.cpu.aarch64_gcs_policy policy sets how gcs tunable and gcs marking turns into gcs state: 0: state = tunable 1: state = marking ? tunable : (tunable && dlopen ? err : 0) 2: state = marking ? tunable : (tunable ? err : 0)	2024-10-14 13:16:02 +01:00
Szabolcs Nagy	8d51bf5658	aarch64: Mark objects with GCS property note	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	8013ecc85c	aarch64: Enable GCS in dynamic linked exe Use the dynamic linker start code to enable GCS in the dynamic linked case after _dl_start returns and before _dl_start_user which marks the point after which user code may run. Like in the static linked case this ensures that GCS is enabled on a top level stack frame.	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	1120769432	aarch64: Enable GCS in static linked exe Use the ARCH_SETUP_TLS hook to enable GCS in the static linked case. The system call must be inlined and then GCS is enabled on a top level stack frame that does not return and has no exception handlers above it.	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	6c973abacf	aarch64: Add glibc.cpu.aarch64_gcs tunable This tunable is for controlling the GCS status. It is the argument to the PR_SET_SHADOW_STACK_STATUS prctl, by default 0, so GCS is disabled. The status is stored into GL(dl_aarch64_gcs) early and only applied later, since enabling GCS is tricky: it must happen on a top level stack frame. (Using GL instead of GLRO because it may need updates depending on loaded libraries that happen after readonly protection is applied, however library marking based GCS setting is not yet implemented.)	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	d2768c779c	aarch64: Try to free the GCS of makecontext Free GCS after a makecontext start func returns and at thread exit, so assume makecontext cannot outlive the thread where it was created. This is an attempt to bound the lifetime of the GCS allocated for makecontext, but it is still possible to have significant GCS leaks, new GCS aware APIs could solve that, but that would not allow using GCS with existing code transparently.	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	491de01415	aarch64: Add GCS support for makecontext Changed the makecontext logic: previously the first setcontext jumped straight to the user callback function and the return address is set to __startcontext. This does not work when GCS is enabled as the integrity of the return address is protected, so instead the context is setup such that setcontext jumps to __startcontext which calls the user callback (passed in x20). The map_shadow_stack syscall is used to allocate a suitably sized GCS (which includes some reserved area to account for altstack signal handlers and otherwise supports maximum number of 16 byte aligned stack frames on the given stack) however the GCS is never freed as the lifetime of ucontext and related stack is user managed.	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	c03f6bdca2	aarch64: Mark swapcontext with indirect_return	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	46aa54af94	aarch64: Add GCS support for setcontext Userspace ucontext needs to store GCSPR, it does not have to be compatible with the kernel ucontext. For now we use the linux struct gcs_context layout but only use the gcspr field from it. Similar implementation to the longjmp code, supports switching GCS if the target GCS is capped, and unwinding a continuous GCS to a previous state.	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	4b2f9ca4e7	aarch64: Add GCS support to vfork	2024-10-10 13:40:35 +01:00
Szabolcs Nagy	839197fdeb	aarch64: Add GCS support to longjmp This implementation ensures that longjmp across different stacks works: it scans for GCS cap token and switches GCS if necessary then the target GCSPR is restored with a GCSPOPM loop once the current GCSPR is on the same GCS. This makes longjmp linear time in the number of jumped over stack frames when GCS is enabled.	2024-10-10 13:40:34 +01:00
Szabolcs Nagy	d813260d17	aarch64: Define jmp_buf offset for GCS The target specific internal __longjmp is called with a __jmp_buf argument which has its size exposed in the ABI. On aarch64 this has no space left, so GCSPR cannot be restored in longjmp in the usual way, which is needed for the Guarded Control Stack (GCS) extension. setjmp is implemented via __sigsetjmp which has a jmp_buf argument however it is also called with __pthread_unwind_buf_t argument cast to jmp_buf (in cancellation cleanup code built with -fno-exception). The two types, jmp_buf and __pthread_unwind_buf_t, have common bits beyond the __jmp_buf field and there is unused space there which we can use for saving GCSPR. For this to work some bits of those two generic types have to be reserved for target specific use and the generic code in glibc has to ensure that __longjmp is always called with a __jmp_buf that is embedded into one of those two types. Morally __longjmp should be changed to take jmp_buf as argument, but that is an intrusive change across targets. Note: longjmp is never called with __pthread_unwind_buf_t from user code, only the internal __libc_longjmp is called with that type and thus the two types could have separate longjmp implementations on a target. We don't rely on this now (but might in the future given that cancellation unwind does not need to restore GCSPR). Given the above this patch finds an unused slot for GCSPR. This placement is not exposed in the ABI so it may change in the future. This is also very target ABI specific so the generic types cannot be easily changed to clearly mark the reserved fields.	2024-10-10 13:40:34 +01:00
Szabolcs Nagy	f6d409de05	aarch64: Add asm helpers for GCS The Guarded Control Stack instructions can be present even if the hardware does not support the extension (runtime checked feature), so the asm code should be backward compatible with old assemblers.	2024-10-10 13:40:34 +01:00
Szabolcs Nagy	01d2e29c10	aarch64: Add HWCAP_GCS Use upper 32 bits of HWCAP.	2024-10-10 13:40:34 +01:00
Joseph Myers	0e8738a48c	Fix header guard in sysdeps/mach/hurd/x86_64/vm_param.h GCC mainline produces a -Wheader-guard error building for x86_64-gnu. Fix what seems to be incorrect macro naming in the #ifndef conditional. Tested with build-many-glibc.py for x86_64-gnu (GCC mainline). Message-ID: <fd800046-5ecb-ebd5-4df1-29d4eb3d5433@redhat.com>	2024-10-09 19:16:53 +02:00
Adhemerval Zanella	d40ac01cbb	stdlib: Make abort/_Exit AS-safe (BZ 26275) The recursive lock used on abort does not synchronize with a new process creation (either by fork-like interfaces or posix_spawn ones), nor is it reinitialized after fork(). Also, the SIGABRT unblock before raise() shows another race condition, where a fork or posix_spawn() call by another thread, just after the recursive lock release and before the SIGABRT signal, might create programs with a non-expected signal mask. With the default option (without POSIX_SPAWN_SETSIGDEF), the process can see SIG_DFL for SIGABRT, where it should be SIG_IGN. To fix the AS-safe, raise() does not change the process signal mask, and an AS-safe lock is used if a SIGABRT is installed or the process is blocked or ignored. With the signal mask change removal, there is no need to use a recursive lock. The lock is also taken on both _Fork() and posix_spawn(), to avoid the spawn process to see the abort handler as SIG_DFL. A read-write lock is used to avoid serialize _Fork and posix_spawn execution. Both sigaction (SIGABRT) and abort() requires to lock as writer (since both change the disposition). The fallback is also simplified: there is no need to use a loop of ABORT_INSTRUCTION after _exit() (if the syscall does not terminate the process, the system is broken). The proposed fix changes how setjmp works on a SIGABRT handler, where glibc does not save the signal mask. So usage like the below will now always abort. static volatile int chk_fail_ok; static jmp_buf chk_fail_buf; static void handler (int sig) { if (chk_fail_ok) { chk_fail_ok = 0; longjmp (chk_fail_buf, 1); } else _exit (127); } [...] signal (SIGABRT, handler); [....] chk_fail_ok = 1; if (! setjmp (chk_fail_buf)) { // Something that can call abort, like a failed fortify function. chk_fail_ok = 0; printf ("FAIL\n"); } Such cases will need to use sigsetjmp instead. 
The _dl_start_profile calls sigaction through _profil, and to avoid pulling abort() on loader the call is replaced with __libc_sigaction. Checked on x86_64-linux-gnu and aarch64-linux-gnu. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-10-08 14:40:12 -03:00
Adhemerval Zanella	55d33108c7	linux: Use GLRO(dl_vdso_time) on time The BZ#24967 fix (1bdda52fe92fd01b424c) missed the time for architectures that define USE_IFUNC_TIME. Although it is not an issue, since there is no pointer mangling, there is also no need to call dl_vdso_vsym since the vDSO setup was already done by the loader. Checked on x86_64-linux-gnu and i686-linux-gnu.	2024-10-08 13:28:21 -03:00
Adhemerval Zanella	02b195d30f	linux: Use GLRO(dl_vdso_gettimeofday) on gettimeofday The BZ#24967 fix (1bdda52fe92fd01b424c) missed the gettimeofday for architectures that define USE_IFUNC_GETTIMEOFDAY. Although it is not an issue, since there is no pointer mangling, there is also no need to call dl_vdso_vsym since the vDSO setup was already done by the loader. Checked on x86_64-linux-gnu and i686-linux-gnu.	2024-10-08 13:28:21 -03:00
Stefan Liebler	7949f552cb	S390: Don't use r11 for cu-instructions as used as frame-pointer. [BZ# 32192] Building the s390 specific iconv modules - utf16-utf32-z9.c, utf8-utf32-z9.c and utf8-utf16-z9.c - with -fno-omit-frame-pointer leads to a build error "error: %r11 cannot be used in 'asm' here" as r11 is needed as frame-pointer. The cuXY-instructions need two even-odd register pairs. Therefore the register pinning is used. This patch just uses a different register pair. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-10-08 10:13:02 +02:00
Carlos O'Donell	cae9944a6c	Fix whitespace related license issues. Several copies of the licenses in files contained whitespace related problems. Two cases are addressed here, the first is two spaces after a period which appears between "PURPOSE." and "See". The other is a space after the last forward slash in the URL. Both issues are corrected and the licenses now match the official textual description of the license (and the other license in the sources). Since these whitespaces changes do not alter the paragraph structure of the license, nor create new sentences, they do not change the license.	2024-10-07 18:08:16 -04:00
Bruno Haible	e67f8e6dbd	hurd: Add missing va_end call in fcntl implementation. [BZ #32234 ] * sysdeps/mach/hurd/fcntl.c (__libc_fcntl): Add va_end call in two code paths.	2024-10-03 20:18:29 +02:00
Andreas Schwab	a36814e145	riscv: align .preinit_array (bug 32228) The section contains an array of pointers, so it should be aligned to pointer size.	2024-10-02 13:04:30 +02:00
Adhemerval Zanella	5e8cfc5d62	linux: sparc: Fix clone for LEON/sparcv8 (BZ 31394) The sparc clone mitigation (faeaa3bc9f76030) added the use of flushw, which is not supported by LEON/sparcv8. As discussed on the libc-alpha, 'ta 3' is a working alternative [1]. [1] https://sourceware.org/pipermail/libc-alpha/2024-August/158905.html Checked with a build for sparcv8-linux-gnu targeting leon. Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-10-01 10:37:21 -03:00
Adhemerval Zanella	49c3682ce1	linux: sparc: Fix syscall_cancel for LEON LEON2/LEON3 are both sparcv8, which does not support branch hints (bne,pn) nor the return instruction. Checked with a build for sparcv8-linux-gnu targeting leon. I also checked some cancellation tests with qemu-system (targeting LEON3). Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>	2024-10-01 10:37:21 -03:00
Wilco Dijkstra	44fa9c1080	math: Improve layout of expf data GCC aligns global data to 16 bytes if their size is >= 16 bytes. This patch changes the exp2f_data struct slightly so that the fields are better aligned. As a result on targets that support them, load-pair instructions accessing poly_scaled and invln2_scaled are now 16-byte aligned. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-10-01 13:39:26 +01:00
Noah Goldstein	483443d321	x86/string: Fixup alignment of main loop in str{n}cmp-evex [BZ #32212 ] The loop should be aligned to 32-bytes so that it can ideally run out the DSB. This is particularly important on Skylake-Server where deficiencies in its DSB implementation make it prone to not being able to run loops out of the DSB. For example running strcmp-evex on 200Mb string: 32-byte aligned loop: - 43,399,578,766 idq.dsb_uops not 32-byte aligned loop: - 6,060,139,704 idq.dsb_uops This results in a 25% performance degradation for the non-aligned version. The fix is to just ensure the code layout is such that the loop is aligned. (Which was previously the case but was accidentally dropped in 84e7c46df). NB: The fix was actually 64-byte alignment. This is because 64-byte alignment generally produces more stable performance than 32-byte aligned code (cache line crosses can affect perf), so if we are going past 16-byte alignment, might as well go to 64. 64-byte alignment also matches most other functions we over-align, so it creates a common point of optimization. Times are reported as ratio of Time_With_Patch / Time_Without_Patch. Lower is better. The value reported is the geometric mean of the ratio across all tests in bench-strcmp and bench-strncmp. Note this patch is only attempting to improve the Skylake-Server strcmp for long strings. The rest of the numbers are only to test for regressions. Tigerlake Results Strings <= 512: strcmp : 1.026 strncmp: 0.949 Tigerlake Results Strings > 512: strcmp : 0.994 strncmp: 0.998 Skylake-Server Results Strings <= 512: strcmp : 0.945 strncmp: 0.943 Skylake-Server Results Strings > 512: strcmp : 0.778 strncmp: 1.000 The 2.6% regression on TGL-strcmp is due to slowdowns caused by changes in alignment of code handling small sizes (most on the page-cross logic). 
These should be safe to ignore because 1) We previously only 16-byte aligned the function so this behavior is not new and was essentially up to chance before this patch and 2) this type of alignment related regression on small sizes really only comes up in tight micro-benchmark loops and is unlikely to have any effect on real-world performance. Reviewed-by: H.J. Lu <hjl.tools@gmail.com>	2024-09-30 07:40:40 -07:00
Florian Weimer	b300078d97	Linux: Block signals around _Fork (bug 32215) This hides the inconsistent TCB state (missing robust mutex list) from signal handlers. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-09-28 09:44:25 +02:00
Andreas Schwab	5f62cf88c4	Fix missing randomness in __gen_tempname (bug 32214) Make sure to update the random value also if getrandom fails. Fixes: 686d542025 ("posix: Sync tempname with gnulib")	2024-09-26 11:45:44 +02:00
Pavel Kozlov	cc84cd389c	arc: Cleanup arcbe Remove the mention of arcbe ABI to avoid any mislead. ARC big endian ABI is no longer supported. Reviewed-by: Florian Weimer <fweimer@redhat.com>	2024-09-25 15:54:07 +01:00
Florian Weimer	4ff55d08df	arc: Remove HAVE_ARC_BE macro and disable big-endian port It is no longer needed, now that ARC is always little endian.	2024-09-25 11:25:22 +02:00
caiyinyu	255dc1e4ed	LoongArch: Undef __NR_fstat and __NR_newfstatat. In Linux 6.11, fstat and newfstatat are added back. To avoid the messy usage of the fstat, newfstatat, and statx system calls, we will continue using statx only in glibc, maintaining consistency with previous versions of the LoongArch-specific glibc implementation. Signed-off-by: caiyinyu <caiyinyu@loongson.cn> Reviewed-by: Xi Ruoyao <xry111@xry111.site> Suggested-by: Florian Weimer <fweimer@redhat.com>	2024-09-25 10:00:42 +08:00
Florian Weimer	7e21a65c58	misc: Enable internal use of memory protection keys This adds the necessary hidden prototypes.	2024-09-24 13:23:10 +02:00
Joe Ramsay	16a59571e4	AArch64: Simplify rounding-multiply pattern in several AdvSIMD routines This operation can be simplified to use simpler multiply-round-convert sequence, which uses fewer instructions and constants. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:08 +01:00
Joe Ramsay	7900ac490d	AArch64: Improve codegen in users of ADVSIMD expm1f helper Rearrange operations so MOV is not necessary in reduction or around the special-case handler. Reduce memory access by using more indexed MLAs in polynomial. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Joe Ramsay	5bc100bd4b	AArch64: Improve codegen in users of AdvSIMD log1pf helper log1pf is quite register-intensive - use fewer registers for the polynomial, and make various changes to shorten dependency chains in parent routines. There is now no spilling with GCC 14. Accuracy moves around a little - comments adjusted accordingly but does not require regen-ulps. Use the helper in log1pf as well, instead of having separate implementations. The more accurate polynomial means special-casing can be simplified, and the shorter dependency chain avoids the usual dance around v0, which is otherwise difficult. There is a small duplication of vectors containing 1.0f (or 0x3f800000) - GCC is not currently able to efficiently handle values which fit in FMOV but not MOVI, and are reinterpreted to integer. There may be potential for more optimisation if this is fixed. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Joe Ramsay	a15b1394b5	AArch64: Improve codegen in SVE F32 logs Reduce MOVPRFXs by using unpredicated (non-destructive) instructions where possible. Similar to the recent change to AdvSIMD F32 logs, adjust special-case arguments and bounds to allow for more optimal register usage. For all 3 routines one MOVPRFX remains in the reduction, which cannot be avoided as immediate AND and ASR are both destructive. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Joe Ramsay	7b8c134b54	AArch64: Improve codegen in SVE expf & related routines Reduce MOV and MOVPRFX by improving special-case handling. Use inline helper to duplicate the entire computation between the special- and non-special case branches, removing the contention for z0 between x and the return value. Also rearrange some MLAs and MLSs - by making the multiplicand the destination we can avoid a MOVPRFX in several cases. Also change which constants go in the vector used for lanewise ops - the last lane is no longer wasted. Spotted that shift was incorrect in exp2f and exp10f, w.r.t. the comment that explains it. Fixed - worst-case ULP for exp2f moves around but it doesn't change significantly for either routine. Worst-case error for coshf increases due to passing x to exp rather than abs(x) - updated the comment, but does not require regen-ulps. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-23 15:44:07 +01:00
Florian Weimer	6f3f6c506c	Linux: readdir64_r should not skip d_ino == 0 entries (bug 32126) This is the same bug as bug 12165, but for readdir_r. The regression test covers both bug 12165 and bug 32126. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-21 19:32:34 +02:00
Florian Weimer	e92718552e	Linux: Use readdir64_r for compat __old_readdir64_r (bug 32128) It is not necessary to do the conversion at the getdents64 layer for readdir64_r. Doing it piecewise for readdir64 is slightly simpler and allows deleting __old_getdents64. This fixes bug 32128 because readdir64_r handles the length check correctly. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-09-21 19:32:34 +02:00
Joe Ramsay	751a5502be	AArch64: Add vector logp1 alias for log1p This enables vectorisation of C23 logp1, which is an alias for log1p. There are no new tests or ulp entries because the new symbols are simply aliases. Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>	2024-09-19 17:53:34 +01:00
Sergey Bugaev	4524670545	hurd: Avoid file_check_access () RPC for access (F_OK) A common use case of access () / faccessat () is checking for file existence, not any specific access permissions. In that case, we can avoid doing the file_check_access () RPC; whether the given path had been successfully resolved to a file is all we need to know to answer. This is prompted by GLib switching to use faccessat (F_OK) to implement g_file_query_exists () for local files. https://gitlab.gnome.org/GNOME/glib/-/merge_requests/4272 Signed-off-by: Sergey Bugaev <bugaevc@gmail.com> Message-ID: <20240919101439.179663-1-bugaevc@gmail.com>	2024-09-19 14:18:39 +02:00
Florian Weimer	c444cc1d83	Linux: Add missing scheduler constants to <sched.h> And add a test, misc/tst-sched-consts, that checks consistency with <sched.h>. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-09-11 10:05:08 +02:00
Florian Weimer	21571ca0d7	Linux: Add the sched_setattr and sched_getattr functions And struct sched_attr. In sysdeps/unix/sysv/linux/bits/sched.h, the hack that defines sched_param around the inclusion of <linux/sched/types.h> is quite ugly, but the definition of struct sched_param has already been dropped by the kernel, so there is nothing else we can do and maintain compatibility of <sched.h> with a wide range of kernel header versions. (An alternative would involve introducing a separate header for this functionality, but this seems unnecessary.) The existing sched_* functions that change scheduler parameters are already incompatible with PTHREAD_PRIO_PROTECT mutexes, so there is no harm in adding more functionality in this area. The documentation mostly defers to the Linux manual pages. Reviewed-by: Carlos O'Donell <carlos@redhat.com>	2024-09-11 10:05:08 +02:00
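
The message for commit a4dcf30215 (glibc.cpu.aarch64_gcs_policy) gives the policy rule as three compact expressions. The sketch below spells that rule out as a plain C function so the three cases are easier to compare. It is only an illustration and is not taken from glibc: the function name `gcs_resolve_state`, its parameters, and the `GCS_POLICY_ERROR` sentinel standing in for "err" are all hypothetical. Treating an unknown policy value as an error is likewise an assumption, since the commit message does not say what happens then.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel standing in for the "err" outcome in the commit
   message; glibc's real error handling is not shown in the log.  */
#define GCS_POLICY_ERROR UINT64_MAX

/* Illustrative restatement of the rule from commit a4dcf30215:
     0: state = tunable
     1: state = marking ? tunable : (tunable && dlopen ? err : 0)
     2: state = marking ? tunable : (tunable ? err : 0)
   All names here are assumptions, not glibc identifiers.  */
static uint64_t
gcs_resolve_state (int policy, uint64_t tunable, bool marking, bool in_dlopen)
{
  switch (policy)
    {
    case 0:
      /* The tunable decides alone; the ELF marking is ignored.  */
      return tunable;
    case 1:
      if (marking)
        return tunable;
      /* Unmarked object: an error only when GCS was requested and the
         object arrives via dlopen, otherwise GCS is turned off.  */
      return (tunable != 0 && in_dlopen) ? GCS_POLICY_ERROR : 0;
    case 2:
      if (marking)
        return tunable;
      /* Unmarked object: an error whenever GCS was requested.  */
      return tunable != 0 ? GCS_POLICY_ERROR : 0;
    default:
      /* Assumption: unknown policy values are rejected.  */
      return GCS_POLICY_ERROR;
    }
}
```

For example, `gcs_resolve_state (1, 1, false, false)` returns 0 (GCS off for an unmarked object at startup), while the same call with policy 2 returns `GCS_POLICY_ERROR`.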

16402 Commits