This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Linux 6.12 has no new syscalls. Update the version number in
syscall-names.list to reflect that it is still current for 6.12.
Tested with build-many-glibcs.py.
Clang does not define _Bool for -std=c++98:
/usr/include/bits/platform/features.h:31:19: error: unknown type name '_Bool'
31 | static __inline__ _Bool
| ^
Change _Bool to bool to silence clang++ error.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Florian Weimer <fweimer@redhat.com>
ieee_long_double_shape_type has
typedef union
{
long double value;
struct
{
...
int sign_exponent:16;
...
} parts;
} ieee_long_double_shape_type;
Clang issues an error:
../sysdeps/ieee754/ldbl-96/test-totalorderl-ldbl-96.c:49:2: error: implicit truncation from 'int' to bit-field changes value from 65535 to -1 [-Werror,-Wbitfield-constant-conversion]
49 | SET_LDOUBLE_WORDS (ldnx, 0xffff,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50 | tests[i] >> 32, tests[i] & 0xffffffffULL);
|
Use -1, instead of 0xffff, to silence Clang.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add _Atomic to futex_wait argument and ctid in tst-clone3[-internal].c to
silence Clang error:
../sysdeps/unix/sysv/linux/tst-clone3-internal.c:93:3: error: address argument to atomic operation must be a pointer to _Atomic type ('pid_t *' (aka 'int *') invalid)
93 | wait_tid (&ctid, CTID_INIT_VAL);
| ^ ~~~~~
../sysdeps/unix/sysv/linux/tst-clone3-internal.c:51:21: note: expanded from macro 'wait_tid'
51 | while ((__tid = atomic_load_explicit (ctid_ptr, \
| ^ ~~~~~~~~
/usr/bin/../lib/clang/19/include/stdatomic.h:145:30: note: expanded from macro 'atomic_load_explicit'
145 | #define atomic_load_explicit __c11_atomic_load
| ^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Some hypervisors report 1 TiB L3 cache size. This results
in some variables incorrectly getting zeroed, causing crashes
in memcpy/memmove because invariants are violated.
Use "#include <...>" to silence Clang #include_next error:
In file included from ../sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c:19:
../sysdeps/x86_64/fpu/test-double-vlen4.h:19:2: error: #include_next in file found relative to primary source file or found by absolute path; will search from start of include path [-Werror,-Winclude-next-absolute-path]
19 | #include_next <test-double-vlen4.h>
| ^
1 error generated.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Pass -mshstk to compiler to silence Clang:
In file included from ../sysdeps/x86_64/tst-cet-legacy-10a.c:2:
../sysdeps/x86_64/tst-cet-legacy-10.c:29:7: error: always_inline function '_get_ssp' requires target feature 'shstk', but would be inlined into function 'do_test' that is compiled without support for 'shstk'
29 | if (_get_ssp () != 0)
| ^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
Also use intrinsics instead of native operations.
expf: 3% improvement in throughput microbenchmark on Neoverse V1, exp2f: 5%,
exp10f: 13%, coshf: 14%.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
8% improvement in throughput microbenchmark on Neoverse V1.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Load the polynomial evaluation coefficients into 2 vectors and use lanewise MLAs.
8% improvement in throughput microbenchmark on Neoverse V1 for log2 and log,
and 2% for log10.
Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Use empty initializer to silence GCC 4.9 or older:
getaddrinfo.c: In function ‘gaih_inet’:
getaddrinfo.c:1135:24: error: missing braces around initializer [-Werror=missing-braces]
/ sizeof (struct gaih_typeproto)] = {0};
^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
As of Linux 6.13, there is no code in the vDSO that declines this
initialization request with the special ~0UL state size. If the vDSO
has the function, the call succeeds and returns 0. It's expected
that the code would follow the “a negative value indicating an error”
convention, as indicated in the __cvdso_getrandom_data function
comment, so that INTERNAL_SYSCALL_ERROR_P on glibc's side would return
true. This commit changes the commit to check for zero to indicate
success instead, which covers potential future non-zero success
return values and error returns.
Fixes commit 4f5704ea347e52ac3f272d1341da10aed6e9973e ("powerpc: Use
correct procedure call standard for getrandom vDSO call (bug 32440)").
Since GCC 4.9 doesn't have __builtin_add_overflow:
In file included from tst-stringtable.c:180:0:
stringtable.c: In function ‘stringtable_finalize’:
stringtable.c:185:7: error: implicit declaration of function ‘__builtin_add_overflow’ [-Werror=implicit-function-declaration]
else if (__builtin_add_overflow (previous->offset,
^
return EXIT_UNSUPPORTED for GCC 4.9 or older.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
Add braces to silence GCC 4.9 or older:
getaddrinfo.c: In function ‘gaih_inet’:
getaddrinfo.c:1135:24: error: missing braces around initializer [-Werror=missing-braces]
/ sizeof (struct gaih_typeproto)] = {0};
^
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
If an executable is static PIE and has a non-zero load address
(compare to elf/tst-pie-address-static), it segfaults as
elf_machine_load_address() returns 0x0 and elf_machine_dynamic()
returns the run-time instead of link-time address of _DYNAMIC.
Now rely on __ehdr_start and _DYNAMIC as also done on other
architectures.
Checked back to old arch-levels that this approach works fine:
- 31bit: -march=g5
- 64bit: -march=z900
Note, that there is no static-PIE support on 31bit, but this
approach cleans it also up.
Furthermore this cleanup in glibc does not change anything
regarding the first GOT-element as the s390 ABI
(https://github.com/IBM/s390x-abi) explicitely defines:
The doubleword at _GLOBAL_OFFSET_TABLE_[0] is set by the linkage
editor to hold the address of the dynamic structure, referenced
with the symbol _DYNAMIC. This allows a program, such as the dynamic
linker, to find its own dynamic structure without having yet processed
its relocation entries. This is especially important for the dynamic
linker, because it must initialize itself without relying on other
programs to relocate its memory image.
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atan2pi functions (atan2(y,x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the atanpi functions (atan(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
A plain indirect function call does not work on POWER because
success and failure are signaled through a flag register, and
not via the usual Linux negative return value convention.
This has potential security impact, in two ways: the return value
could be out of bounds (EAGAIN is 11 on powerpc6le), and no
random bytes have been written despite the non-error return value.
Fixes commit 461cab1de747f3842f27a5d24977d78d561d45f9 ("linux: Add
support for getrandom vDSO").
Reported-by: Ján Stanček <jstancek@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Support testing glibc build with a different C compiler or a different
C++ compiler with
$ ../glibc-VERSION/configure TEST_CC="gcc-6.4.1" TEST_CXX="g++-6.4.1"
1. Add LIBC_TRY_CC_AND_TEST_CC_OPTION, LIBC_TRY_CC_AND_TEST_CC_COMMAND
and LIBC_TRY_CC_AND_TEST_LINK to test both CC and TEST_CC.
2. Add check and xcheck targets to Makefile.in and override build compiler
options with ones from TEST_CC and TEST_CXX.
Tested on Fedora 41/x86-64:
1. Building with GCC 14.2.1 and testing with GCC 6.4.1 and GCC 11.2.1.
2. Building with GCC 15 and testing with GCC 6.4.1.
Support for GCC versions older than GCC 6.2 may need to change the test
sources. Other targets may need to update configure.ac under sysdeps and
modify Makefile.in to override target build compiler options.
Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the asinpi functions (asin(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
C23 adds various <math.h> function families originally defined in TS
18661-4. Add the acospi functions (acos(x)/pi).
Tested for x86_64 and x86, and with build-many-glibcs.py.
Add ROP protection for the getcontext, setcontext, makecontext, swapcontext
and __sigsetjmp_symbol functions.
Reviewed-by: Peter Bergner <bergner@linux.ibm.com>
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Compile tested with build-many-glibcs.py but I don't have access to any
hardware to run the tests.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Arjun Shankar <arjun@redhat.com>
This will be required by the rseq extensible ABI implementation on all
Linux architectures exposing the '__rseq_size' and '__rseq_offset'
symbols to set the initial value of the 'cpu_id' field which can be used
by applications to test if rseq is available and registered. As long as
the symbols are exposed it is valid for an application to perform this
test even if rseq is not yet implemented in libc for this architecture.
Both code paths tested on a Visionfive 2 with Debian sid.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Enable RSEQ for RISC-V, support was added in Linux 5.18.
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Add inline helper for expm1 and rearrange operations so MOV
is not necessary in reduction or around the special-case handler.
Reduce memory access by using more indexed MLAs in polynomial.
Speedup on Neoverse V1 for expm1 (19%), sinh (8.5%), and tanh (7.5%).
Add inline helper for log1p and rearrange operations so MOV
is not necessary in reduction or around the special-case handler.
Reduce memory access by using more indexed MLAs in polynomial.
Speedup on Neoverse V1 for log1p (3.5%), acosh (7.5%) and atanh (10%).
Remove spurious ADRP and a few MOVs.
Reduce memory access by using more indexed MLAs in polynomial.
Align notation so that algorithms are easier to compare.
Speedup on Neoverse V1 for log10 (8%), log (8.5%), and log2 (10%).
Update error threshold in AdvSIMD log (now matches SVE log).
Remove spurious ADRP. Improve memory access by shuffling constants and
using more indexed MLAs.
A few more optimisation with no impact on accuracy
- force fmas contraction
- switch from shift-aided rint to rint instruction
Between 1 and 5% throughput improvement on Neoverse
V1 depending on benchmark.