glibc

mirror of git://sourceware.org/git/glibc.git synced 2025-03-13 13:37:38 +08:00

Author	SHA1	Message	Date
DJ Delorie	e79e5c4899	assert: ensure posix compliance, add tests for such Fix assert.c so that even the fallback case conforms to POSIX, although not exactly the same as the default case so a test can tell the difference. Add a test that verifies that abort is called, and that the message printed to stderr has all the info that POSIX requires. Verify this even when malloc isn't usable. Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>	2024-12-20 22:44:01 -05:00
Adhemerval Zanella	b3a7a15d99	cet: Drop '#pragma GCC target' in tst-cet-legacy-10a[-static].c After commit `215447f5cb` Author: H.J. Lu <hjl.tools@gmail.com> Date: Tue Dec 17 06:18:55 2024 +0800 cet: Pass -mshstk to compiler for tst-cet-legacy-10a[-static].c we can remove '#pragma GCC target' in tst-cet-legacy-10a[-static].c. Co-Authored-By: H.J. Lu <hjl.tools@gmail.com>	2024-12-21 06:16:58 +08:00
Aurelien Jarno	6fd215d6ae	posix: fix system when a child cannot be created [BZ #32450] POSIX states that "if a child process cannot be created, or if the termination status for the command language interpreter cannot be obtained, system() shall return -1 and set errno to indicate the error." In the glibc implementation it could happen when posix_spawn fails, which happens when the underlying fork, vfork, or clone call fails. They could fail with EAGAIN and ENOMEM. Resolves: BZ #32450 Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 22:57:06 +01:00
H.J. Lu	034cd67528	Don't use glibc <tgmath.h> when testing with Clang Clang has its own <tgmath.h> and doesn't use <tgmath.h> from glibc. Pass "-I." to compiler only if $($(<F)-no-include-dot) is undefined. Define it to yes for tgmath tests when testing with Clang. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>	2024-12-21 05:24:07 +08:00
H.J. Lu	6025b399c7	stdio-common: Exclude bug28 when clang is used Clang 19 takes a very long time to compile bug28.c (it ran for more than 27 minutes on an Intel Core i7-1195G7 before the process was killed): https://github.com/llvm/llvm-project/issues/120462 Exclude it when Clang is used for testing. Signed-off-by: H.J. Lu <hjl.tools@gmail.com> Reviewed-by: Sam James <sam@gentoo.org>	2024-12-21 05:14:01 +08:00
H.J. Lu	40bf25b754	Fix elf: Introduce is_rtld_link_map [BZ #32488] Also use is_rtld_link_map in dl-cet.c. This fixes BZ #32488. Signed-off-by: H.J. Lu <hjl.tools@gmail.com>	2024-12-21 04:36:18 +08:00
Adhemerval Zanella	c3ee510267	math: xfail some tanpi tests for ibm128-libgcc On powerpc math/test-ibm128-tanpi shows multiple failures: testing long double (without inline functions) Failure: tanpi_downward (0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_downward (0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_downward (0xfffffffffffffffdp-1) Result: is: 4.68843873182857939141363635204365e+28 0x1.2efbb6629d1d59b032520400df8p+95 should be: inf inf Failure: tanpi_downward (0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_downward (0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_downward (0x3fffffffffffffffffffffffffdp-1) Result: is: 1.41444453325831960404472183124793e+16 0x1.9202627cbf98e052d5fdbeee1f8p+53 should be: inf inf Failure: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set Failure: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set Failure: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged) Failure: Test: tanpi_downward (-0xf.ffffffffffffbffffffffffffcp+1020) Result: is: qNaN should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 Failure: Test: tanpi_downward (0x3.fffffffffffffffcp+108) Result: is: 2.91356019227449116879287504834896e-15 0x1.a3e365fee24d4632f95a2235698p-49 should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 2.91356019227449116879287504834896e-15 0x1.a3e365fee24d4632f95a2235698p-49 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_downward (0x3.ffffffffffffffffffffffffffp+108) Result: is: 7.94911926685664643005642781870827e-16 0x1.ca3c4b83eb5688e1474146dc338p-51 should be: 
0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 7.94911926685664643005642781870827e-16 0x1.ca3c4b83eb5688e1474146dc338p-51 ulp : 160891965142034222272327839154722485473479235229008379884749401713481320342777314570400076204240982703218835644458374555276642 max.ulp : 8.0000 Failure: tanpi_towardzero (0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (0xfffffffffffffffdp-1) Result: is: 2.14718475310122677917055904836884e+28 0x1.1584624c14882fff76592b4ec10p+94 should be: inf inf Failure: tanpi_towardzero (-0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (-0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (-0xfffffffffffffffdp-1) Result: is: -2.14718475310122677917055904836884e+28 -0x1.1584624c14882fff76592b4ec10p+94 should be: -inf -inf Failure: tanpi_towardzero (0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (0x3fffffffffffffffffffffffffdp-1) Result: is: 6.60739946234609289593176521179840e+15 0x1.7796511d79d6ce55bc8bf083fe0p+52 should be: inf inf Failure: tanpi_towardzero (-0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_towardzero (-0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_towardzero (-0x3fffffffffffffffffffffffffdp-1) Result: is: -6.60739946234609289593176521179840e+15 -0x1.7796511d79d6ce55bc8bf083fe0p+52 should be: -inf -inf Failure: Test: tanpi_towardzero (-0x3.fffffffffffffffcp+108) Result: is: -1.17953443892757434921819283936141e-14 -0x1.a8f8d97fb893518cbe5688935c0p-47 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 
1.17953443892757434921819283936141e-14 0x1.a8f8d97fb893518cbe5688935c0p-47 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_towardzero (-0x3.ffffffffffffffffffffffffffp+108) Result: is: -1.85584803206881692897837494734542e-14 -0x1.4e51e25c1f5ab4470a3a0a42c24p-46 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 1.85584803206881692897837494734542e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_towardzero (0x3.fffffffffffffffcp+108) Result: is: 1.17953443892757434921819283936141e-14 0x1.a8f8d97fb893518cbe5688935c0p-47 should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 1.17953443892757434921819283936141e-14 0x1.a8f8d97fb893518cbe5688935c0p-47 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: Test: tanpi_towardzero (0x3.ffffffffffffffffffffffffffp+108) Result: is: 1.85584803206881692897837494734542e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0 difference: 1.85584803206881692897837494734542e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 ulp : 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321 max.ulp : 8.0000 Failure: tanpi_upward (-0xfffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_upward (-0xfffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_upward (-0xfffffffffffffffdp-1) Result: is: -2.14718475310122677917055904836884e+28 -0x1.1584624c14882fff76592b4ec10p+94 should be: -inf -inf Failure: 
tanpi_upward (-0x3fffffffffffffffffffffffffdp-1): Exception "Divide by zero" not set Failure: tanpi_upward (-0x3fffffffffffffffffffffffffdp-1): errno set to 0, expected 34 (ERANGE) Failure: Test: tanpi_upward (-0x3fffffffffffffffffffffffffdp-1) Result: is: -6.60739946234609289593176521179829e+15 -0x1.7796511d79d6ce55bc8bf083fdbp+52 should be: -inf -inf Failure: Test: tanpi_upward (-0x3.fffffffffffffffcp+108) Result: is: -1.17953443892757434921819283936138e-14 -0x1.a8f8d97fb893518cbe5688935b0p-47 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 1.17953443892757434921819283936139e-14 0x1.a8f8d97fb893518cbe5688935b0p-47 ulp : inf max.ulp : 8.0000 Failure: Test: tanpi_upward (-0x3.ffffffffffffffffffffffffffp+108) Result: is: -1.85584803206881692897837494734542e-14 -0x1.4e51e25c1f5ab4470a3a0a42c24p-46 should be: -0.00000000000000000000000000000000e+00 -0x0.000000000000000000000000000p+0 difference: 1.85584803206881692897837494734543e-14 0x1.4e51e25c1f5ab4470a3a0a42c24p-46 ulp : inf max.ulp : 8.0000 Failure: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Invalid operation" set Failure: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): Exception "Overflow" set Failure: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020): errno set to 33, expected 0 (unchanged) Failure: Test: tanpi_upward (0xf.ffffffffffffbffffffffffffcp+1020) Result: is: qNaN should be: 0.00000000000000000000000000000000e+00 0x0.000000000000000000000000000p+0	2024-12-20 15:09:40 -03:00
Florian Weimer	495b96e064	elf: Reorder audit events in dlclose to match _dl_fini (bug 32066) This was discovered after extending elf/tst-audit23 to cover dlclose of the dlmopen namespace. Auditors already experience the new order during process shutdown (_dl_fini), so no LAV_CURRENT bump or backwards compatibility code seems necessary. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 16:17:10 +01:00
Florian Weimer	c4b160744c	elf: Call la_objclose for proxy link maps in _dl_fini (bug 32065) Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 16:17:08 +01:00
Florian Weimer	8f36b14696	elf: Signal la_objopen for the proxy link map in dlmopen (bug 31985) Previously, the ld.so link map was silently added to the namespace. This change produces an auditing event for it. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 16:16:21 +01:00
Florian Weimer	a20bc2f623	elf: Add the endswith function to <endswith.h> And include <stdbool.h> for a definition of bool. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 16:15:53 +01:00
Florian Weimer	4a50fdf8b2	elf: Update DSO list, write audit log to elf/tst-audit23.out After commit `1d5024f4f0` ("support: Build with exceptions and asynchronous unwind tables [BZ #30587]"), libgcc_s is expected to show up in the DSO list on 32-bit Arm. Do not update max_objs because vdso is not tracked (and which is the reason why the test currently passes even with libgcc_s present). Also write the log output from the auditor to standard output, for easier test debugging. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 16:15:51 +01:00
Florian Weimer	ef5823d955	elf: Move _dl_rtld_map, _dl_rtld_audit_state out of GL This avoids immediate GLIBC_PRIVATE ABI issues if the size of struct link_map or struct auditstate changes. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 15:52:57 +01:00
Florian Weimer	2b1dba3eb3	elf: Introduce is_rtld_link_map Unconditionally define it to false for static builds. This avoids the awkward use of weak_extern for _dl_rtld_map in checks that cannot be possibly true on static builds. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-20 15:52:57 +01:00
Joseph Myers	322e9d4e44	Add F_CREATED_QUERY from Linux 6.12 to bits/fcntl-linux.h Linux 6.12 adds a new constant F_CREATED_QUERY. Add it to glibc's bits/fcntl-linux.h. Tested for x86_64.	2024-12-20 11:47:33 +00:00
Joseph Myers	37d9618492	Add HWCAP_LOONGARCH_LSPW from Linux 6.12 to bits/hwcap.h Add the new Linux 6.12 HWCAP_LOONGARCH_LSPW to the corresponding bits/hwcap.h. Tested with build-many-glibcs.py for loongarch64-linux-gnu-lp64d.	2024-12-20 11:47:03 +00:00
Joseph Myers	fbdd8b3fa8	Add MSG_SOCK_DEVMEM from Linux 6.12 to bits/socket.h Linux 6.12 adds a constant MSG_SOCK_DEVMEM (recall that various constants such as this one are defined in the non-uapi linux/socket.h but still form part of the kernel/userspace interface, so that non-uapi header is one that needs checking each release for new such constants). Add it to glibc's bits/socket.h. Tested for x86_64.	2024-12-20 11:46:06 +00:00
Florian Weimer	9a6533429e	i386: Regenerate ulps As seen on an Intel i9-9900K CPU, with glibc built with GCC 11.5, configured with and without --disable-multi-arch.	2024-12-20 12:40:17 +01:00
Florian Weimer	6fba7d6578	x86_64: Regenerate ulps As seen with an AMD 7950X CPU, on a glibc built with GCC 11.5.	2024-12-20 07:22:02 +01:00
Florian Weimer	6a99b4172a	aarch64: Regenerate ulps Results from running on Neoverse-V2, built with GCC 11.5.	2024-12-20 07:12:30 +01:00
Florian Weimer	e79b9e962d	elf: Remove code dependent on __rtld_lock_default_lock_recursive macro Neither NPTL nor Hurd define this macro anymore. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-19 21:29:58 +01:00
Florian Weimer	70d0836305	Linux: Accept null arguments for utimensat pathname This matches kernel behavior. With this change, it is possible to use utimensat as a replacement for the futimens interface, similar to what glibc does internally. Reviewed-by: Paul Eggert <eggert@cs.ucla.edu>	2024-12-19 21:21:30 +01:00
Florian Weimer	30d3fd7f4f	x86_64: Remove unused padding from tcbhead_t This padding is difficult to use for preserving the internal GLIBC_PRIVATE ABI. The comment is misleading. Current Address Sanitizer uses heuristics to determine struct pthread size. It does not depend on its precise layout. It merely scans for pointers allocated using malloc. Due to the removal of the padding, the assert for its start is no longer required. Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>	2024-12-19 21:21:30 +01:00
Joseph Myers	d7f587398c	Add further DSO dependency sorting tests The current DSO dependency sorting tests are for a limited number of specific cases, including some from particular bug reports. Add tests that systematically cover all possible DAGs for an executable and the shared libraries it depends on, directly or indirectly, up to four objects (an executable and three shared libraries). (For this kind of DAG - ones with a single source vertex from which all others are reachable, and an ordering on the edges from each vertex - there are 57 DAGs on four vertices, 3399 on five vertices and 1026944 on six vertices; see https://arxiv.org/pdf/2303.14710 for more details on this enumeration. I've tested that the 3399 cases with five vertices do all pass if enabled.) These tests are replicating the sorting logic from the dynamic linker (thereby, for example, asserting that it doesn't accidentally change); I'm not claiming that the logic in the dynamic linker is in some abstract sense optimal. 
Note that these tests do illustrate how in some cases the two sorting algorithms produce different results for a DAG (I think all the existing tests for such differences are ones involving cycles, and the motivation for the new algorithm was also to improve the handling of cycles): tst-dso-ordering-all4-44: a->[bc];{}->[cba] output(glibc.rtld.dynamic_sort=1): c>b>a>{}<a<b<c output(glibc.rtld.dynamic_sort=2): b>c>a>{}<a<c<b They also illustrate that sometimes the sorting algorithms do not follow the order in which dependencies are listed in DT_NEEDED even though there is a valid topological sort that does follow that, which might be counterintuitive considering that the DT_NEEDED ordering is followed in the simplest cases: tst-dso-ordering-all4-56: {}->[abc] output: c>b>a>{}<a<b<c shows such a simple case following DT_NEEDED order for destructor execution (the reverse of it for constructor execution), but tst-dso-ordering-all4-41: a->[cb];{}->[cba] output: c>b>a>{}<a<b<c shows that c and b are in the opposite order to what might be expected from the simplest case, though there is no dependency requiring such an opposite order to be used. (I'm not asserting that either of those things is a problem, simply observing them as less obvious properties of the sorting algorithms shown up by these tests.) Tested for x86_64.	2024-12-19 18:56:04 +00:00
Joseph Myers	539bf8dd41	Add NT_X86_XSAVE_LAYOUT and NT_ARM_POE from Linux 6.12 to elf.h Linux 6.12 adds new ELF note types NT_X86_XSAVE_LAYOUT and NT_ARM_POE. Add these to glibc's elf.h. Tested for x86_64.	2024-12-19 17:09:19 +00:00
Joseph Myers	29ae632e76	Add SCHED_EXT from Linux 6.12 to bits/sched.h Linux 6.12 adds the SCHED_EXT constant. Add it to glibc's bits/sched.h and update the kernel version in tst-sched-consts.py. Tested for x86_64.	2024-12-19 17:08:38 +00:00
John David Anglin	57256971b0	hppa: Fix strace detach-vfork test This change implements vfork.S for direct support of the vfork syscall. clone.S is revised to correct child support for the vfork case. The main bug was creating a frame prior to the clone syscall. This was done to allow the rp and r4 registers to be saved and restored from the stack frame. r4 was used to save and restore the PIC register, r19, across the system call and the call to set errno. But in the vfork case, it is undefined behavior for the child to return from the function in which vfork was called. It is surprising that this usually worked. Syscalls on hppa save and restore rp and r19, so we don't need to create a frame prior to the clone syscall. We only need a frame when __syscall_error is called. We also don't need to save and restore r19 around the call to $$dyncall as r19 is not used in the code after $$dyncall. This considerably simplifies clone.S. Signed-off-by: John David Anglin <dave.anglin@bell.net>	2024-12-19 11:30:09 -05:00
Joseph Myers	5fcee06dc7	Update kernel version to 6.12 in header constant tests There are no new constants covered by tst-mman-consts.py, tst-mount-consts.py or tst-pidfd-consts.py in Linux 6.12 that need any header changes, so update the kernel version in those tests. (tst-sched-consts.py will need updating separately along with adding SCHED_EXT.) Tested with build-many-glibcs.py.	2024-12-19 15:38:59 +00:00
Paul Zimmermann	d421d36582	added url of CORE-MATH project Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	0e0be3ed80	math: Use tanhf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic tanhf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 51.5273 41.0951 20.25% x86_64v2 47.7021 39.1526 17.92% x86_64v3 45.0373 34.2737 23.90% i686 133.9970 83.8596 37.42% aarch64 (Neoverse) 21.5439 14.7961 31.32% power10 13.3301 8.4406 36.68% reciprocal-throughput master patched improvement x86_64 24.9493 12.8547 48.48% x86_64v2 20.7051 12.7761 38.29% x86_64v3 19.2492 11.0851 42.41% i686 78.6498 29.8211 62.08% aarch64 (Neoverse) 11.6026 7.11487 38.68% power10 6.3328 2.8746 54.61% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	1751c0519a	math: Use sinhf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic sinhf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 52.6819 49.1489 6.71% x86_64v2 49.1162 42.9447 12.57% x86_64v3 46.9732 39.9157 15.02% i686 141.1470 129.6410 8.15% aarch64 (Neoverse) 20.8539 17.1288 17.86% power10 14.5258 9.1906 36.73% reciprocal-throughput master patched improvement x86_64 27.5553 23.9395 13.12% x86_64v2 21.6423 20.3219 6.10% x86_64v3 21.4842 16.0224 25.42% i686 87.9709 86.1626 2.06% aarch64 (Neoverse) 15.1919 12.2744 19.20% power10 7.2188 5.2611 27.12% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	9583836785	math: Use coshf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode), although it shows worse performance than the current one. The current implementation's performance comes mainly from the internal usage of the optimized expf implementation, and it shows a maximum of 2 ULPs for FE_TONEAREST and 3 for other rounding modes. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 40.6995 49.0737 -20.58% x86_64v2 40.5841 44.3604 -9.30% x86_64v3 39.3879 39.7502 -0.92% i686 112.3380 129.8570 -15.59% aarch64 (Neoverse) 18.6914 17.0946 8.54% power10 11.1343 9.3245 16.25% reciprocal-throughput master patched improvement x86_64 18.6471 24.1077 -29.28% x86_64v2 17.7501 20.2946 -14.34% x86_64v3 17.8262 17.1877 3.58% i686 64.1454 86.5645 -34.95% aarch64 (Neoverse) 9.77226 12.2314 -25.16% power10 4.0200 5.3316 -32.63% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	7cfd8b5698	math: Use atanhf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic atanhf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 59.4930 45.8568 22.92% x86_64v2 59.5705 45.5804 23.48% x86_64v3 53.1838 37.7155 29.08% i686 169.354 133.5940 21.12% aarch64 (Neoverse) 26.0781 16.9829 34.88% power10 15.6591 10.7623 31.27% reciprocal-throughput master patched improvement x86_64 23.5903 18.5766 21.25% x86_64v2 22.6489 18.2683 19.34% x86_64v3 19.0401 13.9474 26.75% i686 97.6034 107.3260 -9.96% aarch64 (Neoverse) 15.3664 9.57846 37.67% power10 6.8877 4.6242 32.86% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	6f9bacf36b	math: Use atan2f from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic atan2f. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 68.1175 69.2014 -1.59% x86_64v2 66.9884 66.0081 1.46% x86_64v3 57.7034 61.6407 -6.82% i686 189.8690 152.7560 19.55% aarch64 (Neoverse) 32.6151 24.5382 24.76% power10 21.7282 17.1896 20.89% reciprocal-throughput master patched improvement x86_64 34.5202 31.6155 8.41% x86_64v2 32.6379 30.3372 7.05% x86_64v3 34.3677 23.6455 31.20% i686 157.7290 75.8308 51.92% aarch64 (Neoverse) 27.7788 16.2671 41.44% power10 15.5715 8.1588 47.60% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	a357d6273f	math: Use atanf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic atanf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 56.8265 53.6842 5.53% x86_64v2 54.8177 53.6842 2.07% x86_64v3 46.2915 48.7034 -5.21% i686 158.3760 108.9560 31.20% aarch64 (Neoverse) 21.687 20.5893 5.06% power10 13.1903 13.5012 -2.36% reciprocal-throughput master patched improvement x86_64 16.6787 16.7601 -0.49% x86_64v2 16.6983 16.7601 -0.37% x86_64v3 16.2268 12.1391 25.19% i686 138.6840 36.0640 74.00% aarch64 (Neoverse) 11.8012 10.3565 12.24% power10 5.3212 4.2894 19.39% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	ed608a40e2	math: Use asinhf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic asinhf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 64.5128 56.9717 11.69% x86_64v2 63.3065 57.2666 9.54% x86_64v3 62.8719 51.4170 18.22% i686 189.1630 137.635 27.24% aarch64 (Neoverse) 25.3551 20.5757 18.85% power10 17.9712 13.3302 25.82% reciprocal-throughput master patched improvement x86_64 20.0844 15.4731 22.96% x86_64v2 19.2919 15.4000 20.17% x86_64v3 18.7226 11.9009 36.44% i686 103.7670 80.2681 22.65% aarch64 (Neoverse) 12.5005 8.68969 30.49% power10 7.2220 5.03617 30.27% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	5fb4b566ef	math: Use asinf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic asinf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 42.8237 35.2460 17.70% x86_64v2 43.3711 35.9406 17.13% x86_64v3 35.0335 30.5744 12.73% i686 213.8780 104.4710 51.15% aarch64 (Neoverse) 17.2937 13.6025 21.34% power10 12.0227 7.4241 38.25% reciprocal-throughput master patched improvement x86_64 13.6770 15.5231 -13.50% x86_64v2 13.8722 16.0446 -15.66% x86_64v3 13.6211 13.2753 2.54% i686 186.7670 45.4388 75.67% aarch64 (Neoverse) 9.96089 9.39285 5.70% power10 4.9862 3.7819 24.15% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	673e6fe110	math: Use acoshf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic acoshf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 61.2471 58.7742 4.04% x86_64-v2 62.6519 59.0523 5.75% x86_64-v3 58.7408 50.1393 14.64% aarch64 24.8580 21.3317 14.19% power10 17.0469 13.1345 22.95% reciprocal-throughput master patched improvement x86_64 16.1618 15.1864 6.04% x86_64-v2 15.7729 14.7563 6.45% x86_64-v3 14.1669 11.9568 15.60% aarch64 10.911 9.5486 12.49% power10 6.38196 5.06734 20.60% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	66fa7ad437	math: Use acosf from CORE-MATH The CORE-MATH implementation is correctly rounded (for any rounding mode) and shows slightly better performance than the generic acosf. The code was adapted to glibc style and to use the definition of math_config.h (to handle errno, overflow, and underflow). Benchtest on x86_64 (Ryzen 9 5900X, gcc 14.2.1), aarch64 (Neoverse-N1, gcc 13.3.1), and powerpc (POWER10, gcc 13.2.1): Latency master patched improvement x86_64 52.5098 36.6312 30.24% x86_64v2 53.0217 37.3091 29.63% x86_64v3 42.8501 32.3977 24.39% i686 207.3960 109.4000 47.25% aarch64 21.3694 13.7871 35.48% power10 14.5542 7.2891 49.92% reciprocal-throughput master patched improvement x86_64 14.1487 15.9508 -12.74% x86_64v2 14.3293 16.1899 -12.98% x86_64v3 13.6563 12.6161 7.62% i686 158.4060 45.7354 71.13% aarch64 12.5515 9.19233 26.76% power10 5.7868 3.3487 42.13% Signed-off-by: Alexei Sibidanov <sibid@uvic.ca> Signed-off-by: Paul Zimmermann <Paul.Zimmermann@inria.fr> Signed-off-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	45126f866c	math: Fix the expected carg (inf) results The predefined pi constants are not the expected values for carg in non-default rounding modes (similar to atan). Use the autogenerated value instead.	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	abe1d65aa6	math: Fix the expected atan2f (inf) results The predefined pi constants are not the expected values for atan2 in non-default rounding modes. Use the autogenerated value instead. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	517c213377	math: Fix the expected atanf (inf) results The M_PI_2 (lit_pi_2_d) constant is not the expected value for atanf in non-default rounding modes. Use the autogenerated value instead.	2024-12-18 17:24:43 -03:00
Adhemerval Zanella	aa3e67ced6	math: Add inf support on gen-auto-libm-tests.c For some correctly rounded inputs where infinity might generate a number (like atanf), comparing to a pre-defined constant does not yield the expected result in all rounding modes. The most straightforward way to handle it would be to get the expected result from mpfr, where it handles all the rounding modes.	2024-12-18 17:24:42 -03:00
Adhemerval Zanella	a993eea641	math: Fix spurious-divbyzero flag name Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:42 -03:00
Adhemerval Zanella	042ed4b28a	benchtests: Add tanhf benchmark Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:42 -03:00
Adhemerval Zanella	b76b90a809	benchtests: Add sinhf benchmark Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:42 -03:00
Adhemerval Zanella	7b7a3fa121	benchtests: Add coshf benchmark Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:42 -03:00
Adhemerval Zanella	4f1e26ba47	benchtests: Add atanhf benchmark The input is based on acosf one (random inputs in [-1,1]). Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:42 -03:00
Adhemerval Zanella	fa857e6c7b	benchtests: Add atan2f benchmark Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:42 -03:00
Adhemerval Zanella	74a275d244	benchtests: Add atanf benchmark Random inputs in the range [-10,10]. Reviewed-by: DJ Delorie <dj@redhat.com>	2024-12-18 17:24:42 -03:00

41782 Commits