159 Commits

Author SHA1 Message Date
Joseph Myers
75ad83f564 Implement C23 pown
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the pown functions, which are like pow but with an
integer exponent.  That exponent has type long long int in C23; it was
intmax_t in TS 18661-4, and as with other interfaces changed after
their initial appearance in the TS, I don't think we need to support
the original version of the interface.  The test inputs are based on
the subset of test inputs for pow that use integer exponents that fit
in long long.

As the first such template implementation that saves and restores the
rounding mode internally (to avoid possible issues with directed
rounding and intermediate overflows or underflows in the wrong
rounding mode), support also needed to be added for using
SET_RESTORE_ROUND* in such template function implementations.  This
required math-type-macros-float128.h to include <fenv_private.h>, so
it can tell whether SET_RESTORE_ROUNDF128 is defined.  In turn, the
include order with <fenv_private.h> included before <math_private.h>
broke loongarch builds, showing up that
sysdeps/loongarch/math_private.h is really a fenv_private.h file
(maybe implemented internally before the consistent split of those
headers in 2018?) and needed to be renamed to fenv_private.h to avoid
errors with duplicate macro definitions if <math_private.h> is
included after <fenv_private.h>.

The underlying implementation uses __ieee754_pow functions (called
more than once in some cases, where the exponent does not fit in the
floating type).  I expect a custom implementation for a given format,
that only handles integer exponents but handles larger exponents
directly, could be faster and more accurate in some cases.

I encourage searching for worst cases for ulps error for these
implementations (necessarily non-exhaustively, given the size of the
input space).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2025-03-27 10:44:44 +00:00
Joseph Myers
409668f6e8 Implement C23 powr
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the powr functions, which are like pow, but with simpler
handling of special cases (based on exp(y*log(x)), so negative x and
0^0 are domain errors, powers of -0 are always +0 or +Inf never -0 or
-Inf, and 1^+-Inf and Inf^0 are also domain errors, while NaN^0 and
1^NaN are NaN).  The test inputs are taken from those for pow, with
appropriate adjustments (including removing all tests that would be
domain errors from those in auto-libm-test-in and adding some more
such tests in libm-test-powr.inc).

The underlying implementation uses __ieee754_pow functions after
dealing with all special cases that need to be handled differently.
It might be a little faster (avoiding a wrapper and redundant checks
for special cases) to have an underlying implementation built
separately for both pow and powr with compile-time conditionals for
special-case handling, but I expect the benefit of that would be
limited given that both functions will end up needing to use the same
logic for computing pow outside of special cases.

My understanding is that powr(negative, qNaN) should raise "invalid":
that the rule on "invalid" for an argument outside the domain of the
function takes precedence over a quiet NaN argument producing a quiet
NaN result with no exceptions raised (for rootn it's explicit that the
0th root of qNaN raises "invalid").  I've raised this on the WG14
reflector to confirm the intent.

Tested for x86_64 and x86, and with build-many-glibcs.py.
2025-03-14 15:58:11 +00:00
Aurelien Jarno
443cb0b5f2 math: Remove an extra semicolon in math function declarations
Commit 6bc301672bfbd ("math: Remove __XXX math functions from installed
math.h [BZ #32418]") left an extra semicolon after macro expansion. For
instance the ceil declaration after expansion is:

  extern double ceil (double __x) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));;

This chokes very naive parsers like gauche c-wrapper. Fix that by
removing that extra semicolon in the macro.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>
2025-03-08 13:31:03 +01:00
Joseph Myers
77261698b4 Implement C23 rsqrt
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the rsqrt functions (1/sqrt(x)).  The test inputs are
taken from those for sqrt.

Tested for x86_64 and x86, and with build-many-glibcs.py.
2025-03-07 19:15:26 +00:00
Joe Ramsay
080998f6e7 AArch64: Add vector tanpi routines
Vector variant of the new C23 tanpi. New tests pass on AArch64.
2025-01-03 21:39:56 +00:00
Joe Ramsay
40c3a06293 AArch64: Add vector cospi routines
Vector variant of the new C23 cospi. New tests pass on AArch64.
2025-01-03 21:39:56 +00:00
Joe Ramsay
6050b45716 AArch64: Add vector sinpi to libmvec
Vector variant of the new C23 sinpi. New tests pass on AArch64.
2025-01-03 21:39:56 +00:00
Paul Eggert
2642002380 Update copyright dates with scripts/update-copyrights 2025-01-01 11:22:09 -08:00
Joseph Myers
3374de9038 Implement C23 atan2pi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the atan2pi functions (atan2(y,x)/pi).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-12 20:57:44 +00:00
H.J. Lu
6bc301672b math: Remove __XXX math functions from installed math.h [BZ #32418]
Since libm doesn't export __XXX math functions, don't declare them in
the installed math.h by adding <bits/mathcalls-macros.h> to declare
__XXX math functions internally for glibc build.  This fixes BZ #32418.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
Reviewed-by: Sam James <sam@gentoo.org>
2024-12-12 16:17:54 +08:00
Joseph Myers
ffe79c446c Implement C23 atanpi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the atanpi functions (atan(x)/pi).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-11 21:51:49 +00:00
Joseph Myers
f962932206 Implement C23 asinpi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the asinpi functions (asin(x)/pi).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-10 20:42:20 +00:00
Joseph Myers
28d102d15c Implement C23 acospi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the acospi functions (acos(x)/pi).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-09 23:01:29 +00:00
Joseph Myers
f9e90e4b4c Implement C23 tanpi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the tanpi functions (tan(pi*x)).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-05 21:42:10 +00:00
Joseph Myers
776938e8b8 Implement C23 sinpi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the sinpi functions (sin(pi*x)).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-04 20:04:04 +00:00
Joseph Myers
0ae0af68d8 Implement C23 cospi
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the cospi functions (cos(pi*x)).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-12-04 10:20:44 +00:00
Joe Ramsay
751a5502be AArch64: Add vector logp1 alias for log1p
This enables vectorisation of C23 logp1, which is an alias for log1p.
There are no new tests or ulp entries because the new symbols are simply
aliases.

Reviewed-by: Wilco Dijkstra  <Wilco.Dijkstra@arm.com>
2024-09-19 17:53:34 +01:00
Joseph Myers
7ec903e028 Implement C23 exp2m1, exp10m1
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the exp2m1 and exp10m1 functions (exp2(x)-1 and
exp10(x)-1, like expm1).

As with other such functions, these use type-generic templates that
could be replaced with faster and more accurate type-specific
implementations in future.  Test inputs are copied from those for
expm1, plus some additions close to the overflow threshold (copied
from exp2 and exp10) and also some near the underflow threshold.

exp2m1 has the unusual property of having an input (M_MAX_EXP) where
whether the function overflows (under IEEE semantics) depends on the
rounding mode.  Although these could reasonably be XFAILed in the
testsuite (as we do in some cases for arguments very close to a
function's overflow threshold when an error of a few ulps in the
implementation can result in the implementation not agreeing with an
ideal one on whether overflow takes place - the testsuite isn't smart
enough to handle this automatically), since these functions aren't
required to be correctly rounding, I made the implementation check for
and handle this case specially.

The Makefile ordering expected by lint-makefiles for the new functions
is a bit peculiar, but I implemented it in this patch so that the test
passes; I don't know why log2 also needed moving in one Makefile
variable setting when it didn't in my previous patches, but the
failure showed a different place was expected for that function as
well.

The powerpc64le IFUNC setup seems not to be as self-contained as one
might hope; it shouldn't be necessary to add IFUNCs for new functions
such as these simply to get them building, but without setting up
IFUNCs for the new functions, there were undefined references to
__GI___expm1f128 (that IFUNC machinery results in no such function
being defined, but doesn't stop include/math.h from doing the
redirection resulting in the exp2m1f128 and exp10m1f128
implementations expecting to call it).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-06-17 16:31:49 +00:00
Joseph Myers
55eb99e9a9 Implement C23 log10p1
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the log10p1 functions (log10(1+x): like log1p, but for
base-10 logarithms).

This is directly analogous to the log2p1 implementation (except that
whereas log2p1 has a smaller underflow range than log1p, log10p1 has a
larger underflow range).  The test inputs are copied from those for
log1p and log2p1, plus a few more inputs in that wider underflow
range.

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-06-17 13:48:13 +00:00
Joseph Myers
bb014f50c4 Implement C23 logp1
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the logp1 functions (aliases for log1p functions - the
name is intended to be more consistent with the new log2p1 and
log10p1, where clearly it would have been very confusing to name those
functions log21p and log101p).  As aliases rather than new functions,
the content of this patch is somewhat different from those actually
adding new functions.

Tests are shared with log1p, so this patch *does* mechanically update
all affected libm-test-ulps files to expect the same errors for both
functions.

The vector versions of log1p on aarch64 and x86_64 are *not* updated
to have logp1 aliases (and thus there are no corresponding header,
tests, abilist or ulps changes for vector functions either).  It would
be reasonable for such vector aliases and corresponding changes to
other files to be made separately.  For now, the log1p tests instead
avoid testing logp1 in the vector case (a Makefile change is needed to
avoid problems with grep, used in generating the .c files for vector
function tests, matching more than one ALL_RM_TEST line in a file
testing multiple functions with the same inputs, when it assumes that
the .inc file only has a single such line).

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-06-17 13:47:09 +00:00
Joseph Myers
79c52daf47 Implement C23 log2p1
C23 adds various <math.h> function families originally defined in TS
18661-4.  Add the log2p1 functions (log2(1+x): like log1p, but for
base-2 logarithms).

This illustrates the intended structure of implementations of all
these function families: define them initially with a type-generic
template implementation.  If someone wishes to add type-specific
implementations, it is likely such implementations can be both faster
and more accurate than the type-generic one and can then override it
for types for which they are implemented (adding benchmarks would be
desirable in such cases to demonstrate that a new implementation is
indeed faster).

The test inputs are copied from those for log1p.  Note that these
changes make gen-auto-libm-tests depend on MPFR 4.2 (or later).

The bulk of the changes are fairly generic for any such new function.
(sysdeps/powerpc/nofpu/Makefile only needs changing for those
type-generic templates that use fabs.)

Tested for x86_64 and x86, and with build-many-glibcs.py.
2024-05-20 13:41:39 +00:00
Joseph Myers
42cc619dfb Refer to C23 in place of C2X in glibc
WG14 decided to use the name C23 as the informal name of the next
revision of the C standard (notwithstanding the publication date in
2024).  Update references to C2X in glibc to use the C23 name.

This is intended to update everything *except* where it involves
renaming files (the changes involving renaming tests are intended to
be done separately).  In the case of the _ISOC2X_SOURCE feature test
macro - the only user-visible interface involved - support for that
macro is kept for backwards compatibility, while adding
_ISOC23_SOURCE.

Tested for x86_64.
2024-02-01 11:02:01 +00:00
Paul Eggert
dff8da6b3e Update copyright dates with scripts/update-copyrights 2024-01-01 10:53:40 -08:00
Paul Pluzhnikov
7f0d9e61f4 Fix all the remaining misspellings -- BZ 25337 2023-06-02 01:39:48 +00:00
Joseph Myers
6d7e8eda9b Update copyright dates with scripts/update-copyrights 2023-01-06 21:14:39 +00:00
Paul Eggert
581c785bf3 Update copyright dates with scripts/update-copyrights
I used these shell commands:

../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright
(cd ../glibc && git commit -am"[this commit message]")

and then ignored the output, which consisted lines saying "FOO: warning:
copyright statement not found" for each of 7061 files FOO.

I then removed trailing white space from math/tgmath.h,
support/tst-support-open-dev-null-range.c, and
sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following
obscure pre-commit check failure diagnostics from Savannah.  I don't
know why I run into these diagnostics whereas others evidently do not.

remote: *** 912-#endif
remote: *** 913:
remote: *** 914-
remote: *** error: lines with trailing whitespace found
...
remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
2022-01-01 11:40:24 -08:00
Sunil K Pandey
c21c7bc24e x86-64: Add vector tan/tanf implementation to libmvec
Implement vectorized tan/tanf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector tan/tanf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-30 10:19:13 -08:00
Sunil K Pandey
8881cca8fb x86-64: Add vector erfc/erfcf implementation to libmvec
Implement vectorized erfc/erfcf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector erfc/erfcf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-30 10:19:03 -08:00
Sunil K Pandey
e682d01578 x86-64: Add vector asinh/asinhf implementation to libmvec
Implement vectorized asinh/asinhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector asinh/asinhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:56 -08:00
Sunil K Pandey
c0f36fc303 x86-64: Add vector tanh/tanhf implementation to libmvec
Implement vectorized tanh/tanhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector tanh/tanhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:50 -08:00
Sunil K Pandey
f9ce13fdac x86-64: Add vector erf/erff implementation to libmvec
Implement vectorized erf/erff containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector erf/erff with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:44 -08:00
Sunil K Pandey
0625489ccc x86-64: Add vector acosh/acoshf implementation to libmvec
Implement vectorized acosh/acoshf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector acosh/acoshf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:39 -08:00
Sunil K Pandey
6dea4dd3da x86-64: Add vector atanh/atanhf implementation to libmvec
Implement vectorized atanh/atanhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector atanh/atanhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:34 -08:00
Sunil K Pandey
74265c16ab x86-64: Add vector log1p/log1pf implementation to libmvec
Implement vectorized log1p/log1pf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector log1p/log1pf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:27 -08:00
Sunil K Pandey
7e1722fec8 x86-64: Add vector log2/log2f implementation to libmvec
Implement vectorized log2/log2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector log2/log2f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:21 -08:00
Sunil K Pandey
8f8566026d x86-64: Add vector log10/log10f implementation to libmvec
Implement vectorized log10/log10f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector log10/log10f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:15 -08:00
Sunil K Pandey
2941a24f8c x86-64: Add vector atan2/atan2f implementation to libmvec
Implement vectorized atan2/atan2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector atan2/atan2f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:09 -08:00
Sunil K Pandey
2bf02c5843 x86-64: Add vector cbrt/cbrtf implementation to libmvec
Implement vectorized cbrt/cbrtf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector cbrt/cbrtf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:38:02 -08:00
Sunil K Pandey
aa1809a1df x86-64: Add vector sinh/sinhf implementation to libmvec
Implement vectorized sinh/sinhf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector sinh/sinhf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:37:55 -08:00
Sunil K Pandey
76ddc74e86 x86-64: Add vector expm1/expm1f implementation to libmvec
Implement vectorized expm1/expm1f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector expm1/expm1f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:37:49 -08:00
Sunil K Pandey
ef7ea9c132 x86-64: Add vector cosh/coshf implementation to libmvec
Implement vectorized cosh/coshf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector cosh/coshf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:37:42 -08:00
Sunil K Pandey
8b726453d5 x86-64: Add vector exp10/exp10f implementation to libmvec
Implement vectorized exp10/exp10f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector exp10/exp10f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:37:35 -08:00
Sunil K Pandey
3fc9ccc20b x86-64: Add vector exp2/exp2f implementation to libmvec
Implement vectorized exp2/exp2f containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector exp2/exp2f with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:37:29 -08:00
Sunil K Pandey
37475ba883 x86-64: Add vector hypot/hypotf implementation to libmvec
Implement vectorized hypot/hypotf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector hypot/hypotf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:37:21 -08:00
Sunil K Pandey
11c01de14c x86-64: Add vector asin/asinf implementation to libmvec
Implement vectorized asin/asinf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector asin/asinf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:37:03 -08:00
Sunil K Pandey
146310177a x86-64: Add vector atan/atanf implementation to libmvec
Implement vectorized atan/atanf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector atan/atanf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-29 11:36:46 -08:00
Sunil K Pandey
f20f980c71 x86-64: Add vector acos/acosf implementation to libmvec
Implement vectorized acos/acosf containing SSE, AVX, AVX2 and
AVX512 versions for libmvec as per vector ABI.  It also contains
accuracy and ABI tests for vector acos/acosf with regenerated ulps.

Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-12-22 13:03:14 -08:00
Joseph Myers
9bd9978639 Do not declare fmax, fmin _FloatN, _FloatNx versions for C2X
At the last WG14 meeting,
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2711.htm> was
accepted, which places more emphasis on the new fmaximum / fminimum
functions and less on the old fmax / fmin functions.  Some of the
changes are to examples, notes or otherwise don't require
implementation changes.  However, the changes include removing the
_FloatN / _FloatNx versions of the fmax and fmin functions that came
from TS 18661-3.

Thus, those function versions should only be declared under similar
conditions to the _FloatN / _FloatNx versions of fmaxmag and fminmag:
for _GNU_SOURCE and pre-C2X use of __STDC_WANT_IEC_60559_TYPES_EXT__,
but not for C2X without _GNU_SOURCE.

In turn this requires a tgmath.h change so that the corresponding
tgmath.h macros, for C2X with __STDC_WANT_IEC_60559_TYPES_EXT__ but
without _GNU_SOURCE, don't try to use function variants that aren't
declared.  (That issue doesn't arise for the tgmath.h macros for
fmaxmag and fminmag, because those aren't defined at all in those
circumstances unless __STDC_WANT_IEC_60559_BFP_EXT__ (from TS 18661-1
and not specified at all by C2X) is also defined, and in that case the
_FloatN / _FloatNx versions of fmaxmag and fminmag get declared - this
is only ever an issue when it's possible for some functions
corresponding to a type-generic-macro to be declared, and for _FloatN
/ _FloatNx functions in general to be declared, but without the
_FloatN / _FloatNx functions corresponding to that particular macro
being declared.)

Tested for x86_64.
2021-09-29 18:20:32 +00:00
Joseph Myers
90f0ac10a7 Add fmaximum, fminimum functions
C2X adds new <math.h> functions for floating-point maximum and
minimum, corresponding to the new operations that were added in IEEE
754-2019 because of concerns about the old operations not being
associative in the presence of signaling NaNs.  fmaximum and fminimum
handle NaNs like most <math.h> functions (any NaN argument means the
result is a quiet NaN).  fmaximum_num and fminimum_num handle both
quiet and signaling NaNs the way fmax and fmin handle quiet NaNs (if
one argument is a number and the other is a NaN, return the number),
but still raise "invalid" for a signaling NaN argument, making them
exceptions to the normal rule that a function with a floating-point
result raising "invalid" also returns a quiet NaN.  fmaximum_mag,
fminimum_mag, fmaximum_mag_num and fminimum_mag_num are corresponding
functions returning the argument with greatest or least absolute
value.  All these functions also treat +0 as greater than -0.  There
are also corresponding <tgmath.h> type-generic macros.

Add these functions to glibc.  The implementations use type-generic
templates based on those for fmax, fmin, fmaxmag and fminmag, and test
inputs are based on those for those functions with appropriate
adjustments to the expected results.  The RISC-V maintainers might
wish to add optimized versions of fmaximum_num and fminimum_num (for
float and double), since RISC-V (F extension version 2.2 and later)
provides instructions corresponding to those functions - though it
might be at least as useful to add architecture-independent built-in
functions to GCC and teach the RISC-V back end to expand those
functions inline, which is what you generally want for functions that
can be implemented with a single instruction.

Tested for x86_64 and x86, and with build-many-glibcs.py.
2021-09-28 23:31:35 +00:00
Joseph Myers
b3f27d8150 Add narrowing fma functions
This patch adds the narrowing fused multiply-add functions from TS
18661-1 / TS 18661-3 / C2X to glibc's libm: ffma, ffmal, dfmal,
f32fmaf64, f32fmaf32x, f32xfmaf64 for all configurations; f32fmaf64x,
f32fmaf128, f64fmaf64x, f64fmaf128, f32xfmaf64x, f32xfmaf128,
f64xfmaf128 for configurations with _Float64x and _Float128;
__f32fmaieee128 and __f64fmaieee128 aliases in the powerpc64le case
(for calls to ffmal and dfmal when long double is IEEE binary128).
Corresponding tgmath.h macro support is also added.

The changes are mostly similar to those for the other narrowing
functions previously added, especially that for sqrt, so the
description of those generally applies to this patch as well.  As with
sqrt, I reused the same test inputs in auto-libm-test-in as for
non-narrowing fma rather than adding extra or separate inputs for
narrowing fma.  The tests in libm-test-narrow-fma.inc also follow
those for non-narrowing fma.

The non-narrowing fma has a known bug (bug 6801) that it does not set
errno on errors (overflow, underflow, Inf * 0, Inf - Inf).  Rather
than fixing this or having narrowing fma check for errors when
non-narrowing does not (complicating the cases when narrowing fma can
otherwise be an alias for a non-narrowing function), this patch does
not attempt to check for errors from narrowing fma and set errno; the
CHECK_NARROW_FMA macro is still present, but as a placeholder that
does nothing, and this missing errno setting is considered to be
covered by the existing bug rather than needing a separate open bug.
missing-errno annotations are duly added to many of the
auto-libm-test-in test inputs for fma.

This completes adding all the new functions from TS 18661-1 to glibc,
so will be followed by corresponding stdc-predef.h changes to define
__STDC_IEC_60559_BFP__ and __STDC_IEC_60559_COMPLEX__, as the support
for TS 18661-1 will be at a similar level to that for C standard
floating-point facilities up to C11 (pragmas not implemented, but
library functions done).  (There are still further changes to be done
to implement changes to the types of fromfp functions from N2548.)

Tested as followed: natively with the full glibc testsuite for x86_64
(GCC 11, 7, 6) and x86 (GCC 11); with build-many-glibcs.py with GCC
11, 7 and 6; cross testing of math/ tests for powerpc64le, powerpc32
hard float, mips64 (all three ABIs, both hard and soft float).  The
different GCC versions are to cover the different cases in tgmath.h
and tgmath.h tests properly (GCC 6 has _Float* only as typedefs in
glibc headers, GCC 7 has proper _Float* support, GCC 8 adds
__builtin_tgmath).
2021-09-22 21:25:31 +00:00