glibc/sysdeps/ieee754/dbl-64
Szabolcs Nagy 424c4f60ed Add new pow implementation
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent).  The __exp1 internal symbol
is no longer necessary.

There is a separate code path when fma is not available, but the
worst-case error is about 0.54 ULP in both cases.  The lookup table and
constants for log take 4168 bytes.  The .rodata+.text size decreases by
37908 bytes on aarch64.  In non-nearest rounding modes the error is less
than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
pow throughput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.

	* NEWS: Mention pow improvements.
	* math/Makefile (type-double-routines): Add e_pow_log_data.
	* sysdeps/generic/math_private.h (__exp1): Remove.
	* sysdeps/i386/fpu/e_pow_log_data.c: New file.
	* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
	contraction.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
	(exp_inline): Remove.
	(__ieee754_exp): Handle only a single double input.
	* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
	(__pow_log_data): Define.
	* sysdeps/ieee754/dbl-64/upow.h: Remove.
	* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
	contraction.
	(CFLAGS-e_pow-fma4.c): Likewise.
2018-09-19 10:04:51 +01:00
wordsize-64 Use ceil functions not __ceil functions in glibc libm. 2018-09-17 20:42:06 +00:00
asincos.tbl Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
atnat2.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
atnat.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
branred.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
branred.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dbl2mpn.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dla.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
doasin.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
doasin.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dosincos.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dosincos.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
e_acos.c
e_acosh.c Rename all __ieee754_sqrt(f/l) calls to sqrt(f/l) 2018-03-15 19:21:36 +00:00
e_asin.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
e_atan2.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
e_atanh.c Do not include math-barriers.h in math_private.h. 2018-05-11 15:11:38 +00:00
e_cosh.c Move math_narrow_eval to separate math-narrow-eval.h. 2018-05-09 00:15:10 +00:00
e_exp2.c Add new exp and exp2 implementations 2018-09-05 16:22:00 +01:00
e_exp10.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
e_exp_data.c Add new exp and exp2 implementations 2018-09-05 16:22:00 +01:00
e_exp.c Add new pow implementation 2018-09-19 10:04:51 +01:00
e_fmod.c
e_gamma_r.c Use ceil functions not __ceil functions in glibc libm. 2018-09-17 20:42:06 +00:00
e_hypot.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
e_ilogb.c
e_j0.c Do not include math-barriers.h in math_private.h. 2018-05-11 15:11:38 +00:00
e_j1.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
e_jn.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
e_lgamma_r.c Use floor functions not __floor functions in glibc libm. 2018-09-14 13:09:01 +00:00
e_log2_data.c Add new log2 implementation 2018-09-12 17:36:33 +01:00
e_log2.c Add new log2 implementation 2018-09-12 17:36:33 +01:00
e_log10.c
e_log_data.c Add new log implementation 2018-09-12 17:33:30 +01:00
e_log.c Add new log implementation 2018-09-12 17:33:30 +01:00
e_pow_log_data.c Add new pow implementation 2018-09-19 10:04:51 +01:00
e_pow.c Add new pow implementation 2018-09-19 10:04:51 +01:00
e_remainder.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
e_sinh.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
e_sqrt.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
gamma_product.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
gamma_productf.c Move math_narrow_eval to separate math-narrow-eval.h. 2018-05-09 00:15:10 +00:00
k_rem_pio2.c Use floor functions not __floor functions in glibc libm. 2018-09-14 13:09:01 +00:00
k_tan.c
lgamma_neg.c Use floor functions not __floor functions in glibc libm. 2018-09-14 13:09:01 +00:00
lgamma_product.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Makefile Add new pow implementation 2018-09-19 10:04:51 +01:00
math_config.h Add new pow implementation 2018-09-19 10:04:51 +01:00
math_err.c Add new exp and exp2 implementations 2018-09-05 16:22:00 +01:00
MathLib.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpa-arch.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpa.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpa.h Remove mplog and mpexp 2018-02-15 12:41:05 +00:00
mpatan2.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpatan.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpatan.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpn2dbl.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpsqrt.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mpsqrt.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mptan.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mydefs.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
powtwo.tbl Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
root.tbl Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_asinh.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
s_atan.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s_cbrt.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_ceil.c Use ceil functions not __ceil functions in glibc libm. 2018-09-17 20:42:06 +00:00
s_copysign.c
s_cos.c
s_erf.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
s_expm1.c Do not include math-barriers.h in math_private.h. 2018-05-11 15:11:38 +00:00
s_f32xaddf64.c Add narrowing add functions. 2018-02-10 02:08:43 +00:00
s_f32xdivf64.c Add narrowing divide functions. 2018-05-17 00:40:52 +00:00
s_f32xmulf64.c Add narrowing multiply functions. 2018-05-16 00:05:28 +00:00
s_f32xsubf64.c Add narrowing subtract functions. 2018-03-20 00:34:52 +00:00
s_fabs.c
s_fadd.c Add narrowing add functions. 2018-02-10 02:08:43 +00:00
s_fdiv.c Add narrowing divide functions. 2018-05-17 00:40:52 +00:00
s_finite.c Move LDBL_CLASSIFY_COMPAT to its own header. 2018-02-01 21:01:00 +00:00
s_floor.c Use floor functions not __floor functions in glibc libm. 2018-09-14 13:09:01 +00:00
s_fma.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s_fmaf.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s_fmul.c Add narrowing multiply functions. 2018-05-16 00:05:28 +00:00
s_fpclassify.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_frexp.c
s_fromfp_main.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_fromfp.c
s_fromfpx.c
s_fsub.c Add narrowing subtract functions. 2018-03-20 00:34:52 +00:00
s_getpayload.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_isinf.c Move LDBL_CLASSIFY_COMPAT to its own header. 2018-02-01 21:01:00 +00:00
s_isnan.c Move LDBL_CLASSIFY_COMPAT to its own header. 2018-02-01 21:01:00 +00:00
s_issignaling.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_llrint.c Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h. 2018-09-04 19:52:06 +00:00
s_llround.c Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h. 2018-09-04 19:52:06 +00:00
s_log1p.c Do not include math-barriers.h in math_private.h. 2018-05-11 15:11:38 +00:00
s_logb.c
s_lrint.c Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h. 2018-09-04 19:52:06 +00:00
s_lround.c Move fenv.h soft-float inlines from fenv_private.h to include/fenv.h. 2018-09-04 19:52:06 +00:00
s_modf.c
s_nearbyint.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s_nexttoward.c
s_nextup.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
s_remquo.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_rint.c Use rint functions not __rint functions in glibc libm. 2018-09-14 13:10:39 +00:00
s_round.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_roundeven.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_scalbln.c
s_scalbn.c
s_setpayload_main.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_setpayload.c
s_setpayloadsig.c
s_signbit.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_sin.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s_sincos.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s_tan.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
s_tanh.c Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
s_totalorder.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_totalordermag.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_trunc.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
s_ufromfp.c
s_ufromfpx.c
sincos32.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sincos32.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sincostab.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
uasncs.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
uatan.tbl Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
urem.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
usncs.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
utan.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
utan.tbl Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
x2y2m1.c Do not include fenv_private.h in math_private.h. 2018-09-03 21:09:04 +00:00
x2y2m1f.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00