glibc/sysdeps/x86_64
Szabolcs Nagy 424c4f60ed Add new pow implementation
The algorithm is exp(y * log(x)), where log(x) is computed with about
1.3*2^-68 relative error (1.5*2^-68 without fma), returning the result
in two doubles, and the exp part uses the same algorithm (and lookup
tables) as exp, but takes the input as two doubles and a sign (to handle
negative bases with odd integer exponent).  The __exp1 internal symbol
is no longer necessary.

There is separate code path when fma is not available but the worst case
error is about 0.54 ULP in both cases.  The lookup table and consts for
log are 4168 bytes.  The .rodata+.text is decreased by 37908 bytes on
aarch64.  The non-nearest rounding error is less than 1 ULP.

Improvements on Cortex-A72 compared to current glibc master:
pow thruput: 2.40x in [0.01 11.1]x[0.01 11.1]
pow latency: 1.84x in [0.01 11.1]x[0.01 11.1]

Tested on
aarch64-linux-gnu (defined __FP_FAST_FMA, TOINT_INTRINSICS) and
arm-linux-gnueabihf (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
x86_64-linux-gnu (!defined __FP_FAST_FMA, !TOINT_INTRINSICS) and
powerpc64le-linux-gnu (defined __FP_FAST_FMA, !TOINT_INTRINSICS) targets.

	* NEWS: Mention pow improvements.
	* math/Makefile (type-double-routines): Add e_pow_log_data.
	* sysdeps/generic/math_private.h (__exp1): Remove.
	* sysdeps/i386/fpu/e_pow_log_data.c: New file.
	* sysdeps/ia64/fpu/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/Makefile (CFLAGS-e_pow.c): Allow fma
	contraction.
	* sysdeps/ieee754/dbl-64/e_exp.c (__exp1): Remove.
	(exp_inline): Remove.
	(__ieee754_exp): Only single double input is handled.
	* sysdeps/ieee754/dbl-64/e_pow.c: Rewrite.
	* sysdeps/ieee754/dbl-64/e_pow_log_data.c: New file.
	* sysdeps/ieee754/dbl-64/math_config.h (issignaling_inline): Define.
	(__pow_log_data): Define.
	* sysdeps/ieee754/dbl-64/upow.h: Remove.
	* sysdeps/ieee754/dbl-64/upow.tbl: Remove.
	* sysdeps/m68k/m680x0/fpu/e_pow_log_data.c: New file.
	* sysdeps/x86_64/fpu/multiarch/Makefile (CFLAGS-e_pow-fma.c): Allow fma
	contraction.
	(CFLAGS-e_pow-fma4.c): Likewise.
2018-09-19 10:04:51 +01:00
..
64
fpu Add new pow implementation 2018-09-19 10:04:51 +01:00
multiarch x86: Don't include <init-arch.h> in assembly codes 2018-08-03 08:05:00 -07:00
nptl x86: Rename __glibc_reserved2 to ssp_base in tcbhead_t 2018-07-25 04:39:39 -07:00
x32 Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
____longjmp_chk.S
__longjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
_mcount.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
abort-instr.h
add_n.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
addmul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
atomic-machine.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
bsd-_setjmp.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
bsd-setjmp.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
bzero.S
configure Add --enable-static-pie configure option to build static PIE [BZ #19574] 2017-12-15 17:12:14 -08:00
configure.ac Add --enable-static-pie configure option to build static PIE [BZ #19574] 2017-12-15 17:12:14 -08:00
crti.S x86: Add _CET_ENDBR to functions in crti.S 2018-07-17 16:05:18 -07:00
crtn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-irel.h [BZ #20271] Add newlines in __libc_fatal calls. 2018-08-31 18:04:32 -07:00
dl-lookupcfg.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-machine.h elf: Unify symbol address run-time calculation [BZ #19818] 2018-04-04 23:09:37 +01:00
dl-procinfo.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-runtime.c
dl-tls.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tls.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tlsdesc.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
dl-tlsdesc.S x86: Add _CET_ENDBR to functions in dl-tlsdesc.S 2018-07-17 16:07:17 -07:00
dl-trampoline.h x86: Support IBT and SHSTK in Intel CET [BZ #21598] 2018-07-16 14:08:27 -07:00
dl-trampoline.S x86: Move STATE_SAVE_OFFSET/STATE_SAVE_MASK to sysdep.h 2018-08-06 06:25:43 -07:00
ffs.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
ffsll.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
hp-timing.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
htonl.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
ifuncmain8.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
ifuncmod8.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Implies Add float128 support for x86_64, x86. 2017-06-26 22:02:24 +00:00
jmpbuf-offsets.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
jmpbuf-unwind.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
l10nflist.c
ldbl2mpn.c
link-defines.sym
locale-defines.sym
localplt.data ld.so: Introduce struct dl_exception 2017-08-10 16:54:57 +02:00
lshift.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
machine-gmon.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Makefile Rename the glibc.tune namespace to glibc.cpu 2018-08-02 23:49:19 +05:30
memchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcmp.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcopy.h X86-64: Add dummy memcopy.h and wordcopy.c 2016-06-09 04:38:34 -07:00
memcpy_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memcpy.S
memmove_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memmove.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mempcpy_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mempcpy.S
memrchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memset_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memset.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
memusage.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
mp_clz_tab.c
mul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
preconfigure
preconfigure.ac
rawmemchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
rshift.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
rtld-offsets.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
sched_cpucount.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
setjmp.S x86: Support shadow stack pointer in setjmp/longjmp 2018-07-14 05:59:53 -07:00
stack-aliasing.h
stackguard-macros.h
stackinfo.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
start.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
stpcpy.S
strcasecmp_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strcasecmp_l.S
strcasecmp.S
strcat.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strchrnul.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strcmp.S x86_64: Use _CET_NOTRACK in strcmp.S 2018-07-18 06:31:53 -07:00
strcpy.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strcspn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strlen.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strncase_l-nonascii.c Use locale_t, not __locale_t, throughout glibc 2017-06-20 20:30:06 -04:00
strncase_l.S
strncase.S
strncmp.S
strnlen.S
strpbrk.S x86-64: Implement strcspn/strpbrk/strspn IFUNC selectors in C 2017-06-15 08:59:05 -07:00
strrchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
strspn.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sub_n.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
submul_1.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
sysdep.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tls_get_addr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tls-macros.h
tlsdesc.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tlsdesc.sym x86-64: Align the stack in __tls_get_addr [BZ #21609] 2017-07-06 04:43:20 -07:00
tst-audit3.c
tst-audit4-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit4.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit5.c
tst-audit6.c
tst-audit7.c
tst-audit10-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit10.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-audit.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-auditmod3a.c
tst-auditmod3b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod4a.c
tst-auditmod4b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod5a.c
tst-auditmod5b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6a.c
tst-auditmod6b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod6c.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod7a.c
tst-auditmod7b.c Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
tst-auditmod10a.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-auditmod10b.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx512-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx512.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx512mod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-avx-aux.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avx.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-avxmod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-mallocalign1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-platform-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-platformmod-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-platformmod-2.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-quad1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-quad1pie.c
tst-quad2.c
tst-quad2pie.c
tst-quadmod1.S x86-64: Add endbr64 to tst-quadmod[12].S 2018-07-24 05:12:14 -07:00
tst-quadmod1pie.S
tst-quadmod2.S x86-64: Add endbr64 to tst-quadmod[12].S 2018-07-24 05:12:14 -07:00
tst-quadmod2pie.S
tst-split-dynreloc.c
tst-split-dynreloc.lds
tst-sse.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-ssemod.c x86-64: Verify that _dl_runtime_resolve preserves vector registers 2017-02-09 12:19:58 -08:00
tst-stack-align.h Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-x86_64-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
tst-x86_64mod-1.c Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
Versions Move __fentry__ version definition to sysdeps/{i386,x86_64} 2018-08-10 09:07:44 +02:00
wcschr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wcscmp.S x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2 2018-06-01 16:32:43 -05:00
wcslen.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wcsrchr.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wmemset_chk.S Update copyright dates with scripts/update-copyrights. 2018-01-01 00:32:25 +00:00
wmemset.S x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wordcopy.c X86-64: Add dummy memcopy.h and wordcopy.c 2016-06-09 04:38:34 -07:00