glibc/sysdeps/ieee754/flt-32/e_expf.c

115 lines
3.2 KiB
C

Optimized generic expf and exp2f with wrappers

Based on new expf and exp2f code from
https://github.com/ARM-software/optimized-routines/

with wrapper on aarch64:
  expf reciprocal-throughput: 2.3x faster
  expf latency: 1.7x faster
without wrapper on aarch64:
  expf reciprocal-throughput: 3.3x faster
  expf latency: 1.7x faster
without wrapper on aarch64:
  exp2f reciprocal-throughput: 2.8x faster
  exp2f latency: 1.3x faster
libm.so size on aarch64:
  .text size: -152 bytes
  .rodata size: -1740 bytes
expf/exp2f worst case nearest rounding error: 0.502 ulp
worst case non-nearest rounding error: 1 ulp

Error checks are inline and errno setting is in separate tail called
functions, but the wrappers are kept in this patch to handle the
_LIB_VERSION==_SVID_ case. (So e.g. errno is set twice for expf calls
and once for __expf_finite calls on targets where the new code is used.)

Double precision arithmetics is used which is expected to be faster on
most targets (including soft-float) than using single precision and it
is easier to get good precision result with it.

Const data is kept in a separate translation unit which complicates
maintenance a bit, but is expected to give good code for literal loads
on most targets and allows sharing data across expf, exp2f and powf.
(This data is disabled on i386, m68k and ia64 which have their own
expf, exp2f and powf code.)

Some details may need target specific tweaks:
- best convert and round to int operation in the arg reduction may be
  different across targets.
- code was optimized on fma target, optimal polynomial eval may be
  different without fma.
- gcc does not always generate good code for fp bit representation
  access via unions or it may be inherently slow on some targets.

The libm-test-ulps will need adjustment because..
- The argument reduction ideally uses nearest rounded rint, but that is
  not efficient on most targets, so the polynomial can get evaluated on
  a wider interval in non-nearest rounding mode making 1 ulp errors
  common in that case.
- The polynomial is evaluated such that it may have 1 ulp error on
  negative tiny inputs with upward rounding.

	* math/Makefile (type-float-routines): Add math_errf and
	e_exp2f_data.
	* sysdeps/aarch64/fpu/math_private.h (TOINT_INTRINSICS): Define.
	(roundtoint, converttoint): Likewise.
	* sysdeps/ieee754/flt-32/e_expf.c: New implementation.
	* sysdeps/ieee754/flt-32/e_exp2f.c: New implementation.
	* sysdeps/ieee754/flt-32/e_exp2f_data.c: New file.
	* sysdeps/ieee754/flt-32/math_config.h: New file.
	* sysdeps/ieee754/flt-32/math_errf.c: New file.
	* sysdeps/ieee754/flt-32/t_exp2f.h: Remove.
	* sysdeps/i386/fpu/e_exp2f_data.c: New file.
	* sysdeps/i386/fpu/math_errf.c: New file.
	* sysdeps/ia64/fpu/e_exp2f_data.c: New file.
	* sysdeps/ia64/fpu/math_errf.c: New file.
	* sysdeps/m68k/m680x0/fpu/e_exp2f_data.c: New file.
	* sysdeps/m68k/m680x0/fpu/math_errf.c: New file.
2017-09-07 00:42:00 +08:00
/* Single-precision e^x function.
Copyright (C) 2017-2021 Free Software Foundation, Inc.
Update. 1998-02-13 17:39 Ulrich Drepper <drepper@cygnus.com> * elf/Makefile: Don't use --version-script parameter to link ld.so unconditionally. 1998-01-02 04:19 Geoff Keating <geoffk@ozemail.com.au> * math/Makefile: Add t_exp. * math/libm-test.c: Tighten accuracy bounds for exp(), correct constants. * math/test-reduce.c: Remove temporarily, it seems to be broken. * sysdeps/libm-ieee754/e_exp.c: Use accurate table method. * sysdeps/libm-ieee754/e_expf.c: Use table & double precision for better accuracy. * sysdeps/libm-ieee754/s_exp2.c: Use better polynomial; correct algorithm for very large/very small arguments. * sysdeps/libm-ieee754/s_exp2f.c: Use slightly better polynomial; correct algorithm for very large/very small arguments; adjust for new table. * sysdeps/libm-ieee754/t_exp.c: New file. * sysdeps/libm-ieee754/t_exp2f.h: Use table with smaller deltas. * sysdeps/unix/sysv/linux/powerpc/dl-sysdep.c: Put 'strange test' back, with comment that explains what breaks when you remove it :-(. * localedata/xfrm-test.c: Avoid integer overflow. * stdlib/strfmon.c: char is unsigned, sometimes. *sysdeps/powerpc * sysdeps/powerpc/Makefile: Remove quad float support. * sysdeps/powerpc/q_*.c: Remove, they will become an add-on. * sysdeps/powerpc/quad_float.h: Likewise. * sysdeps/powerpc/test-arith.c: Likewise. * sysdeps/powerpc/test-arithf.c: Likewise. * sysdeps/generic/s_exp2.c: Remove, we have this implemented now. * sysdeps/generic/s_exp2f.c: Likewise. * sysdeps/powerpc/bits/mathinline.h: Use underscores around __asm__, don't try anything if _SOFT_FLOAT. 1997-12-31 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * locale/C-ctype.c (_nl_C_LC_CTYPE_class32): Undo last change. * locale/programs/ld-ctype.c (CHAR_CLASS32_TRANS): Likewise. * wctype/wctype.c: Likewise. * wctype/wctype.h (_ISwxxx): Renamed from _ISxxx, all uses changed. They are incompatible with the _ISxxx values from <ctype.h> on little endian machines. 
(_ISwbit) [__BYTE_ORDER == __LITTLE_ENDIAN]: Correctly transform bit number. This fixes the real bug and restores the integrity of the ctype locale file. * wctype/wcfuncs.c: Change all _ISxxx to _ISwxxx. * wctype/wcfuncs_l.c: Likewise. * wctype/wcextra.c: Likewise. * wctype/wctype_l.c [__BYTE_ORDER == __LITTLE_ENDIAN]: Use correct byte swapping. 1998-02-09 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.S (errno): Put it into .bss segment instead of .common, so that aliases on it work. * sysdeps/unix/sysv/linux/i386/sysdep.S (errno): Add .type and .size directives, put into .bss segment instead of initializing it to 4. 1998-02-12 08:00 H.J. Lu <hjl@gnu.org> * libc.map (gnu_get_libc_release, gnu_get_libc_version): Added. * version.c (__gnu_get_libc_release, __gnu_get_libc_version): New functions. Make names without __ weak aliases. (__libc_release, __libc_version): Make them static. * include/gnu/libc-version.h: New file. * Makefile (headers): Add gnu/libc-version.h. 1998-02-13 Ulrich Drepper <drepper@cygnus.com> * stdlib/stdlib.h (struct drand48_data): Leave X to user macros and use x for member name. Reported by Daniel Lyddy <daniell@cs.berkeley.edu>. * stdlib/drand48.c: Change according to member name change. * stdlib/drand48_r.c: Likewise. * stdlib/lcong48_r.c: Likewise. * stdlib/lrand48.c: Likewise. * stdlib/lrand48_r.c: Likewise. * stdlib/mrand48.c: Likewise. * stdlib/mrand48_r.c: Likewise. * stdlib/seed48.c: Likewise. * stdlib/seed48_r.c: Likewise. * stdlib/srand48_r.c: Likewise. 1998-02-11 Andreas Jaeger <aj@arthur.rhein-neckar.de> * nss/test-netdb.c: Add some more test cases. 1998-02-13 11:39 Ulrich Drepper <drepper@cygnus.com> * libio/iovsscanf.c: Undo last change modifying errno. 1998-02-12 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * stdio-common/vfscanf.c: Never try to read another character after EOF. Don't decrement read_in after EOF, it wasn't incremented in the first place. 
(NEXT_WIDE_CHAR): Set First, not first. 1998-02-06 07:48 H.J. Lu <hjl@gnu.org> * db/Makefile ($(inst_libdir)/libndbm.a, $(inst_libdir)/libndbm.so): New targets. * db2/Makefile: Likewise. 1998-02-12 08:20 H.J. Lu <hjl@gnu.org> * sysdeps/gnu/errlist.awk (sys_errlist, sys_nerr): Create weak aliases if HAVE_ELF or PIC or DO_VERSIONING is not defined. 1998-02-12 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * sysdeps/generic/_G_config.h: Define _G_wchar_t, for C++ <streambuf.h>. * sysdeps/unix/sysv/linux/_G_config.h: Likewise. 1998-02-11 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * sysdeps/unix/make-syscalls.sh: Fix sed pattern when dealing with versioned symbols. 1998-02-13 08:14 H.J. Lu <hjl@gnu.org> * libc.map (_dl_global_scope, _dl_lookup_symbol_skip, _dl_lookup_versioned_symbol, _dl_lookup_versioned_symbol_skip): Added for libdl.so. * elf/rtld.map: New file. Needed to define the GLIBC_2.* * manual/socket.texi (Host Address Functions): Clarify description * sysdeps/unix/sysv/linux/alpha/bits/time.h (struct timeval):
1998-02-14 01:54:15 +08:00
This file is part of the GNU C Library.
The GNU C Library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.
The GNU C Library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.
You should have received a copy of the GNU Lesser General Public
License along with the GNU C Library; if not, see
Prefer https to http for gnu.org and fsf.org URLs Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
2019-09-07 13:40:42 +08:00
<https://www.gnu.org/licenses/>. */
#ifdef __expf
# undef libm_hidden_proto
# define libm_hidden_proto(ignored)
#endif
#include <math.h>
Move math_narrow_eval to separate math-narrow-eval.h. This patch continues cleaning up the math_private.h header, which contains lots of different definitions many of which are only needed by a limited subset of files using that header (and some of which are overridden by architectures that only want to override selected parts of the header), by moving the math_narrow_eval macro out to a separate math-narrow-eval.h header, only included by those files that need it. That header is placed in include/ (since it's used in stdlib/, not just files built in math/, but no sysdeps variants are needed at present). Tested for x86_64, and with build-many-glibcs.py. (Installed stripped shared libraries change because of line numbers in assertions in strtod_l.c.) * include/math-narrow-eval.h: New file. Contents moved from .... * sysdeps/generic/math_private.h: ... here. (math_narrow_eval): Remove macro. Moved to math-narrow-eval.h. [FLT_EVAL_METHOD != 0] (excess_precision): Likewise. * math/s_fdim_template.c: Include <math-narrow-eval.h>. * stdlib/strtod_l.c: Likewise. * sysdeps/i386/fpu/s_f32xaddf64.c: Likewise. * sysdeps/i386/fpu/s_f32xsubf64.c: Likewise. * sysdeps/i386/fpu/s_fdim.c: Likewise. * sysdeps/ieee754/dbl-64/e_cosh.c: Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_j1.c: Likewise. * sysdeps/ieee754/dbl-64/e_jn.c: Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c: Likewise. * sysdeps/ieee754/dbl-64/e_sinh.c: Likewise. * sysdeps/ieee754/dbl-64/gamma_productf.c: Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c: Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c: Likewise. * sysdeps/ieee754/dbl-64/s_erf.c: Likewise. * sysdeps/ieee754/dbl-64/s_llrint.c: Likewise. * sysdeps/ieee754/dbl-64/s_lrint.c: Likewise. * sysdeps/ieee754/flt-32/e_coshf.c: Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c: Likewise. * sysdeps/ieee754/flt-32/e_expf.c: Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_j1f.c: Likewise. 
* sysdeps/ieee754/flt-32/e_jnf.c: Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c: Likewise. * sysdeps/ieee754/flt-32/e_sinhf.c: Likewise. * sysdeps/ieee754/flt-32/k_rem_pio2f.c: Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c: Likewise. * sysdeps/ieee754/flt-32/s_erff.c: Likewise. * sysdeps/ieee754/flt-32/s_llrintf.c: Likewise. * sysdeps/ieee754/flt-32/s_lrintf.c: Likewise. * sysdeps/ieee754/ldbl-96/gamma_product.c: Likewise.
2018-05-09 08:15:10 +08:00
#include <math-narrow-eval.h>
#include <stdint.h>
#include <libm-alias-finite.h>
Add libm_alias_*_other_r macros. Some libm functions are unable to use the generic alias macros such as libm_alias_double because they have special symbol versioning requirements for the main float, double or long double public names. To facilitate adding _FloatN / _FloatNx function aliases in future, it's still desirable to have generic macros those functions can use as far as possible. This patch adds macros such as libm_alias_double_other, which only define names for _FloatN / _FloatNx aliases, not for float / double / long double. As present, all these new macros do nothing, but they are called in the appropriate places in macros such as libm_alias_double. This patch also arranges for lgamma implementations, and the recently added optimized float function implementations, to use the new macros to make them ready for addition of _FloatN / _FloatNx aliases. Tested for x86_64, and tested with build-many-glibcs.py that installed stripped shared libraries are unchanged by this patch. * sysdeps/generic/libm-alias-double.h (libm_alias_double_other_r): New macro. (libm_alias_double_other): Likewise. (libm_alias_double_r): Use libm_alias_double_other_r. * sysdeps/generic/libm-alias-float.h (libm_alias_float_other_r): New macro. (libm_alias_float_other): Likewise. (libm_alias_float_r): Use libm_alias_float_other_r. * sysdeps/generic/libm-alias-float128.h (libm_alias_float128_other_r): New macro. (libm_alias_float128_other): Likewise. (libm_alias_float128_r): Use libm_alias_float128_other_r. * sysdeps/generic/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro. (libm_alias_ldouble_other): Likewise. (libm_alias_ldouble_r): Use libm_alias_ldouble_other_r. * sysdeps/ieee754/ldbl-opt/libm-alias-double.h (libm_alias_double_other_r): New macro. (libm_alias_double_other): Likewise. (libm_alias_double_r): Use libm_alias_double_other_r. * sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro. (libm_alias_ldouble_other): Likewise. 
(libm_alias_ldouble_r): Use libm_alias_ldouble_other_r. * math/w_lgamma_main.c: Include <libm-alias-double.h>. [!USE_AS_COMPAT]: Use libm_alias_double_other. * math/w_lgammaf_main.c: Include <libm-alias-float.h>. [!USE_AS_COMPAT]: Use libm_alias_float_other. * math/w_lgammal_main.c: Include <libm-alias-ldouble.h>. [!USE_AS_COMPAT]: Use libm_alias_ldouble_other. * math/w_exp2f.c: Use libm_alias_float_other. * math/w_expf.c: Likewise. * math/w_log2f.c: Likewise. * math/w_logf.c: Likewise. * math/w_powf.c: Likewise. * sysdeps/ieee754/flt-32/e_exp2f.c: Include <libm-alias-float.h>. [!__exp2f]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_expf.c: Include <libm-alias-float.h>. [!__expf]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_log2f.c: Include <libm-alias-float.h>. [!__log2f]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_logf.c: Include <libm-alias-float.h>. [!__logf]: Use libm_alias_float_other. * sysdeps/ieee754/flt-32/e_powf.c: Include <libm-alias-float.h>. [!__powf]: Use libm_alias_float_other.
2017-10-11 05:29:11 +08:00
#include <libm-alias-float.h>
#include "math_config.h"
/*
EXP2F_TABLE_BITS = 5
EXP2F_POLY_ORDER = 3
ULP error: 0.502 (nearest rounding.)
Relative error: 1.69 * 2^-34 in [-ln2/64, ln2/64] (before rounding.)
Wrong count: 170635 (all nearest rounding wrong results with fma.)
Non-nearest ULP error: 1 (rounded ULP error)
*/
#define N (1 << EXP2F_TABLE_BITS)
#define InvLn2N __exp2f_data.invln2_scaled
#define T __exp2f_data.tab
#define C __exp2f_data.poly_scaled
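/* Top 12 bits of the binary32 representation of x: the sign bit, the
   8 exponent bits and the 3 most significant bits of the significand. */
static inline uint32_t
top12 (float x)
{
  return asuint (x) >> 20;
}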
Update. 1998-02-13 17:39 Ulrich Drepper <drepper@cygnus.com> * elf/Makefile: Don't use --version-script parameter to link ld.so unconditionally. 1998-01-02 04:19 Geoff Keating <geoffk@ozemail.com.au> * math/Makefile: Add t_exp. * math/libm-test.c: Tighten accuracy bounds for exp(), correct constants. * math/test-reduce.c: Remove temporarily, it seems to be broken. * sysdeps/libm-ieee754/e_exp.c: Use accurate table method. * sysdeps/libm-ieee754/e_expf.c: Use table & double precision for better accuracy. * sysdeps/libm-ieee754/s_exp2.c: Use better polynomial; correct algorithm for very large/very small arguments. * sysdeps/libm-ieee754/s_exp2f.c: Use slightly better polynomial; correct algorithm for very large/very small arguments; adjust for new table. * sysdeps/libm-ieee754/t_exp.c: New file. * sysdeps/libm-ieee754/t_exp2f.h: Use table with smaller deltas. * sysdeps/unix/sysv/linux/powerpc/dl-sysdep.c: Put 'strange test' back, with comment that explains what breaks when you remove it :-(. * localedata/xfrm-test.c: Avoid integer overflow. * stdlib/strfmon.c: char is unsigned, sometimes. *sysdeps/powerpc * sysdeps/powerpc/Makefile: Remove quad float support. * sysdeps/powerpc/q_*.c: Remove, they will become an add-on. * sysdeps/powerpc/quad_float.h: Likewise. * sysdeps/powerpc/test-arith.c: Likewise. * sysdeps/powerpc/test-arithf.c: Likewise. * sysdeps/generic/s_exp2.c: Remove, we have this implemented now. * sysdeps/generic/s_exp2f.c: Likewise. * sysdeps/powerpc/bits/mathinline.h: Use underscores around __asm__, don't try anything if _SOFT_FLOAT. 1997-12-31 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * locale/C-ctype.c (_nl_C_LC_CTYPE_class32): Undo last change. * locale/programs/ld-ctype.c (CHAR_CLASS32_TRANS): Likewise. * wctype/wctype.c: Likewise. * wctype/wctype.h (_ISwxxx): Renamed from _ISxxx, all uses changed. They are incompatible with the _ISxxx values from <ctype.h> on little endian machines. 
(_ISwbit) [__BYTE_ORDER == __LITTLE_ENDIAN]: Correctly transform bit number. This fixes the real bug and restores the integrity of the ctype locale file. * wctype/wcfuncs.c: Change all _ISxxx to _ISwxxx. * wctype/wcfuncs_l.c: Likewise. * wctype/wcextra.c: Likewise. * wctype/wctype_l.c [__BYTE_ORDER == __LITTLE_ENDIAN]: Use correct byte swapping. 1998-02-09 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * sysdeps/unix/sysv/linux/sparc/sparc32/sysdep.S (errno): Put it into .bss segment instead of .common, so that aliases on it work. * sysdeps/unix/sysv/linux/i386/sysdep.S (errno): Add .type and .size directives, put into .bss segment instead of initializing it to 4. 1998-02-12 08:00 H.J. Lu <hjl@gnu.org> * libc.map (gnu_get_libc_release, gnu_get_libc_version): Added. * version.c (__gnu_get_libc_release, __gnu_get_libc_version): New functions. Make names without __ weak aliases. (__libc_release, __libc_version): Make them static. * include/gnu/libc-version.h: New file. * Makefile (headers): Add gnu/libc-version.h. 1998-02-13 Ulrich Drepper <drepper@cygnus.com> * stdlib/stdlib.h (struct drand48_data): Leave X to user macros and use x for member name. Reported by Daniel Lyddy <daniell@cs.berkeley.edu>. * stdlib/drand48.c: Change according to member name change. * stdlib/drand48_r.c: Likewise. * stdlib/lcong48_r.c: Likewise. * stdlib/lrand48.c: Likewise. * stdlib/lrand48_r.c: Likewise. * stdlib/mrand48.c: Likewise. * stdlib/mrand48_r.c: Likewise. * stdlib/seed48.c: Likewise. * stdlib/seed48_r.c: Likewise. * stdlib/srand48_r.c: Likewise. 1998-02-11 Andreas Jaeger <aj@arthur.rhein-neckar.de> * nss/test-netdb.c: Add some more test cases. 1998-02-13 11:39 Ulrich Drepper <drepper@cygnus.com> * libio/iovsscanf.c: Undo last change modifying errno. 1998-02-12 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * stdio-common/vfscanf.c: Never try to read another character after EOF. Don't decrement read_in after EOF, it wasn't incremented in the first place. 
(NEXT_WIDE_CHAR): Set First, not first. 1998-02-06 07:48 H.J. Lu <hjl@gnu.org> * db/Makefile ($(inst_libdir)/libndbm.a, $(inst_libdir)/libndbm.so): New targets. * db2/Makefile: Likewise. 1998-02-12 08:20 H.J. Lu <hjl@gnu.org> * sysdeps/gnu/errlist.awk (sys_errlist, sys_nerr): Create weak aliases if HAVE_ELF or PIC or DO_VERSIONING is not defined. 1998-02-12 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * sysdeps/generic/_G_config.h: Define _G_wchar_t, for C++ <streambuf.h>. * sysdeps/unix/sysv/linux/_G_config.h: Likewise. 1998-02-11 Andreas Schwab <schwab@issan.informatik.uni-dortmund.de> * sysdeps/unix/make-syscalls.sh: Fix sed pattern when dealing with versioned symbols. 1998-02-13 08:14 H.J. Lu <hjl@gnu.org> * libc.map (_dl_global_scope, _dl_lookup_symbol_skip, _dl_lookup_versioned_symbol, _dl_lookup_versioned_symbol_skip): Added for libdl.so. * elf/rtld.map: New file. Needed to define the GLIBC_2.* * manual/socket.texi (Host Address Functions): Clarify description * sysdeps/unix/sysv/linux/alpha/bits/time.h (struct timeval):
1998-02-14 01:54:15 +08:00
float
__expf (float x)
{
  uint32_t abstop;
  uint64_t ki, t;
  /* double_t for better performance on targets with FLT_EVAL_METHOD==2. */
  double_t kd, xd, z, r, r2, y, s;

  xd = (double_t) x;
  abstop = top12 (x) & 0x7ff;
  if (__glibc_unlikely (abstop >= top12 (88.0f)))
    {
      /* |x| >= 88 or x is nan. */
      if (asuint (x) == asuint (-INFINITY))
        return 0.0f;
      if (abstop >= top12 (INFINITY))
        return x + x;
      if (x > 0x1.62e42ep6f) /* x > log(0x1p128) ~= 88.72 */
        return __math_oflowf (0);
      if (x < -0x1.9fe368p6f) /* x < log(0x1p-150) ~= -103.97 */
        return __math_uflowf (0);
#if WANT_ERRNO_UFLOW
      if (x < -0x1.9d1d9ep6f) /* x < log(0x1p-149) ~= -103.28 */
        return __math_may_uflowf (0);
#endif
    }
  /* x*N/Ln2 = k + r with r in [-1/2, 1/2] and int k. */
  z = InvLn2N * xd;
  /* Round and convert z to int, the result is in [-150*N, 128*N] and
     ideally ties-to-even rule is used, otherwise the magnitude of r
     can be bigger which gives larger approximation error. */
#if TOINT_INTRINSICS
  kd = roundtoint (z);
  ki = converttoint (z);
Clean up converttoint handling and document the semantics

This patch currently only affects aarch64.

The roundtoint and converttoint internal functions are only called with
small values, so 32 bit result is enough for converttoint and it is a
signed int conversion so the return type is changed to int32_t.

The original idea was to help the compiler keeping the result in uint64_t,
then it's clear that no sign extension is needed and there is no accidental
undefined or implementation defined signed int arithmetics.

But it turns out gcc does a good job with inlining so changing the type has
no overhead and the semantics of the conversion is less surprising this way.
Since we want to allow the asuint64 (x + 0x1.8p52) style conversion, the top
bits were never usable and the existing code ensures that only the bottom
32 bits of the conversion result are used.

On aarch64 the neon intrinsics (which round ties to even) are changed to
round and lround (which round ties away from zero) this does not affect the
results in a significant way, but more portable (relies on round and lround
being inlined which works with -fno-math-errno).

The TOINT_SHIFT and TOINT_RINT macros were removed, only keep separate code
paths for TOINT_INTRINSICS and !TOINT_INTRINSICS.

* sysdeps/aarch64/fpu/math_private.h (roundtoint): Use round.
(converttoint): Use lround.
* sysdeps/ieee754/flt-32/math_config.h (roundtoint): Declare and
document the semantics when TOINT_INTRINSICS is set.
(converttoint): Likewise.
(TOINT_RINT): Remove.
(TOINT_SHIFT): Remove.
* sysdeps/ieee754/flt-32/e_expf.c (__expf): Remove the TOINT_RINT code
path.
2018-07-04 19:29:29 +08:00
#else
# define SHIFT __exp2f_data.shift
  kd = math_narrow_eval ((double) (z + SHIFT)); /* Needs to be double. */
  ki = asuint64 (kd);
  kd -= SHIFT;
#endif
  r = z - kd;
  /* exp(x) = 2^(k/N) * 2^(r/N) ~= s * (C0*r^3 + C1*r^2 + C2*r + 1) */
  t = T[ki % N];
  t += ki << (52 - EXP2F_TABLE_BITS);
  s = asdouble (t);
  z = C[0] * r + C[1];
  r2 = r * r;
  y = C[2] * r + 1;
  y = z * r2 + y;
  y = y * s;
  return (float) y;
}
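The non-intrinsic rounding path in the function above can be tried in isolation. The following is a minimal sketch, not glibc code: `shift_round` is a hypothetical helper name, the constant `0x1.8p52` is the one mentioned in the converttoint commit message, and the `volatile` store stands in for `math_narrow_eval` on the assumption that it is enough to force rounding to double. It assumes the default round-to-nearest mode and a compiler that wraps on the unsigned-to-`int32_t` conversion (which GCC does).

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a double as a 64-bit unsigned integer.  */
static inline uint64_t
asuint64 (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Hypothetical helper (not part of glibc): round z to an integer with
   the "add a large constant" trick.  The rounded value is stored as a
   double in *kd and returned as a 32-bit signed integer.  */
static inline int32_t
shift_round (double z, double *kd)
{
  const double shift = 0x1.8p52;    /* 2^52 + 2^51 */

  /* Only small arguments are supported, as in the original code.  */
  assert (fabs (z) < 0x1p31);

  /* At this magnitude one ulp is 1.0, so the addition itself rounds z
     to an integer in the current rounding mode.  The volatile store
     is an assumption standing in for math_narrow_eval.  */
  volatile double sum = z + shift;

  /* The low bits of the representation now hold the integer in two's
     complement form; only the bottom 32 bits are used.  */
  uint64_t ki = asuint64 (sum);
  *kd = sum - shift;
  return (int32_t) ki;              /* Implementation-defined wrap.  */
}
```

For example, `shift_round (2.5, &kd)` yields 2 with `kd == 2.0` under round-to-nearest, because the tie is resolved to the even neighbour.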
#ifndef __expf
hidden_def (__expf)
strong_alias (__expf, __ieee754_expf)
libm_alias_finite (__ieee754_expf, __expf)
versioned_symbol (libm, __expf, expf, GLIBC_2_27);
Add libm_alias_*_other_r macros.

Some libm functions are unable to use the generic alias macros such as
libm_alias_double because they have special symbol versioning requirements
for the main float, double or long double public names. To facilitate
adding _FloatN / _FloatNx function aliases in future, it's still desirable
to have generic macros those functions can use as far as possible.

This patch adds macros such as libm_alias_double_other, which only define
names for _FloatN / _FloatNx aliases, not for float / double / long double.
As present, all these new macros do nothing, but they are called in the
appropriate places in macros such as libm_alias_double. This patch also
arranges for lgamma implementations, and the recently added optimized float
function implementations, to use the new macros to make them ready for
addition of _FloatN / _FloatNx aliases.

Tested for x86_64, and tested with build-many-glibcs.py that installed
stripped shared libraries are unchanged by this patch.

* sysdeps/generic/libm-alias-double.h (libm_alias_double_other_r): New macro.
(libm_alias_double_other): Likewise.
(libm_alias_double_r): Use libm_alias_double_other_r.
* sysdeps/generic/libm-alias-float.h (libm_alias_float_other_r): New macro.
(libm_alias_float_other): Likewise.
(libm_alias_float_r): Use libm_alias_float_other_r.
* sysdeps/generic/libm-alias-float128.h (libm_alias_float128_other_r): New macro.
(libm_alias_float128_other): Likewise.
(libm_alias_float128_r): Use libm_alias_float128_other_r.
* sysdeps/generic/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro.
(libm_alias_ldouble_other): Likewise.
(libm_alias_ldouble_r): Use libm_alias_ldouble_other_r.
* sysdeps/ieee754/ldbl-opt/libm-alias-double.h (libm_alias_double_other_r): New macro.
(libm_alias_double_other): Likewise.
(libm_alias_double_r): Use libm_alias_double_other_r.
* sysdeps/ieee754/ldbl-opt/libm-alias-ldouble.h (libm_alias_ldouble_other_r): New macro.
(libm_alias_ldouble_other): Likewise.
(libm_alias_ldouble_r): Use libm_alias_ldouble_other_r.
* math/w_lgamma_main.c: Include <libm-alias-double.h>.
[!USE_AS_COMPAT]: Use libm_alias_double_other.
* math/w_lgammaf_main.c: Include <libm-alias-float.h>.
[!USE_AS_COMPAT]: Use libm_alias_float_other.
* math/w_lgammal_main.c: Include <libm-alias-ldouble.h>.
[!USE_AS_COMPAT]: Use libm_alias_ldouble_other.
* math/w_exp2f.c: Use libm_alias_float_other.
* math/w_expf.c: Likewise.
* math/w_log2f.c: Likewise.
* math/w_logf.c: Likewise.
* math/w_powf.c: Likewise.
* sysdeps/ieee754/flt-32/e_exp2f.c: Include <libm-alias-float.h>.
[!__exp2f]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_expf.c: Include <libm-alias-float.h>.
[!__expf]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_log2f.c: Include <libm-alias-float.h>.
[!__log2f]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_logf.c: Include <libm-alias-float.h>.
[!__logf]: Use libm_alias_float_other.
* sysdeps/ieee754/flt-32/e_powf.c: Include <libm-alias-float.h>.
[!__powf]: Use libm_alias_float_other.
2017-10-11 05:29:11 +08:00
libm_alias_float_other (__exp, exp)
#endif