This patch adds _Float128 support to the ldbl-96 bits/iscanonical.h,
as needed for x86_64 / x86 / ia64 support of _Float128.
Tested for x86_64 (in conjunction with float128 patches).
* sysdeps/ieee754/ldbl-96/bits/iscanonical.h
[__HAVE_DISTINCT_FLOAT128] (__iscanonicalf128): New macro.
This patch makes math-tests.h, as used to describe support of given
floating-point types for sNaNs, rounding modes and exceptions, handle
distinguishing _Float128 from long double. This is needed for x86_64,
where if building with GCC 6 or earlier there is no __builtin_nansq,
so no way to get a signaling NaN of _Float128 type, so associated
tests cannot be run (although glibc itself works fine, as there is
never any need to create such an sNaN with a built-in function inside
glibc).
Tested for x86_64 (in conjunction with float128 patches).
* sysdeps/generic/math-tests.h: Include <bits/floatn.h>.
(MATH_TESTS_TG): New macro.
(SNAN_TESTS_float128): Likewise.
(ROUNDING_TESTS_float128): Likewise.
(EXCEPTION_TESTS_float128): Likewise.
(SNAN_TESTS): Define using MATH_TESTS_TG.
(ROUNDING_TESTS): Likewise.
(EXCEPTION_TESTS): Likewise.
As with other long double identifiers, float128_private.h has a
redefinition of SET_RESTORE_ROUNDL. However, that redefinition is
broken, since this is a macro with one argument being defined to take
no arguments. This patch fixes the redefinition. (x86_64 needs the
redefinition because SET_RESTORE_ROUNDL only changes the x87 rounding
mode, whereas _Float128 arithmetic uses the SSE rounding mode instead
on x86_64.)
Tested for x86_64 (in conjunction with float128 patches).
* sysdeps/ieee754/float128/float128_private.h
[SET_RESTORE_ROUNDF128] (SET_RESTORE_ROUNDL): Take an argument and
pass it to SET_RESTORE_ROUNDF128.
float128_private.h redefines ieee754.h identifiers ieee854_long_double
and IEEE854_LONG_DOUBLE_BIAS to map them to identifiers from
ieee754_float128.h.
This causes problems when ieee754.h is included after
float128_private.h and it's a version of ieee754.h that also defines
those identifiers; specifically, sysdeps/ieee754/ieee754.h, which
defines those identifiers for the x86 extended format. This patch
fixes this by ensuring an include of ieee754.h from float128_private.h
before the redefinitions.
Tested for x86_64 (in conjunction with float128 patches).
* sysdeps/ieee754/float128/float128_private.h: Include
<ieee754.h>.
The math_private.h macro min_of_type has broken _Float128 handling:
instead of passing its type argument to the key __EXPR_FLT128 macro,
it passes x, which is not a macro argument but whatever variable
called x happens to be visible in the calling function. If that
variable has the wrong type, the wrong one of long double and
_Float128 can get chosen. In particular, this applies to some
_Complex long double functions (where x happens to have type _Complex
long double, resulting in min_of_type returning a _Float128 value when
it should return a long double value). For some reason, this only
caused test failures for me on x86_64 with GCC 6 but not GCC 7 (I
suspect it triggers known bugs in conversions from x86 long double to
_Float128 that are present in GCC 6's soft-fp).
Tested for x86_64 (in conjunction with float128 patches).
* sysdeps/generic/math_private.h (__EXPR_FLT128): Do not apply
typeof to argument passed to __builtin_types_compatible_p.
(min_of_type): Pass type argument, not x, to __EXPR_FLT128.
Various type-generic libm wrapper templates, as used for float128, set
errno but do not include errno.h. I presume they must get an implicit
include from some internal header on powerpc64le; they don't get such
an implicit include on x86_64. This patch adds the missing includes
of errno.h to each such wrapper.
Tested for x86_64 (in conjunction with float128 patches).
* math/w_acos_template.c [__USE_WRAPPER_TEMPLATE]: Include
<errno.h>.
* math/w_acosh_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_asin_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_atanh_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_cosh_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_exp10_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_exp2_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_exp_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_fmod_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_hypot_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_j0_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_j1_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_jn_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_lgamma_r_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_lgamma_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_log10_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_log2_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_log_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_pow_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_remainder_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_sinh_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_sqrt_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
* math/w_tgamma_template.c [__USE_WRAPPER_TEMPLATE]: Likewise.
Three float128 files still include xlocale.h after it was removed. I
don't know why this didn't cause problems for powerpc64le float128
testing; it did cause problems for my x86_64 float128 testing. This
patch changes the includes to use bits/types/locale_t.h.
Tested for x86_64 (in conjunction with float128 patches).
* sysdeps/ieee754/float128/strtof128_l.c: Include
<bits/types/locale_t.h> instead of <xlocale.h>.
* sysdeps/ieee754/float128/wcstof128.c: Likewise.
* sysdeps/ieee754/float128/wcstof128_l.c: Likewise.
Read the memcpy results in json and print out the results in tabular
form, in addition to generating a graph of the results to compare all
of the implementations.
The format of the output is extensible enough to allow this kind of
analysis to be done on other string functions as well.
* benchtests/scripts/benchout_strings.schema.json: New file.
* benchtests/scripts/compare_strings.py: New file.
Print the benchmark output for various memcpy benchmarks in json so
that it can be predictably parsed and analyzed.
* benchtests/bench-memcpy-large.c: Include json-lib.h.
(do_one_test): Print json.
(do_test): Likewise.
(test_main): Likewise.
* benchtests/bench-memcpy-random.c: Include json-lib.h.
(do_one_test): Print json.
(do_test): Likewise.
(test_main): Likewise.
* benchtests/bench-memcpy.c: Include json-lib.h.
(do_one_test): Print json.
(do_test): Likewise.
(test_main): Likewise.
Enhance the json module in benchtests to print signed and unsigned
integers and string array elements.
* benchtests/json-lib.h: Include inttypes.h.
(json_attr_int, json_attr_int, json_element_string,
json_element_int, json_element_uint): New functions.
* benchtests/json-lib.c: (json_attr_int, json_attr_int,
json_element_string, json_element_int, json_element_uint): New
functions.
In preparation for the documentation of _FloatN and _FloatNx variants of
the remainder function, this patch changes the descriptions of remainder
and drem, so that remainder is described as primary and drem as an
alternative name for the same functionality.
* manual/arith.texi (Remainder Functions): Describe remainder as
primary and drem as an alternative name. Change the comment on
remainder to ISO, since it is defined in ISO C99.
The macro F128 in stdlib/tst-strtod.h is defined to provide the literal
suffix for _Float128 constants. It uses the macro __f128 (), which is
defined in bits/floatn.h to provide the correct literal suffix depending on
what is provided by the compiler.
However, F128 was not being expanded and only worked correctly, when
compiling with GCC 7 (or greater), since F128 is the literal suffix itself.
This patch adds an additional macro expansion so that the macro F128
expands to the correct literal suffix on older compilers.
* stdlib/tst-strtod.h (MMFUNC): New macro to provide an addition
macro expansion.
(GEN_TEST_STRTOD_FOREACH): Use MMFUNC for _Float128.
* Unicode 10.0.0 Support: Character encoding, character type info, and
transliteration tables are all updated to Unicode 10.0.0, using
generator scripts contributed by Mike FABIAN (Red Hat).
This commit adds fixed-size allocation buffers. The primary use
case is in NSS modules, where dynamically sized data is stored
in a fixed-size buffer provided by the caller.
Other uses include a replacement of mempcpy cascades (which is
safer due to the size checking inherent to allocation buffers).
Implement strcmp family IFUNC selectors in C.
All internal calls within libc.so can use IFUNC on x86-64 since unlike
x86, x86-64 supports PC-relative addressing to access the GOT entry so
that it can call via PLT without using an extra register. For libc.a,
we can't use IFUNC for functions which are called before IFUNC has been
initialized. Use IFUNC internally reduces the icache footprint since
libc.so and other codes in the process use the same implementations.
This patch uses IFUNC for strcmp family functions within libc.
* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
strcmp-sse2, strcmp-sse4_2, strncmp-sse2, strncmp-sse4_2,
strcasecmp_l-sse2, strcasecmp_l-sse4_2, strcasecmp_l-avx,
strncase_l-sse2, strncase_l-sse4_2 and strncase_l-avx.
* sysdeps/x86_64/multiarch/ifunc-strcasecmp.h: New file.
* sysdeps/x86_64/multiarch/strcasecmp.c: Likewise.
* sysdeps/x86_64/multiarch/strcasecmp_l-avx.S: Likewise.
* sysdeps/x86_64/multiarch/strcasecmp_l-sse2.S: Likewise.
* sysdeps/x86_64/multiarch/strcasecmp_l-sse4_2.S: Likewise.
* sysdeps/x86_64/multiarch/strcasecmp_l.c: Likewise.
* sysdeps/x86_64/multiarch/strcmp-sse2.S: Likewise.
* sysdeps/x86_64/multiarch/strcmp-sse4_2.S: Likewise.
* sysdeps/x86_64/multiarch/strcmp.c: Likewise.
* sysdeps/x86_64/multiarch/strncase.c: Likewise.
* sysdeps/x86_64/multiarch/strncase_l-avx.S : Likewise.
* sysdeps/x86_64/multiarch/strncase_l-sse2.S: Likewise.
* sysdeps/x86_64/multiarch/strncase_l-sse4_2.S: Likewise.
* sysdeps/x86_64/multiarch/strncase_l.c: Likewise.
* sysdeps/x86_64/multiarch/strncmp-sse2.S: Likewise.
* sysdeps/x86_64/multiarch/strncmp-sse4_2.S: Likewise.
* sysdeps/x86_64/multiarch/strncmp.c: Likewise.
* sysdeps/x86_64/multiarch/strcasecmp_l.S: Removed.
* sysdeps/x86_64/multiarch/strcmp.S: Likewise.
* sysdeps/x86_64/multiarch/strncase_l.S: Likewise.
* sysdeps/x86_64/multiarch/strncmp.S: Likewise.
* sysdeps/x86_64/multiarch/strcmp-sse42.S: Include <sysdep.h>.
(STRCMP_SSE42): New. Defined to __strcmp_sse42 if not defined.
[USE_AS_STRCASECMP_L || USE_AS_STRNCASECMP_L]: Include
"locale-defines.h".
(UPDATE_STRNCMP_COUNTER): New.
(SECTION): Likewise.
(GLABEL): Likewise.
(LABEL): Likewise.
* sysdeps/x86_64/multiarch/strncmp-ssse3.S: Rewrite and enable
for libc.a.
As shown by conform/ tests once the remaining namespace issues are
fixed, the tile bits/sigaction.h fails to declare SA_RESETHAND,
SA_RESTART and SA_NODEFER for non-XSI POSIX.1:2008 as other versions
do. Those constants were moved from XSI to Base in the 2008 edition
of POSIX. This patch fixes the conditions to match other versions of
this header.
Tested (compilation only) for tilegx-linux-gnu with
build-many-glibcs.py.
[BZ #21622]
* sysdeps/unix/sysv/linux/tile/bits/sigaction.h (SA_RESTART):
Define for [__USE_UNIX98 || __USE_XOPEN2K8], not [__USE_UNIX98 ||
__USE_MISC].
(SA_NODEFER): Likewise.
(SA_RESETHAND): Likewise.
Rename glibc.tune.ifunc to glibc.tune.hwcaps and move it to
sysdeps/x86/dl-tunables.list since it is x86 specicifc. Also
change type of data_cache_size, data_cache_size and
non_temporal_threshold to unsigned long int to match size_t.
Remove usage DEFAULT_STRLEN from cpu-tunables.c.
* elf/dl-tunables.list (glibc.tune.ifunc): Removed.
* sysdeps/x86/dl-tunables.list (glibc.tune.hwcaps): New.
Remove security_level on all fields.
* manual/tunables.texi: Replace ifunc with hwcaps.
* sysdeps/x86/cpu-features.c (TUNABLE_CALLBACK (set_ifunc)):
Renamed to ..
(TUNABLE_CALLBACK (set_hwcaps)): This.
(init_cpu_features): Updated.
* sysdeps/x86/cpu-features.h (cpu_features): Change type of
data_cache_size, data_cache_size and non_temporal_threshold to
unsigned long int.
* sysdeps/x86/cpu-tunables.c (DEFAULT_STRLEN): Removed.
(TUNABLE_CALLBACK (set_ifunc)): Renamed to ...
(TUNABLE_CALLBACK (set_hwcaps)): This. Update comments. Don't
use DEFAULT_STRLEN.
Backtrace through _dl_tlsdesc_resolve_rela was broken because the offset
of x30 from cfa was not in the debug info.
Add enough annotation so backtracing from the dynamic linker through
tlsdesc entry points works and the debugger shows registers correctly.
This patch add an extra test for passing invalid flags and check its
expected failure. It shows an invalid LO_HI_LONG macro definition for
x86_64 with leads to passing invalid flags on some configurations.
The new tests fails on i686-linux-gnu and potentially on other 32 bits
architecture that uses the compat syscall definition due a kernel bug.
It is intended to be fixed upstream.
Checked on x86_64-linux-gnu
* misc/tst-preadvwritev2-common.c: New file.
* misc/tst-preadvwritev2.c (do_test): Add test for invalid flag.
* misc/tst-preadvwritev64v2.c (do_test): Likewise.
Many of the things defined by bits/signum.h are invariant across all
supported operating systems. This patch factors out all of them to a
new header bits/signum-generic.h, which each bits/signum.h will include
and then override whichever things need adjustment. Normally that will
mean, at most, adding or changing a few signal numbers.
A user-visible side effect is that the obsolete signal constant SIGUNUSED
(which is an alias for SIGSYS on all platforms that define it) is no
longer exposed by any version of bits/signum.h.
A side effect only relevant to glibc hackers is that _NSIG is now defined
in terms of __SIGRTMAX, instead of the other way around. This is because
__SIGRTMAX varies from platform to platform, but _NSIG==__SIGRTMAX+1 is
true universally. If your platform doesn't support realtime signals,
leave __SIGRTMAX equal to __SIGRTMIN.
I also added a Linux-specific test to make sure that our signal constants
match the ones in <asm/signal.h>, since we can't use that header (it's
not even vaguely namespace-clean).
* bits/signum-generic.h: Renamed from bits/signum.h.
Add proper multiple include guard and misuse check.
Define __SIGRTMIN = __SIGRTMAX = 32, and define _NSIG = __SIGRTMAX+1.
Move definition of SIGIO to "archaic names for compatibility" section.
* bits/signum.h: New file which just includes bits/signum-generic.h.
* sysdeps/unix/bsd/bits/signum.h
* sysdeps/unix/sysv/linux/bits/signum.h
* sysdeps/unix/sysv/linux/alpha/bits/signum.h
* sysdeps/unix/sysv/linux/hppa/bits/signum.h
* sysdeps/unix/sysv/linux/mips/bits/signum.h
* sysdeps/unix/sysv/linux/sparc/bits/signum.h
Just include <bits/signum-generic.h> and then add or adjust
signal constants. Do not define SIGUNUSED, SIGRTMIN, or SIGRTMAX.
* signal/Makefile: Install bits/signum-generic.h.
* signal/signal.h: Define SIGRTMIN and SIGRTMAX here.
* sysdeps/generic/siglist.h: SIGSYS and SIGWINCH are
universal. Prefer SIGPOLL to SIGIO. Simplify #ifdeffage.
* sysdeps/unix/sysv/linux/tst-signal-numbers.sh: New test.
* sysdeps/unix/sysv/linux/Makefile: Run it.
<locale.h> is specified to define locale_t in POSIX.1-2008, and so are
all of the headers that define functions that take locale_t arguments.
Under _GNU_SOURCE, the additional headers that define such functions
have also always defined locale_t. Therefore, there is no need to use
__locale_t in public function prototypes, nor in any internal code.
* ctype/ctype-c99_l.c, ctype/ctype.h, ctype/ctype_l.c
* include/monetary.h, include/stdlib.h, include/time.h
* include/wchar.h, locale/duplocale.c, locale/freelocale.c
* locale/global-locale.c, locale/langinfo.h, locale/locale.h
* locale/localeinfo.h, locale/newlocale.c
* locale/nl_langinfo_l.c, locale/uselocale.c
* localedata/bug-usesetlocale.c, localedata/tst-xlocale2.c
* stdio-common/vfscanf.c, stdlib/monetary.h, stdlib/stdlib.h
* stdlib/strfmon_l.c, stdlib/strtod_l.c, stdlib/strtof_l.c
* stdlib/strtol.c, stdlib/strtol_l.c, stdlib/strtold_l.c
* stdlib/strtoll_l.c, stdlib/strtoul_l.c, stdlib/strtoull_l.c
* string/strcasecmp.c, string/strcoll_l.c, string/string.h
* string/strings.h, string/strncase.c, string/strxfrm_l.c
* sysdeps/ieee754/float128/strtof128_l.c
* sysdeps/ieee754/float128/wcstof128.c
* sysdeps/ieee754/float128/wcstof128_l.c
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c
* sysdeps/ieee754/ldbl-64-128/strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-compat.c
* sysdeps/ieee754/ldbl-opt/nldbl-strfmon_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-strtold_l.c
* sysdeps/ieee754/ldbl-opt/nldbl-wcstold_l.c
* sysdeps/powerpc/powerpc32/power7/strcasecmp.S
* sysdeps/powerpc/powerpc64/power7/strcasecmp.S
* sysdeps/x86_64/strcasecmp_l-nonascii.c
* sysdeps/x86_64/strncase_l-nonascii.c, time/strftime_l.c
* time/strptime_l.c, time/time.h, wcsmbs/mbsrtowcs_l.c
* wcsmbs/wchar.h, wcsmbs/wcscasecmp.c, wcsmbs/wcsncase.c
* wcsmbs/wcstod.c, wcsmbs/wcstod_l.c, wcsmbs/wcstof.c
* wcsmbs/wcstof_l.c, wcsmbs/wcstol_l.c, wcsmbs/wcstold.c
* wcsmbs/wcstold_l.c, wcsmbs/wcstoll_l.c, wcsmbs/wcstoul_l.c
* wcsmbs/wcstoull_l.c, wctype/iswctype_l.c
* wctype/towctrans_l.c, wctype/wcfuncs_l.c
* wctype/wctrans_l.c, wctype/wctype.h, wctype/wctype_l.c:
Change all uses of __locale_t to locale_t.
xlocale.h is already a single-type micro-header, defining struct
__locale_struct and the typedefs __locale_t and locale_t. This patch
brings it into the bits/types/ scheme: there are now
bits/types/__locale_t.h which defines only __locale_struct and
__locale_t, and bits/types/locale_t.h which defines locale_t as well
as the other two. None of *our* headers need __locale_t.h, but it
appears to me that libstdc++ could make use of it.
There are a lot of external uses of xlocale.h, but all the uses I
checked had an autoconf test or equivalent for its existence. It has
never been available from other C libraries, and it has always
contained a comment reading "This file is not standardized, don't rely
on it, it can go away without warning" so I think dropping it is
pretty safe.
I also took the opportunity to clean up comments in various public
header files that still talk about the *_l interfaces as though they
were completely nonstandard. There are a few of them, notably the
strtoX_l and wcstoX_l families, that haven't been standardized, but
the bulk are in POSIX.1-2008.
* locale/xlocale.h: Rename to...
* locale/bits/types/__locale_t.h: ...here. Adjust commentary.
Only define struct __locale_struct and __locale_t, not locale_t.
* locale/bits/types/locale_t.h: New file; define locale_t here.
* locale/Makefile (headers): Update to match.
* include/xlocale.h: Delete wrapper.
* include/bits/types/__locale_t.h: New wrapper.
* include/bits/types/locale_t.h: New wrapper.
* ctype/ctype.h, include/printf.h, include/time.h
* locale/langinfo.h, locale/locale.h, stdlib/monetary.h
* stdlib/stdlib.h, string/string.h, string/strings.h, time/time.h
* wcsmbs/wchar.h, wctype/wctype.h: Use bits/types/locale_t.h.
Correct outdated comments regarding the standardization status of
the functions that take locale_t arguments.
* stdlib/strtod_l.c, stdlib/strtof_l.c, stdlib/strtol_l.c
* stdlib/strtold_l.c, stdlib/strtoul_l.c, stdlib/strtoull_l.c
* sysdeps/ieee754/ldbl-128ibm/strtold_l.c
* sysdeps/ieee754/ldbl-64-128/strtold_l.c
* wcsmbs/wcstod.c, wcsmbs/wcstod_l.c, wcsmbs/wcstof.c
* wcsmbs/wcstof_l.c, wcsmbs/wcstold.c, wcsmbs/wcstold_l.c:
Don't include xlocale.h. If necessary, include locale.h instead.
* stdlib/strtold_l.c: Unconditionally include wchar.h.
This patch consolidates the open Linux syscall implementation on
sysdeps/unix/sysv/linux/open{64}.c. The changes are:
1. Remove wordsize-64 openat{64}.
2. For architetures that define __OFF_T_MATCHES_OFF64_T openat64
will be default one with alias to required symbols. Otherwise
openat64 will pass the required O_LARGEFILE flag on syscall.
Checked on i686-linux-gnu, x86_64-linux-gnu, x86_64-linux-gnux32,
arch64-linux-gnu, arm-linux-gnueabihf, and powerpc64le-linux-gnu.
* sysdeps/unix/sysv/linux/openat.c (__libc_openat): Build only
for !__OFF_T_MATCHES_OFF64_T.
* sysdeps/unix/sysv/linux/openat64.c (__libc_openat64): New
implementation based on open64.
* sysdeps/unix/sysv/linux/wordsize-64/openat.c: Remove file.
* sysdeps/unix/sysv/linux/wordsize-64/openat64.c: Likewise.
This patch XFAILs one test where the powerpc32 ucontext_t has the
wrong type of a field, to allow the conform/ tests as a whole to pass
once the namespace issues are fixed.
Tested with build-many-glibcs.py.
[BZ #21635]
* sysdeps/unix/sysv/linux/powerpc/powerpc32/Makefile
[$(subdir) = conform] (conformtest-xfail-conds): New variable.
* conform/data/signal.h-data (uc_mcontext): XFAIL for
powerpc32-linux.
* conform/data/ucontext.h-data (uc_mcontext): Likewise.
This patch XFAILs one test where the ia64 ucontext_t has the wrong
type of a field, to allow the conform/ tests as a whole to pass once
the namespace issues are fixed.
Tested with build-many-glibcs.py.
[BZ #21634]
* sysdeps/unix/sysv/linux/ia64/Makefile [$(subdir) = conform]
(conformtest-xfail-conds): New variable.
* conform/data/signal.h-data (uc_sigmask): XFAIL for ia64-linux.
* conform/data/ucontext.h-data (uc_sigmask): Likewise.
Add a workload for powf. This is a reduced trace based on 2.3 billion
samples extracted from wrf. The distribution of values, in particular
frequency of commonly used operands is the same as in the full trace.
* benchtests/powf-inputs: Add reduced trace from wrf.
The current IFUNC selection is based on microbenchmarks in glibc. It
should give the best performance for most workloads. But other choices
may have better performance for a particular workload or on the hardware
which wasn't available at the selection was made. The environment
variable, GLIBC_TUNABLES=glibc.tune.ifunc=-xxx,yyy,-zzz...., can be used
to enable CPU/ARCH feature yyy, disable CPU/ARCH feature yyy and zzz,
where the feature name is case-sensitive and has to match the ones in
cpu-features.h. It can be used by glibc developers to override the
IFUNC selection to tune for a new processor or improve performance for
a particular workload. It isn't intended for normal end users.
NOTE: the IFUNC selection may change over time. Please check all
multiarch implementations when experimenting.
Also, GLIBC_TUNABLES=glibc.tune.x86_non_temporal_threshold=NUMBER is
provided to set threshold to use non temporal store to NUMBER,
GLIBC_TUNABLES=glibc.tune.x86_data_cache_size=NUMBER to set data cache
size, GLIBC_TUNABLES=glibc.tune.x86_shared_cache_size=NUMBER to set
shared cache size.
* elf/dl-tunables.list (tune): Add ifunc,
x86_non_temporal_threshold,
x86_data_cache_size and x86_shared_cache_size.
* manual/tunables.texi: Document glibc.tune.ifunc,
glibc.tune.x86_data_cache_size, glibc.tune.x86_shared_cache_size
and glibc.tune.x86_non_temporal_threshold.
* sysdeps/unix/sysv/linux/x86/dl-sysdep.c: New file.
* sysdeps/x86/cpu-tunables.c: Likewise.
* sysdeps/x86/cacheinfo.c
(init_cacheinfo): Check and get data cache size, shared cache
size and non temporal threshold from cpu_features.
* sysdeps/x86/cpu-features.c [HAVE_TUNABLES] (TUNABLE_NAMESPACE):
New.
[HAVE_TUNABLES] Include <unistd.h>.
[HAVE_TUNABLES] Include <elf/dl-tunables.h>.
[HAVE_TUNABLES] (TUNABLE_CALLBACK (set_ifunc)): Likewise.
[HAVE_TUNABLES] (init_cpu_features): Use TUNABLE_GET to set
IFUNC selection, data cache size, shared cache size and non
temporal threshold.
* sysdeps/x86/cpu-features.h (cpu_features): Add data_cache_size,
shared_cache_size and non_temporal_threshold.
Improve support for math function benchmarking. This patch adds
a feature that allows accurate benchmarking of traces extracted
from real workloads. This is done by iterating over all samples
rather than repeating each sample many times (which completely
ignores branch prediction and cache effects). A trace can be
added to existing math function inputs via
"## name: workload-<name>", followed by the trace.
* benchtests/README: Describe workload feature.
* benchtests/bench-skeleton.c (main): Add support for
benchmarking traces from workloads.
Remove one more string inline that was defined directly in string.h;
in the absence of the rest of the inlines, it broke the build.
Like other ifunc shims for these functions,
x86_64/multiarch/{mem,st}pcpy.c need to define __NO_STRING_INLINES and
NO_MEMPCPY_STPCPY_REDIRECT.
* string/string.h (__mempcpy_inline): Delete.
* sysdeps/x86_64/multiarch/mempcpy.c
* sysdeps/x86_64/multiarch/stpcpy.c:
Define NO_MEMPCPY_STPCPY_REDIRECT and __NO_STRING_INLINES
before including string.h.
Add powf() bench test with input which covers these cases:
- positive base to positive exponent
- exponent 0
- negative base to even exponent
- exponent 1
- exponent -1
- squared
- squareroot
- 1 to negative exponent
- -1 to negative exponent
- base 0
- -1 to even exponent
- small base
- small exponent
* benchtests/Makefile (bench-math): Add powf.
* benchtests/powf-inputs: New file.
These machine-dependent inline string functions have never been on by
default, and even if they were a good idea at the time they were
introduced, they haven't really been touched in ten to fifteen years
and probably aren't a good idea on current-gen processors. Current
thinking is that this class of optimization is best left to the
compiler.
* bits/string.h, string/bits/string.h
* sysdeps/aarch64/bits/string.h
* sysdeps/m68k/m680x0/m68020/bits/string.h
* sysdeps/s390/bits/string.h, sysdeps/sparc/bits/string.h
* sysdeps/x86/bits/string.h: Delete file.
* string/string.h: Don't include bits/string.h.
* string/bits/string3.h: Rename to bits/string_fortified.h.
No need to undef various symbols that the removed headers
might have defined as macros.
* string/Makefile (headers): Remove bits/string.h, change
bits/string3.h to bits/string_fortified.h.
* string/string-inlines.c: Update commentary. Remove definitions
of various macros that nothing looks at anymore. Don't directly
include bits/string.h. Set _STRING_INLINE_unaligned here, based on
compiler-predefined macros.
* string/strncat.c: If STRNCAT is not defined, or STRNCAT_PRIMARY
_is_ defined, provide internal hidden alias __strncat.
* include/string.h: Declare internal hidden alias __strncat.
Only forward __stpcpy to __builtin_stpcpy if __NO_STRING_INLINES is
not defined.
* include/bits/string3.h: Rename to bits/string_fortified.h,
update to match above.
* sysdeps/i386/string-inlines.c: Define compat symbols for
everything formerly defined by sysdeps/x86/bits/string.h.
Make existing definitions into compat symbols as well.
Remove some no-longer-necessary messing around with macros.
* sysdeps/powerpc/powerpc32/power4/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/mempcpy.c
* sysdeps/powerpc/powerpc64/multiarch/stpcpy.c
* sysdeps/s390/multiarch/mempcpy.c
No need to define _HAVE_STRING_ARCH_mempcpy.
Do define __NO_STRING_INLINES and NO_MEMPCPY_STPCPY_REDIRECT.
* sysdeps/i386/i686/multiarch/strncat-c.c
* sysdeps/s390/multiarch/strncat-c.c
* sysdeps/x86_64/multiarch/strncat-c.c
Define STRNCAT_PRIMARY. Don't change definition of libc_hidden_def.
This patch removes some MIPS code in glibc that was conditional on old
GCC versions no longer supported for building glibc.
Tested with build-many-glibcs.py.
* sysdeps/mips/atomic-machine.h (R10K_BEQZ_INSN): Remove.
[__GNUC_PREREQ (4, 8) || __mips16]: Make code unconditional.
[!__GNUC_PREREQ (4, 8) && !__mips16]: Remove conditional code.
* sysdeps/mips/math-tests.h
[_MIPS_SIM != _ABIO32 && !__GNUC_PREREQ (4, 9)]: Remove
conditional code.