glibc/sysdeps
H.J. Lu d2538b9156 x86-64: Optimize strrchr/wcsrchr with AVX2
Optimize strrchr/wcsrchr with AVX2 to check 32 bytes with vector
instructions.  It is as fast as SSE2 version for small data sizes
and up to 1X faster for large data sizes on Haswell.  Select AVX2
version on AVX2 machines where vzeroupper is preferred and AVX
unaligned load is fast.

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	strrchr-sse2, strrchr-avx2, wcsrchr-sse2 and wcsrchr-avx2.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add tests for __strrchr_avx2,
	__strrchr_sse2, __wcsrchr_avx2 and __wcsrchr_sse2.
	* sysdeps/x86_64/multiarch/strrchr-avx2.S: New file.
	* sysdeps/x86_64/multiarch/strrchr-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/strrchr.c: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsrchr.c: Likewise.
2017-06-09 05:45:52 -07:00
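
The commit message above describes scanning 32 bytes per step. As a rough illustration only, here is a portable scalar model of that idea in C; it is not the glibc assembly, and the names `block_strrchr`, `block_strrchr_agrees`, `highest_bit` and `BLOCK` are hypothetical. For each 32-byte block it builds a 32-bit mask of matching positions, the kind of mask a vector routine would typically get from a byte-wise compare followed by a move-mask instruction, and then takes the highest set bit as the last match in that block. Unlike a real vector load, this model stops reading at the terminator, so it never touches memory past the end of the string.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 32  /* bytes per step, the width of one 256-bit register */

/* Index of the highest set bit in a nonzero 32-bit mask. */
static int highest_bit(uint32_t mask)
{
    int i = 0;
    while (mask >>= 1)
        i++;
    return i;
}

/* Scalar model of a block-wise strrchr (hypothetical helper). */
char *block_strrchr(const char *s, int c)
{
    const char ch = (char)c;
    const char *last = NULL;

    for (;;) {
        uint32_t match = 0;  /* bit i set means s[i] == ch */
        int end = -1;        /* index of the terminator in this block, if any */

        for (int i = 0; i < BLOCK; i++) {
            if (s[i] == ch)
                match |= (uint32_t)1 << i;
            if (s[i] == '\0') {
                end = i;     /* bits above the terminator are never set */
                break;
            }
        }

        if (match != 0)
            last = s + highest_bit(match);  /* last match in this block */
        if (end >= 0)
            return (char *)last;            /* terminator seen: done */
        s += BLOCK;
    }
}

/* Returns 1 when the model agrees with the C library's strrchr. */
int block_strrchr_agrees(const char *s, int c)
{
    return block_strrchr(s, c) == strrchr(s, c);
}
```

Calling `block_strrchr("a/b/c", '/')` returns a pointer to the second slash, and `block_strrchr_agrees` can be used to spot-check the model against the system `strrchr` on other inputs, including a search for `'\0'`.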
aarch64 Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
alpha Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
arm Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
generic ld.so: Consolidate 2 strtouls into _dl_strtoul [BZ #21528] 2017-06-08 12:52:42 -07:00
gnu Regenerate sysdeps/gnu/errlist.c. 2017-06-04 15:27:14 -04:00
hppa Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
i386 Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
ia64 Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
ieee754 float128: Add strfromf128 2017-06-07 17:08:21 -03:00
init_array
m68k Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
mach Fix struct sigaltstack namespace (bug 21517). 2017-06-05 10:17:46 +00:00
microblaze Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
mips Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
nios2 Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
nptl fork: Remove bogus parent PID assertions [BZ #21386] 2017-05-12 16:04:16 +02:00
posix getaddrinfo: Eliminate another strdup call 2017-06-03 08:37:31 +02:00
powerpc Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
pthread Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
s390 S390: Use generic spinlock code. 2017-06-06 09:41:56 +02:00
sh Move shared pthread definitions to common headers 2017-05-09 17:49:17 -03:00
sparc Make LD_HWCAP_MASK usable for static binaries 2017-06-07 11:11:40 +05:30
tile Optimize generic spinlock code and use C11 like atomic macros. 2017-06-06 09:41:56 +02:00
unix aarch64: Fix undefined behavior in _dl_procinfo 2017-06-09 14:18:12 +05:30
wordsize-32
wordsize-64
x86 Make LD_HWCAP_MASK usable for static binaries 2017-06-07 11:11:40 +05:30
x86_64 x86-64: Optimize strrchr/wcsrchr with AVX2 2017-06-09 05:45:52 -07:00