Leonardo Sandoval 1457016337 x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2
Optimize x86-64 strcmp/wcscmp and strncmp/wcsncmp with AVX2, using vector
comparison as much as possible. Peak performance improvements observed on
a Skylake machine: 9x, 3x, 2.5x and 5.5x for strcmp, strncmp, wcscmp and
wcsncmp, respectively. The longer the comparison, the greater the benefit
of the AVX2 functions, except for strcmp, where the peak is observed at a
length of 32 bytes. Select AVX2 strcmp/wcscmp on AVX2 machines where
vzeroupper is preferred and AVX unaligned load is fast.

NB: It uses TZCNT instead of BSF since TZCNT produces the same result
as BSF for non-zero input.  TZCNT is faster than BSF and is executed
as BSF if the machine doesn't support TZCNT.
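
The role of the trailing-zero count can be illustrated with a minimal
scalar sketch in C.  It is not taken from the glibc sources: the names
mismatch_mask32, first_set_bit and compare_block32 are hypothetical, the
loop merely stands in for a 32-byte vector compare followed by a
move-mask, and __builtin_ctz is a GCC/Clang builtin that the compiler may
lower to TZCNT or BSF depending on the target.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 32-bit mask with bit I set when A[I] differs from B[I] or
   A[I] is the NUL terminator.  Hypothetical scalar stand-in for a
   32-byte vector compare plus move-mask.  */
static uint32_t
mismatch_mask32 (const unsigned char *a, const unsigned char *b)
{
  uint32_t mask = 0;
  for (unsigned int i = 0; i < 32; i++)
    if (a[i] != b[i] || a[i] == 0)
      mask |= UINT32_C (1) << i;
  return mask;
}

/* Index of the lowest set bit.  The result of __builtin_ctz is
   undefined for zero, so the caller must pass a non-zero mask.  */
static unsigned int
first_set_bit (uint32_t mask)
{
  assert (mask != 0);
  return (unsigned int) __builtin_ctz (mask);
}

/* Compare two 32-byte blocks as strcmp would: the lowest set bit of
   the mask is the first byte that decides the result.  */
static int
compare_block32 (const unsigned char *a, const unsigned char *b)
{
  uint32_t mask = mismatch_mask32 (a, b);
  if (mask == 0)
    return 0;	/* No difference and no terminator in this block.  */
  unsigned int i = first_set_bit (mask);
  return (int) a[i] - (int) b[i];
}
```

Because the mask is only passed to the bit scan after it has been checked
to be non-zero, the case where TZCNT and BSF behave differently (zero
input) never arises in this sketch.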

	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	strcmp-avx2, strncmp-avx2, wcscmp-avx2, wcscmp-sse2, wcsncmp-avx2 and
	wcsncmp-sse2.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add tests for __strcmp_avx2,
	__strncmp_avx2,	__wcscmp_avx2, __wcsncmp_avx2, __wcscmp_sse2
	and __wcsncmp_sse2.
	* sysdeps/x86_64/multiarch/strcmp.c (OPTIMIZE (avx2)):
	(IFUNC_SELECTOR): Return OPTIMIZE (avx2) on AVX2 machines if
	AVX unaligned load is fast and vzeroupper is preferred.
	* sysdeps/x86_64/multiarch/strncmp.c: Likewise.
	* sysdeps/x86_64/multiarch/strcmp-avx2.S: New file.
	* sysdeps/x86_64/multiarch/strncmp-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscmp-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscmp-sse2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcscmp.c: Likewise.
	* sysdeps/x86_64/multiarch/wcsncmp-avx2.S: Likewise.
	* sysdeps/x86_64/multiarch/wcsncmp-sse2.c: Likewise.
	* sysdeps/x86_64/multiarch/wcsncmp.c: Likewise.
	* sysdeps/x86_64/wcscmp.S (__wcscmp): Add alias only if __wcscmp
	is undefined.
2018-06-01 16:32:43 -05:00
argp
assert
benchtests
bits powerpc: Fix the compiler type used with C++ when -mabi=ieeelongdouble 2018-05-11 18:05:03 -03:00
catgets
ChangeLog.old
conform
crypt
csu
ctype
debug libio: Avoid _allocate_buffer, _free_buffer function pointers [BZ #23236] 2018-06-01 10:41:03 +02:00
dirent
dlfcn
elf static-PIE: Update DT_DEBUG for debugger [BZ #23206] 2018-05-29 06:33:57 -07:00
gmon
gnulib
grp
gshadow
hesiod
htl
hurd
iconv
iconvdata
include Switch IDNA implementation to libidn2 [BZ #19728] [BZ #19729] [BZ #22247] 2018-05-23 15:27:24 +02:00
inet Switch IDNA implementation to libidn2 [BZ #19728] [BZ #19729] [BZ #22247] 2018-05-23 15:27:24 +02:00
intl
io
libio libio: Avoid _allocate_buffer, _free_buffer function pointers [BZ #23236] 2018-06-01 10:41:03 +02:00
locale
localedata gd_GB: Fix typo in abbreviated "May" (bug 23152). 2018-05-11 00:00:10 +02:00
login
mach
malloc
manual Add narrowing divide functions. 2018-05-17 00:40:52 +00:00
math Fix parameter type in C++ version of iseqsig (bug 23171) 2018-05-24 13:12:39 -03:00
mathvec
misc Implement allocate_once for atomic initialization with allocation 2018-05-23 15:27:01 +02:00
nis
nptl nptl: Remove __ASSUME_PRIVATE_FUTEX 2018-05-17 04:25:10 -07:00
nptl_db
nscd Switch IDNA implementation to libidn2 [BZ #19728] [BZ #19729] [BZ #22247] 2018-05-23 15:27:24 +02:00
nss
po
posix
pwd
resolv Switch IDNA implementation to libidn2 [BZ #19728] [BZ #19729] [BZ #22247] 2018-05-23 15:27:24 +02:00
resource
rt
scripts
setjmp
shadow
signal
socket
soft-fp Make powerpc-nofpu __sqrtsf2, __sqrtdf2 compat symbols (bug 18473). 2018-06-01 17:25:12 +00:00
stdio-common
stdlib stdlib: Additional tests need generated locale dependencies 2018-05-29 10:34:53 +02:00
streams
string Add a test case for [BZ #23196] 2018-05-23 04:00:11 -07:00
sunrpc sunrpc: Remove stray exports without --enable-obsolete-rpc [BZ #23166] 2018-05-11 15:36:50 +02:00
support support: Add wrappers for pthread_barrierattr_t 2018-05-29 15:37:00 +02:00
sysdeps x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2 2018-06-01 16:32:43 -05:00
sysvipc
termios
time Fix year 2039 bug for localtime with 64-bit time_t (bug 22639). 2018-05-18 11:57:15 +00:00
timezone
wcsmbs math: Merge strtod_nan_*.h into math-type-macros-*.h 2018-05-16 06:03:08 +02:00
wctype
.gitattributes
.gitignore
abi-tags
aclocal.m4
ChangeLog x86-64: Optimize strcmp/wcscmp and strncmp/wcsncmp with AVX2 2018-06-01 16:32:43 -05:00
config.h.in Switch IDNA implementation to libidn2 [BZ #19728] [BZ #19729] [BZ #22247] 2018-05-23 15:27:24 +02:00
config.make.in
configure
configure.ac
COPYING
COPYING.LIB
extra-lib.mk
gen-locales.mk
INSTALL
libc-abis
libof-iterator.mk
LICENSES Switch IDNA implementation to libidn2 [BZ #19728] [BZ #19729] [BZ #22247] 2018-05-23 15:27:24 +02:00
MAINTAINERS
Makeconfig
Makefile
Makefile.in
Makerules
NEWS Add references to CVE-2017-18269, CVE-2018-11236, CVE-2018-11237 2018-05-24 12:19:11 +02:00
o-iterator.mk
README
Rules
shlib-versions
test-skeleton.c
version.h

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.
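
For an installed library rather than this source tree, the version can
also be queried from a program.  The following is a minimal sketch,
assuming a system that uses the GNU C Library: gnu_get_libc_version and
the __GLIBC__ and __GLIBC_MINOR__ macros are provided by the library,
while the helper names format_compile_time_version and runtime_version
are hypothetical.

```c
#include <assert.h>
#include <gnu/libc-version.h>
#include <stddef.h>
#include <stdio.h>

/* Write the version of the headers this program was compiled against
   into BUF as "MAJOR.MINOR".  Returns the number of characters
   written, not counting the terminating NUL.  */
static int
format_compile_time_version (char *buf, size_t len)
{
  int n = snprintf (buf, len, "%d.%d", __GLIBC__, __GLIBC_MINOR__);
  assert (n > 0 && (size_t) n < len);
  return n;
}

/* Return the version string of the library the program is actually
   running against, which may be newer than the headers.  */
static const char *
runtime_version (void)
{
  return gnu_get_libc_version ();
}
```

The two values can differ when a program built against older headers
runs on a system with a newer shared library.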

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective-C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.
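
A program can check the running kernel against this minimum with the
POSIX uname function.  The sketch below is only an illustration: the
helper names release_at_least and running_kernel_is_supported are
hypothetical, and it assumes that the release string starts with
"MAJOR.MINOR", as it does on typical Linux systems.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Return 1 if RELEASE (for example "5.15.0-91-generic") is at least
   MAJOR.MINOR, 0 if it is older, and -1 if it cannot be parsed.  */
static int
release_at_least (const char *release, int major, int minor)
{
  int have_major, have_minor;

  assert (release != NULL);
  if (sscanf (release, "%d.%d", &have_major, &have_minor) != 2)
    return -1;
  if (have_major != major)
    return have_major > major;
  return have_minor >= minor;
}

/* Check the running kernel against the 3.2 minimum stated above.
   Returns -1 if uname fails or the release string is not understood.  */
static int
running_kernel_is_supported (void)
{
  struct utsname u;

  if (uname (&u) != 0)
    return -1;
  return release_at_least (u.release, 3, 2);
}
```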

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arm-*-linux-gnueabi
	hppa-*-linux-gnu
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	riscv64-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see http://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the web pages for
the C library at http://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see http://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.