Raghuveer Devulapalli 82a707aeb7 x86_64: Add strstr function with 512-bit EVEX
Adding a 512-bit EVEX version of strstr. The algorithm works as follows:

(1) We spend a few cycles at the beginning to peek into the needle. We
locate an edge in the needle (first occurrence of 2 consecutive distinct
characters) and also store the first 64 bytes into a zmm register.

(2) We search for the edge in the haystack by looking into one cache
line of the haystack at a time. This avoids having to read past a page
boundary which can cause a seg fault.

(3) If an edge is found in the haystack, we first compare the first
64 bytes of the needle (already stored in a zmm register) before we
proceed with a full string compare performed byte by byte.
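
The three steps above can be illustrated with a scalar sketch in portable C.
This is not the AVX-512 code from the commit: it uses no zmm registers and
does not scan one cache line at a time, and it calls strlen on the haystack
for simplicity. The names find_edge and strstr_edge_sketch are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: index of the first pair of consecutive distinct
   characters in the needle (the "edge").  Returns 0 when there is no
   such pair, e.g. for "aaa" or a one-character needle.  */
static size_t find_edge(const char *needle)
{
    for (size_t i = 0; needle[i] != '\0' && needle[i + 1] != '\0'; i++)
        if (needle[i] != needle[i + 1])
            return i;
    return 0;
}

/* Scalar sketch of the edge-based search (assumed name, not glibc API).  */
char *strstr_edge_sketch(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    if (needle[0] == '\0')
        return (char *)haystack;

    size_t nlen = strlen(needle);
    size_t hlen = strlen(haystack);
    if (hlen < nlen)
        return NULL;

    /* Step 1: peek into the needle and locate the edge.  */
    size_t edge = find_edge(needle);

    for (size_t pos = 0; pos + nlen <= hlen; pos++) {
        /* Step 2: look for the edge at the matching haystack offset.  */
        if (haystack[pos + edge] != needle[edge])
            continue;
        if (nlen > 1 && haystack[pos + edge + 1] != needle[edge + 1])
            continue;

        /* Step 3: compare the first (up to) 64 bytes, then the rest
           byte by byte.  */
        size_t head = nlen < 64 ? nlen : 64;
        if (memcmp(haystack + pos, needle, head) != 0)
            continue;
        size_t i = head;
        while (i < nlen && haystack[pos + i] == needle[i])
            i++;
        if (i == nlen)
            return (char *)(haystack + pos);
    }
    return NULL;
}
```

For a needle such as "aab" the edge is at index 1 ('a' followed by 'b'), so
candidate positions in a haystack of repeated 'a' characters are rejected
after two byte comparisons instead of a full compare.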

Benchmarking results: (old = strstr_sse2_unaligned, new = strstr_avx512)

Geometric mean of all benchmarks: new / old =  0.66

Difficult skiptable(0) : new / old =  0.02
Difficult skiptable(1) : new / old =  0.01
Difficult 2-way : new / old =  0.25
Difficult testing first 2 : new / old =  1.26
Difficult skiptable(0) : new / old =  0.05
Difficult skiptable(1) : new / old =  0.06
Difficult 2-way : new / old =  0.26
Difficult testing first 2 : new / old =  1.05
Difficult skiptable(0) : new / old =  0.42
Difficult skiptable(1) : new / old =  0.24
Difficult 2-way : new / old =  0.21
Difficult testing first 2 : new / old =  1.04
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>

(cherry picked from commit 5082a287d5)

x86: Remove __mmask intrinsics in strstr-avx512.c

The intrinsics are not available before GCC7 and using standard
operators generates code of equivalent or better quality.

Removed:
    _cvtmask64_u64
    _kshiftri_mask64
    _kand_mask64
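
As a rough illustration of what "standard operators" means here, the sketch
below treats a 64-bit mask as a plain unsigned integer (an assumption of this
example) and replaces a shift-right followed by an AND with ordinary C
operators. The commit text does not show how the three intrinsics were
combined in strstr-avx512.c, so the combination and the function name
shift_and_mask are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-operator counterpart of a hypothetical expression such as
   _cvtmask64_u64 (_kand_mask64 (_kshiftri_mask64 (a, n), b)).  */
uint64_t shift_and_mask(uint64_t a, unsigned int n, uint64_t b)
{
    /* Shifting a 64-bit value by 64 or more is undefined in C,
       so this sketch requires n < 64.  */
    assert(n < 64);
    return (a >> n) & b;
}
```

For example, shift_and_mask(0xF0, 4, 0x5) shifts 0xF0 down to 0xF and keeps
bits 0 and 2, giving 0x5.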

Geometric Mean of 5 Runs of Full Benchmark Suite New / Old: 0.958

(cherry picked from commit f2698954ff)
2022-07-18 20:45:20 -07:00
argp
assert
benchtests
bits elf: Issue la_symbind for bind-now (BZ #23734) 2022-04-12 13:32:59 -04:00
catgets catgets: Use 64 bit stat for __open_catalog (BZ# 29211) 2022-06-01 13:36:38 -03:00
ChangeLog.old
conform
crypt
csu csu: Implement and use _dl_early_allocate during static startup 2022-05-19 12:13:53 +02:00
ctype
debug misc: Fix rare fortify crash on wchar funcs. [BZ 29030] 2022-04-25 18:44:27 +05:30
dirent
dlfcn dlfcn: Implement the RTLD_DI_PHDR request type for dlinfo 2022-05-11 20:56:53 +02:00
elf rtld: Use generic argv adjustment in ld.so [BZ #23293] 2022-05-19 16:48:47 +01:00
gmon
gnulib
grp
gshadow
hesiod
htl
hurd
iconv iconv: Use 64 bit stat for gconv_parseconfdir (BZ# 29213) 2022-06-01 13:36:56 -03:00
iconvdata
include Fix deadlock when pthread_atfork handler calls pthread_atfork or dlclose 2022-05-30 12:38:32 +02:00
inet inet: Use 64 bit stat for ruserpass (BZ# 29210) 2022-06-01 13:34:51 -03:00
intl
io linux: Fix fchmodat with AT_SYMLINK_NOFOLLOW for 64 bit time_t (BZ#29097) 2022-04-28 10:10:30 -03:00
libio Make sure that the fortified function conditionals are constant 2022-03-11 20:36:24 +05:30
locale localedef: Handle symbolic links when generating locale-archive 2022-03-03 11:58:03 +01:00
localedata
login
mach
malloc
manual dlfcn: Implement the RTLD_DI_PHDR request type for dlinfo 2022-05-11 20:56:53 +02:00
math
mathvec
misc misc: Use 64 bit stat for getusershell (BZ# 29204) 2022-06-01 13:34:49 -03:00
nis
nptl nptl: Fix ___pthread_unregister_cancel_restore asynchronous restore 2022-07-13 13:23:20 -03:00
nptl_db
nscd
nss nss: handle stat failure in check_reload_and_get (BZ #28752) 2022-06-13 18:15:36 -04:00
po
posix Fix deadlock when pthread_atfork handler calls pthread_atfork or dlclose 2022-05-30 12:38:32 +02:00
pwd
resolv
resource
rt
scripts csu: Implement and use _dl_early_allocate during static startup 2022-05-19 12:13:53 +02:00
setjmp
shadow
signal
socket socket: Fix mistyped define statement in socket/sys/socket.h (BZ #29225) 2022-06-06 12:53:58 -03:00
soft-fp
stdio-common
stdlib fortify: Fix spurious warning with realpath 2022-03-11 20:36:24 +05:30
string string.h: fix __fortified_attr_access macro call [BZ #29162] 2022-05-23 14:06:31 +05:30
sunrpc
support debug: Synchronize feature guards in fortified functions [BZ #28746] 2022-03-11 20:36:24 +05:30
sysdeps x86_64: Add strstr function with 512-bit EVEX 2022-07-18 20:45:20 -07:00
sysvipc
termios
time
timezone
wcsmbs debug: Synchronize feature guards in fortified functions [BZ #28746] 2022-03-11 20:36:24 +05:30
wctype
.gitattributes
.gitignore
abi-tags
aclocal.m4
config.h.in
config.make.in
configure Default to --with-default-link=no (bug 25812) 2022-04-22 11:31:14 +02:00
configure.ac Default to --with-default-link=no (bug 25812) 2022-04-22 11:31:14 +02:00
COPYING
COPYING.LIB
extra-lib.mk
gen-locales.mk
INSTALL INSTALL: Rephrase -with-default-link documentation 2022-04-26 15:27:43 +02:00
libc-abis
libof-iterator.mk
LICENSES
MAINTAINERS
Makeconfig
Makefile
Makefile.help
Makefile.in
Makerules debug: Autogenerate _FORTIFY_SOURCE tests 2022-03-11 20:36:24 +05:30
NEWS nios2: Remove _dl_skip_args usage (BZ# 29187) 2022-06-10 09:15:00 -03:00
o-iterator.mk
README
Rules
shlib-versions nss: Do not mention NSS test modules in <gnu/lib-names.h> 2022-03-11 11:13:34 +01:00
test-skeleton.c
version.h

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arc*-*-linux-gnu
	arm-*-linux-gnueabi
	csky-*-linux-gnuabiv2
	hppa-*-linux-gnu
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	riscv32-*-linux-gnu
	riscv64-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see https://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the WWW pages for
the C library at https://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see https://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.