H.J. Lu ef9c4cb6c7 x86-64: Optimize wmemset with SSE2/AVX2/AVX512
The difference between memset and wmemset is byte vs int.  Add stubs
to SSE2/AVX2/AVX512 memset for wmemset with updated constant and size:

SSE2 wmemset:
	shl    $0x2,%rdx
	movd   %esi,%xmm0
	mov    %rdi,%rax
	pshufd $0x0,%xmm0,%xmm0
	jmp	entry_from_wmemset

SSE2 memset:
	movd   %esi,%xmm0
	mov    %rdi,%rax
	punpcklbw %xmm0,%xmm0
	punpcklwd %xmm0,%xmm0
	pshufd $0x0,%xmm0,%xmm0
entry_from_wmemset:

Since the ERMS versions of wmemset require "rep stosl" instead of
"rep stosb", only the vector store stubs of SSE2/AVX2/AVX512 wmemset
are added.  The SSE2 wmemset is about 3X faster and the AVX2 wmemset
is about 6X faster on Haswell.
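
The stub idea above can be sketched in portable C.  This is only an
illustration, not the glibc implementation: fill_pattern32,
wmemset_sketch and memset_sketch are hypothetical names, the scalar
loop stands in for the vector stores, and a 4-byte wchar_t is assumed
(as on x86-64 GNU/Linux).  The two entry points differ only in how the
32-bit pattern is built and in whether the count is scaled to bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Assumption: wchar_t is 4 bytes, as on x86-64 GNU/Linux.  */
_Static_assert (sizeof (wchar_t) == 4, "sketch assumes 4-byte wchar_t");

/* Hypothetical stand-in for the shared code after entry_from_wmemset:
   store a 32-bit pattern repeatedly over BYTES bytes.  */
static void
fill_pattern32 (void *dst, uint32_t pattern, size_t bytes)
{
  unsigned char *p = dst;
  assert (p != NULL || bytes == 0);
  while (bytes >= sizeof pattern)
    {
      memcpy (p, &pattern, sizeof pattern);
      p += sizeof pattern;
      bytes -= sizeof pattern;
    }
  /* Tail of 1-3 bytes: only reachable from memset_sketch, where all
     four pattern bytes are identical.  */
  if (bytes > 0)
    memcpy (p, &pattern, bytes);
}

/* wmemset stub: the value is already 32 bits wide, so it is used as
   is (movd + pshufd), and the element count becomes a byte count
   (shl $0x2, %rdx).  */
static wchar_t *
wmemset_sketch (wchar_t *s, wchar_t c, size_t n)
{
  uint32_t pattern = (uint32_t) c;
  fill_pattern32 (s, pattern, n << 2);
  return s;
}

/* memset stub: the low byte is first replicated into all four bytes
   (punpcklbw + punpcklwd); the count is already in bytes.  */
static void *
memset_sketch (void *s, int c, size_t n)
{
  uint32_t pattern = (uint32_t) (unsigned char) c * 0x01010101u;
  fill_pattern32 (s, pattern, n);
  return s;
}
```

Both functions return the destination pointer, mirroring the
"mov %rdi,%rax" in each stub.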

	* include/wchar.h (__wmemset_chk): New.
	* sysdeps/x86_64/memset.S (VDUP_TO_VEC0_AND_SET_RETURN): Renamed
	to MEMSET_VDUP_TO_VEC0_AND_SET_RETURN.
	(WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN): New.
	(WMEMSET_CHK_SYMBOL): Likewise.
	(WMEMSET_SYMBOL): Likewise.
	(__wmemset): Add hidden definition.
	(wmemset): Add weak hidden definition.
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	wmemset_chk-nonshared.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add __wmemset_sse2_unaligned,
	__wmemset_avx2_unaligned, __wmemset_avx512_unaligned,
	__wmemset_chk_sse2_unaligned, __wmemset_chk_avx2_unaligned
	and __wmemset_chk_avx512_unaligned.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
	(VDUP_TO_VEC0_AND_SET_RETURN): Renamed to ...
	(MEMSET_VDUP_TO_VEC0_AND_SET_RETURN): This.
	(WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN): New.
	(WMEMSET_SYMBOL): Likewise.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
	(VDUP_TO_VEC0_AND_SET_RETURN): Renamed to ...
	(MEMSET_VDUP_TO_VEC0_AND_SET_RETURN): This.
	(WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN): New.
	(WMEMSET_SYMBOL): Likewise.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Updated.
	(WMEMSET_CHK_SYMBOL): New.
	(WMEMSET_CHK_SYMBOL (__wmemset_chk, unaligned)): Likewise.
	(WMEMSET_SYMBOL (__wmemset, unaligned)): Likewise.
	* sysdeps/x86_64/multiarch/memset.S (WMEMSET_SYMBOL): New.
	(libc_hidden_builtin_def): Also define __GI_wmemset and
	__GI___wmemset.
	(weak_alias): New.
	* sysdeps/x86_64/multiarch/wmemset.c: New file.
	* sysdeps/x86_64/multiarch/wmemset.h: Likewise.
	* sysdeps/x86_64/multiarch/wmemset_chk-nonshared.S: Likewise.
	* sysdeps/x86_64/multiarch/wmemset_chk.c: Likewise.
	* sysdeps/x86_64/wmemset.c: Likewise.
	* sysdeps/x86_64/wmemset_chk.c: Likewise.
2017-06-05 11:09:59 -07:00
argp Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
assert Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
benchtests benchtests: Add more tests for memrchr 2017-06-04 09:45:09 -07:00
bits Define SIG_HOLD for XPG4 (bug 21538). 2017-06-05 10:19:03 +00:00
catgets Update copyright dates not handled by scripts/update-copyrights. 2017-01-01 00:26:24 +00:00
conform conformtest: Correct signal.h expectations for XPG4 / XPG42. 2017-06-01 17:17:43 +00:00
crypt Add missing header files throughout the testsuite. 2017-02-16 17:33:18 -05:00
csu
ctype Remove C++ namespace handling from glibc headers. 2017-03-16 13:31:57 +00:00
debug Fix struct sigaltstack namespace (bug 21517). 2017-06-05 10:17:46 +00:00
dirent
dlfcn Miscellaneous low-risk changes preparing for _ISOMAC testsuite. 2017-03-01 20:32:50 -05:00
elf
gmon
gnulib Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
grp Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
gshadow Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
hesiod
hurd
iconv
iconvdata Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
include x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
inet
intl Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
io Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
libidn
libio libio: Avoid dup already opened file descriptor [BZ#21393] 2017-05-22 18:13:35 -03:00
locale
localedata Bug 20686: Add el_GR@euro support. 2017-05-03 15:37:04 -04:00
login Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
mach
malloc Add internal facility for dynamic array handling 2017-06-02 11:59:28 +02:00
manual manual: Provide consistent errno documentation. 2017-06-02 13:49:20 -07:00
math float128: Add wrappers to override ldbl-128 as float128. 2017-05-25 09:01:37 -03:00
mathvec
misc posix: Add missing build flags for p{write,read}v2 2017-06-02 11:12:29 -03:00
nis Include shlib-compat.h in many sunrpc/nis source files. 2017-06-04 11:31:28 -04:00
nptl
nptl_db Narrowing the visibility of libc-internal.h even further. 2017-03-01 20:33:46 -05:00
nscd Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
nss
po
posix
pwd
resolv
resource
rt Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
scripts Also create and use ldbl-compat-choose.h. 2017-05-19 11:30:26 +00:00
setjmp Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
shadow
signal
socket Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
soft-fp
stdio-common
stdlib Include sys/param.h in stdlib/gmp-impl.h instead of redefining MAX/MIN 2017-06-01 20:44:22 -03:00
streams
string
sunrpc
support
sysdeps x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
sysvipc Fix test-sysvsem on some platforms 2017-01-02 18:53:50 -02:00
termios
time Remove C++ namespace handling from glibc headers. 2017-03-16 13:31:57 +00:00
timezone
wcsmbs Do not use wildcard symbol names for public versions in Versions files. 2017-04-20 20:35:21 +00:00
wctype
.gitattributes
.gitignore Add *.pyc to .gitignore 2015-05-18 15:26:26 +05:30
abi-tags
aclocal.m4 Work even with compilers which enable -fstack-protector by default [BZ #7065] 2016-12-26 10:10:58 +01:00
BUGS
ChangeLog x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
ChangeLog.1
ChangeLog.2 * Makefile (distribute): Add ChangeLog.[0-9]. 1995-04-14 03:52:54 +00:00
ChangeLog.3 * Makefile (distribute): Add ChangeLog.[0-9]. 1995-04-14 03:52:54 +00:00
ChangeLog.4
ChangeLog.5 * sysdeps/posix/getaddrinfo.c: Implement configuration file 2006-05-04 06:38:07 +00:00
ChangeLog.6
ChangeLog.7
ChangeLog.8 ChangeLog: change Winblowz to Windows 2016-08-10 00:49:28 +08:00
ChangeLog.9
ChangeLog.10
ChangeLog.11 ChangeLog: change Winblowz to Windows 2016-08-10 00:49:28 +08:00
ChangeLog.12 Revert "ChangeLogs: convert to utf-8" 2016-02-12 16:35:27 -05:00
ChangeLog.13
ChangeLog.14
ChangeLog.15 Split out ChangeLog.15 at 2.3 branch point 2005-02-16 07:34:17 +00:00
ChangeLog.16 Fix typo in name 2012-06-21 16:45:27 +02:00
ChangeLog.17
ChangeLog.old-ports Move ports/ChangeLog* files to ChangeLog.old-ports*, remove ports/ directory. 2014-04-30 10:40:29 -07:00
ChangeLog.old-ports-aarch64
ChangeLog.old-ports-aix
ChangeLog.old-ports-alpha ChangeLog: fix BZ style to be consistent and match majority of existing code 2017-04-03 15:18:07 -04:00
ChangeLog.old-ports-am33
ChangeLog.old-ports-arm Move ports/ChangeLog* files to ChangeLog.old-ports*, remove ports/ directory. 2014-04-30 10:40:29 -07:00
ChangeLog.old-ports-cris Move ports/ChangeLog* files to ChangeLog.old-ports*, remove ports/ directory. 2014-04-30 10:40:29 -07:00
ChangeLog.old-ports-hppa
ChangeLog.old-ports-ia64
ChangeLog.old-ports-linux-generic Move ports/ChangeLog* files to ChangeLog.old-ports*, remove ports/ directory. 2014-04-30 10:40:29 -07:00
ChangeLog.old-ports-m68k
ChangeLog.old-ports-microblaze Move ports/ChangeLog* files to ChangeLog.old-ports*, remove ports/ directory. 2014-04-30 10:40:29 -07:00
ChangeLog.old-ports-mips ChangeLog: fix BZ style to be consistent and match majority of existing code 2017-04-03 15:18:07 -04:00
ChangeLog.old-ports-powerpc
ChangeLog.old-ports-tile
config.h.in
config.make.in
configure
configure.ac Deprecate libnsl by default (only shared library will be 2017-03-21 15:14:27 +01:00
CONFORMANCE Move __STDC_* predefined macros from features.h to stdc-predef.h. 2012-02-22 12:53:04 +00:00
COPYING Update to latest versions of GPL-2.0 and LGPL-2.1 2013-09-09 12:52:48 +10:00
COPYING.LIB Update to latest versions of GPL-2.0 and LGPL-2.1 2013-09-09 12:52:48 +10:00
extra-lib.mk Rename cppflags-iterator.mk to libof-iterator.mk, remove extra-modules.mk. 2017-05-09 07:06:29 -04:00
gen-locales.mk
INSTALL
libc-abis
libof-iterator.mk
LICENSES
MAINTAINERS
Makeconfig Support dl-tunables.list in subdirectories 2017-05-25 05:41:18 -07:00
Makefile
Makefile.in New make target to only build benchmark binaries 2016-04-20 10:23:28 +05:30
Makerules Also create and use ldbl-compat-choose.h. 2017-05-19 11:30:26 +00:00
NAMESPACE
NEWS
o-iterator.mk Fri Mar 17 12:58:37 1995 Roland McGrath <roland@churchy.gnu.ai.mit.edu> 1995-03-17 18:42:51 +00:00
README
README.pretty-printers
README.tunables tunables: Add support for tunables of uint64_t type 2017-05-17 13:11:55 +05:30
Rules Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
shlib-versions This is update for configure, build and install of vector math library. 2015-05-14 18:07:06 +03:00
test-skeleton.c Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
version.h
WUR-REPORT * posix/unistd.h (setuid, setreuid, seteuid, setresuid): 2012-08-01 18:12:58 +02:00

This directory contains the sources of the GNU C Library.
See the file "version.h" for what release version you have.
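
For an installed library rather than a source tree, the version can
also be queried from a program.  The following is a minimal sketch;
describe_libc is a hypothetical helper name, while __GLIBC__,
__GLIBC_MINOR__ and gnu_get_libc_version are provided by the library's
headers.  The preprocessor guard is there so the file still compiles
against a different C library.

```c
#include <assert.h>
#include <stdio.h>   /* On glibc, this also makes __GLIBC__ visible.  */

#ifdef __GLIBC__
# include <gnu/libc-version.h>
#endif

/* Hypothetical helper: write a one-line description of the C library
   into BUF and return the result of snprintf.  */
static int
describe_libc (char *buf, size_t size)
{
  assert (buf != NULL && size > 0);
#ifdef __GLIBC__
  /* The macros give the version the program was compiled against;
     gnu_get_libc_version reports the version in use at run time.  */
  return snprintf (buf, size,
                   "built against glibc %d.%d, running on glibc %s",
                   __GLIBC__, __GLIBC_MINOR__, gnu_get_libc_version ());
#else
  return snprintf (buf, size, "not built against the GNU C Library");
#endif
}
```

The two versions can differ when a program is run on a system with a
newer shared library than the one it was built with.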

The GNU C Library is the standard system C library for all GNU systems,
and is an important part of what makes up a GNU system.  It provides the
system API for all programs written in C and C-compatible languages such
as C++ and Objective-C; the runtime facilities of other programming
languages use the C library to access the underlying operating system.

In GNU/Linux systems, the C library works with the Linux kernel to
implement the operating system behavior seen by user applications.
In GNU/Hurd systems, it works with a microkernel and Hurd servers.

The GNU C Library implements much of the POSIX.1 functionality in the
GNU/Hurd system, using configurations i[4567]86-*-gnu.  The current
GNU/Hurd support requires out-of-tree patches that will eventually be
incorporated into an official GNU C Library release.

When working with Linux kernels, this version of the GNU C Library
requires Linux kernel version 3.2 or later.
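
As an illustration of this requirement, a program can compare the
running kernel's release string against 3.2.  This is a sketch only
and does not describe how the library itself enforces the minimum;
release_at_least and running_kernel_is_supported are hypothetical
names, and the code assumes the release string starts with
"major.minor", as Linux release strings normally do.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/* Hypothetical helper: return 1 if RELEASE (for example
   "4.9.0-3-amd64") is at least MAJOR.MINOR, 0 if it is older, and -1
   if it does not start with two dot-separated numbers.  */
static int
release_at_least (const char *release, int major, int minor)
{
  int have_major, have_minor;
  assert (release != NULL);
  if (sscanf (release, "%d.%d", &have_major, &have_minor) != 2)
    return -1;
  if (have_major != major)
    return have_major > major;
  return have_minor >= minor;
}

/* Check the running kernel against the 3.2 minimum stated above.
   Returns -1 if uname fails or the release string is unparsable.  */
static int
running_kernel_is_supported (void)
{
  struct utsname u;
  if (uname (&u) != 0)
    return -1;
  return release_at_least (u.release, 3, 2);
}
```

Only the first two components are compared, so patch levels and
distribution suffixes in the release string are ignored.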

Also note that the shared version of the libgcc_s library must be
installed for the pthread library to work correctly.

The GNU C Library supports these configurations for using Linux kernels:

	aarch64*-*-linux-gnu
	alpha*-*-linux-gnu
	arm-*-linux-gnueabi
	hppa-*-linux-gnu	Not currently functional without patches.
	i[4567]86-*-linux-gnu
	x86_64-*-linux-gnu	Can build either x86_64 or x32
	ia64-*-linux-gnu
	m68k-*-linux-gnu
	microblaze*-*-linux-gnu
	mips-*-linux-gnu
	mips64-*-linux-gnu
	powerpc-*-linux-gnu	Hardware or software floating point, BE only.
	powerpc64*-*-linux-gnu	Big-endian and little-endian.
	s390-*-linux-gnu
	s390x-*-linux-gnu
	sh[34]-*-linux-gnu
	sparc*-*-linux-gnu
	sparc64*-*-linux-gnu
	tilegx-*-linux-gnu
	tilepro-*-linux-gnu

If you are interested in doing a port, please contact the glibc
maintainers; see http://www.gnu.org/software/libc/ for more
information.

See the file INSTALL to find out how to configure, build, and install
the GNU C Library.  You might also consider reading the WWW pages for
the C library at http://www.gnu.org/software/libc/.

The GNU C Library is (almost) completely documented by the Texinfo manual
found in the `manual/' subdirectory.  The manual is still being updated
and contains some known errors and omissions; we regret that we do not
have the resources to work on the manual as much as we would like.  For
corrections to the manual, please file a bug in the `manual' component,
following the bug-reporting instructions below.  Please be sure to check
the manual in the current development sources to see if your problem has
already been corrected.

Please see http://www.gnu.org/software/libc/bugs.html for bug reporting
information.  We are now using the Bugzilla system to track all bug reports.
This web page gives detailed information on how to report bugs properly.

The GNU C Library is free software.  See the file COPYING.LIB for copying
conditions, and LICENSES for notices about a few contributions that require
these additional notices to be distributed.  License copyright years may be
listed using range notation, e.g., 1996-2015, indicating that every year in
the range, inclusive, is a copyrightable year that would otherwise be listed
individually.