glibc/include
H.J. Lu ef9c4cb6c7 x86-64: Optimize wmemset with SSE2/AVX2/AVX512
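memset and wmemset differ only in element width: memset stores bytes,
wmemset stores 4-byte ints.  Add wmemset stubs to the SSE2/AVX2/AVX512
memset that set up the broadcast constant and the byte count, then jump
into the shared memset code: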

SSE2 wmemset:
	shl    $0x2,%rdx
	movd   %esi,%xmm0
	mov    %rdi,%rax
	pshufd $0x0,%xmm0,%xmm0
	jmp	entry_from_wmemset

SSE2 memset:
	movd   %esi,%xmm0
	mov    %rdi,%rax
	punpcklbw %xmm0,%xmm0
	punpcklwd %xmm0,%xmm0
	pshufd $0x0,%xmm0,%xmm0
entry_from_wmemset:

Since the ERMS versions of wmemset require "rep stosl" instead of
"rep stosb", only the vector store stubs of SSE2/AVX2/AVX512 wmemset
are added.  The SSE2 wmemset is about 3X faster and the AVX2 wmemset
is about 6X faster on Haswell.
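
The stub idea in the listing above can be sketched in portable C.  This is
only an illustration, not the glibc code: vec16, broadcast_byte,
broadcast_dword, store_bytes, sketch_memset and sketch_wmemset are
hypothetical names, the 16-byte struct stands in for an SSE2 register, and
uint32_t stands in for the 4-byte wchar_t used on x86-64.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 16-byte SSE2 register.  */
typedef struct { unsigned char b[16]; } vec16;

/* memset flavour: replicate the low byte of C into all 16 lanes
   (the job of punpcklbw + punpcklwd + pshufd).  */
static vec16 broadcast_byte(int c)
{
    vec16 v;
    for (size_t i = 0; i < sizeof v.b; i++)
        v.b[i] = (unsigned char) c;
    return v;
}

/* wmemset flavour: replicate the 32-bit value C into four lanes
   (pshufd alone is enough).  */
static vec16 broadcast_dword(uint32_t c)
{
    vec16 v;
    for (size_t i = 0; i < sizeof v.b; i += sizeof c)
        memcpy(v.b + i, &c, sizeof c);
    return v;
}

/* Shared store path, playing the role of entry_from_wmemset: it only
   sees a broadcast vector and a length in bytes.  */
static void store_bytes(void *dst, vec16 v, size_t nbytes)
{
    unsigned char *p = dst;
    while (nbytes >= sizeof v.b) {
        memcpy(p, v.b, sizeof v.b);
        p += sizeof v.b;
        nbytes -= sizeof v.b;
    }
    /* Tail: the pattern stays aligned because P advanced in whole
       16-byte steps and, for wmemset, NBYTES is a multiple of 4.  */
    memcpy(p, v.b, nbytes);
}

static void *sketch_memset(void *dst, int c, size_t n)
{
    store_bytes(dst, broadcast_byte(c), n);
    return dst;
}

static uint32_t *sketch_wmemset(uint32_t *dst, uint32_t c, size_t n)
{
    /* Same scaling as "shl $0x2,%rdx": element count to byte count.  */
    store_bytes(dst, broadcast_dword(c), n << 2);
    return dst;
}
```

Both entry points differ only in how they build the vector and how they
express the length, which is why a small stub in front of the existing
memset body is sufficient for the vector-store variants.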

	* include/wchar.h (__wmemset_chk): New.
	* sysdeps/x86_64/memset.S (VDUP_TO_VEC0_AND_SET_RETURN): Renamed
	to MEMSET_VDUP_TO_VEC0_AND_SET_RETURN.
	(WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN): New.
	(WMEMSET_CHK_SYMBOL): Likewise.
	(WMEMSET_SYMBOL): Likewise.
	(__wmemset): Add hidden definition.
	(wmemset): Add weak hidden definition.
	* sysdeps/x86_64/multiarch/Makefile (sysdep_routines): Add
	wmemset_chk-nonshared.
	* sysdeps/x86_64/multiarch/ifunc-impl-list.c
	(__libc_ifunc_impl_list): Add __wmemset_sse2_unaligned,
	__wmemset_avx2_unaligned, __wmemset_avx512_unaligned,
	__wmemset_chk_sse2_unaligned, __wmemset_chk_avx2_unaligned
	and __wmemset_chk_avx512_unaligned.
	* sysdeps/x86_64/multiarch/memset-avx2-unaligned-erms.S
	(VDUP_TO_VEC0_AND_SET_RETURN): Renamed to ...
	(MEMSET_VDUP_TO_VEC0_AND_SET_RETURN): This.
	(WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN): New.
	(WMEMSET_SYMBOL): Likewise.
	* sysdeps/x86_64/multiarch/memset-avx512-unaligned-erms.S
	(VDUP_TO_VEC0_AND_SET_RETURN): Renamed to ...
	(MEMSET_VDUP_TO_VEC0_AND_SET_RETURN): This.
	(WMEMSET_VDUP_TO_VEC0_AND_SET_RETURN): New.
	(WMEMSET_SYMBOL): Likewise.
	* sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: Updated.
	(WMEMSET_CHK_SYMBOL): New.
	(WMEMSET_CHK_SYMBOL (__wmemset_chk, unaligned)): Likewise.
	(WMEMSET_SYMBOL (__wmemset, unaligned)): Likewise.
	* sysdeps/x86_64/multiarch/memset.S (WMEMSET_SYMBOL): New.
	(libc_hidden_builtin_def): Also define __GI_wmemset and
	__GI___wmemset.
	(weak_alias): New.
	* sysdeps/x86_64/multiarch/wmemset.c: New file.
	* sysdeps/x86_64/multiarch/wmemset.h: Likewise.
	* sysdeps/x86_64/multiarch/wmemset_chk-nonshared.S: Likewise.
	* sysdeps/x86_64/multiarch/wmemset_chk.c: Likewise.
	* sysdeps/x86_64/wmemset.c: Likewise.
	* sysdeps/x86_64/wmemset_chk.c: Likewise.
2017-06-05 11:09:59 -07:00
..
arpa nss_dns: Replace local declarations with declarations from a header file 2017-04-04 20:56:23 +02:00
bits Remove __need macros from signal.h. 2017-05-20 19:04:43 -04:00
gnu Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
net
netinet Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
programs Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
protocols
rpc sunrpc: Always obtain AF_INET addresses from NSS [BZ #20964] 2016-12-27 16:44:15 +01:00
rpcsvc Deprecate libnsl by default (only shared library will be 2017-03-21 15:14:27 +01:00
sys posix: Implement preadv2 and pwritev2 2017-05-31 17:35:46 -03:00
aio.h
aliases.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
alloca.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
argp.h
argz.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
assert.h
atomic.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
byteswap.h
caller.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
complex.h float128: Add private _Float128 declarations for libm. 2017-05-15 10:23:28 -03:00
cpio.h
crypt.h Add include/crypt.h. 2016-10-28 22:40:16 -04:00
ctype.h Rename bits/libc-tsd.h to libc-tsd.h (bug 14912). 2015-09-03 20:33:46 +00:00
des.h
dirent.h Mark internal dirent functions hidden 2015-10-15 14:15:41 -07:00
dlfcn.h Mark _dl_catch_error hidden 2015-10-15 14:13:50 -07:00
elf.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
endian.h
envz.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
err.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
errno.h Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
error.h
execinfo.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
fcntl.h Assume that O_CLOEXEC is always defined and works 2017-04-18 14:56:51 +02:00
features.h Add support for testing __STDC_WANT_IEC_60559_TYPES_EXT__ 2017-05-09 11:40:28 -03:00
fenv.h Mark fegetround pure (bug 16296). 2015-09-15 20:36:50 +00:00
float.h float128: Add private _Float128 declarations for libm. 2017-05-15 10:23:28 -03:00
fmtmsg.h
fnmatch.h
fpu_control.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
ftw.h
gconv.h
getopt_int.h
getopt.h getopt: remove USE_NONOPTION_FLAGS 2017-04-07 07:45:53 -04:00
glob.h
gmp.h
gnu-versions.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
grp-merge.h NSS: Implement group merging support. 2016-04-29 22:18:21 -04:00
grp.h
gshadow.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
iconv.h
ifaddrs.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
ifunc-impl-list.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
inline-hashtab.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
langinfo.h
libc-diag.h Split DIAG_* macros to new header libc-diag.h. 2017-02-25 09:59:46 -05:00
libc-internal.h Narrowing the visibility of libc-internal.h even further. 2017-03-01 20:33:46 -05:00
libc-pointer-arith.h Narrowing the visibility of libc-internal.h even further. 2017-03-01 20:33:46 -05:00
libc-symbols.h Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
libgen.h
libintl.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
libio.h Remove _IO_MTSAFE_IO from public headers. 2017-05-11 19:14:11 -04:00
limits.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
link.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
list_t.h Remove __need_list_t and __need_res_state. 2017-05-20 19:01:46 -04:00
list.h Remove __need_list_t and __need_res_state. 2017-05-20 19:01:46 -04:00
locale.h
malloc.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
math.h float128: Add private _Float128 declarations for libm. 2017-05-15 10:23:28 -03:00
mcheck.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
memory.h
mntent.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
monetary.h
mqueue.h Fix mq_receive, mq_send mq_timed* namespace (bug 18545). 2015-06-17 20:19:04 +00:00
netdb.h Fix network headers stdint.h namespace (bug 21455). 2017-05-04 20:36:42 +00:00
netgroup.h
nl_types.h
nss.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
nsswitch.h
obstack.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
poll.h
printf.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
pthread.h Fix mq_notify pthread_barrier_* namespace (bug 18544). 2015-06-17 20:16:56 +00:00
pty.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
pwd.h Harden putpwent, putgrent, putspent, putspent against injection [BZ #18724] 2015-10-02 11:34:13 +02:00
regex.h
resolv.h resolv: Reduce EDNS payload size to 1200 bytes [BZ #21361] 2017-04-13 13:09:38 +02:00
rounding-mode.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
sched.h posix: New Linux posix_spawn{p} implementation 2016-03-07 11:53:47 +07:00
scratch_buffer.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
search.h Fix sem_* tdelete, tfind, tsearch, twalk namespace (bug 18536). 2015-06-17 20:11:58 +00:00
set-hooks.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
setjmp.h Mark internal setjmp functions hidden 2015-10-15 14:22:25 -07:00
sgtty.h
shadow.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
shlib-compat.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
signal.h Fix struct sigaltstack namespace (bug 21517). 2017-06-05 10:17:46 +00:00
spawn.h
stab.h
stackinfo.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
stap-probe.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
stdc-predef.h Bug 20313: Update to Unicode 9.0.0 2017-02-21 06:30:38 -05:00
stdio_ext.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
stdio.h Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
stdlib.h Add reallocarray function 2017-05-30 18:27:57 -03:00
string.h Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
strings.h
stropts.h
stubs-prologue.h
syscall.h
sysexits.h
syslog.h
tar.h
termios.h
tgmath.h
time.h Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
ttyent.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
uchar.h
ucontext.h
ulimit.h
unistd.h Suppress internal declarations for most of the testsuite. 2017-05-11 19:27:59 -04:00
utime.h
utmp.h Installed header hygiene (BZ#20366): Test of installed headers. 2016-09-23 08:43:56 -04:00
values.h Update copyright dates with scripts/update-copyrights. 2017-01-01 00:14:16 +00:00
wchar.h x86-64: Optimize wmemset with SSE2/AVX2/AVX512 2017-06-05 11:09:59 -07:00
wctype.h
wordexp.h
xlocale.h