glibc/sysdeps/x86_64/multiarch
Noah Goldstein 6abf27980a x86: Improve memset-vec-unaligned-erms.S
No bug. This commit makes a few small improvements to
memset-vec-unaligned-erms.S. The changes are 1) only aligning to 64
instead of 128. Either alignment will perform equally well in a loop
and 128 just increases the odds of having to do an extra iteration
which can be significant overhead for small values. 2) Align some
targets and the loop. 3) Remove an ALU from the alignment process. 4)
Reorder the last 4x VEC so that they are stored after the loop. 5)
Move the condition for leq 8x VEC to before the alignment
process. test-memset and test-wmemset are both passing.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-05-20 17:28:33 -04:00
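
Point 1 of the commit message above (aligning to 64 instead of 128) can be illustrated with a small C model. This is a simplified sketch, not the code in memset-vec-unaligned-erms.S. It assumes 32-byte vectors, so each loop iteration stores 128 bytes. It also assumes the first and last 128 bytes are written by unaligned stores outside the loop. The names `LOOP_BYTES`, `align_down`, and `loop_iterations` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 vectors of 32 bytes are stored per loop iteration. */
#define LOOP_BYTES 128u

/* Round addr down to a multiple of align (align must be a power of two). */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
    return addr & ~(align - 1);
}

/* Count loop iterations in the simplified model:
   - the head [dst, dst + 128) is stored before the loop,
   - the tail [end - 128, end) is stored separately,
   - the loop starts at (dst + 128) rounded down to `align` and stores
     128 bytes per iteration until it reaches the start of the tail.  */
static unsigned loop_iterations(uint64_t dst, uint64_t len, uint64_t align)
{
    /* The model only covers sizes large enough to reach the loop. */
    assert(len >= 2 * LOOP_BYTES);

    uint64_t end = dst + len;
    uint64_t p = align_down(dst + LOOP_BYTES, align);
    uint64_t stop = end - LOOP_BYTES;
    unsigned n = 0;

    while (p < stop) {
        p += LOOP_BYTES;
        n++;
    }
    return n;
}
```

In this model, `loop_iterations(4196, 300, 128)` is 2 and `loop_iterations(4196, 300, 64)` is 1. Rounding down to 128 can move the loop start back by up to 127 bytes, while rounding down to 64 moves it back by at most 63, so the coarser alignment can cost one extra iteration on small sizes.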
bcopy.S
ifunc-avx2.h x86-64: Require BMI2 for strchr-avx2.S 2021-04-19 11:01:45 -07:00
ifunc-evex.h x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
ifunc-impl-list.c x86: Optimize memcmp-avx2-movbe.S 2021-05-18 22:57:44 -04:00
ifunc-memcmp.h x86: Optimize memcmp-avx2-movbe.S 2021-05-18 22:57:44 -04:00
ifunc-memmove.h x86-64: Use ZMM16-ZMM31 in AVX512 memmove family functions 2021-03-29 07:40:17 -07:00
ifunc-memset.h x86: Optimize less_vec evex and avx512 memset-vec-unaligned-erms.S 2021-04-19 15:08:04 -07:00
ifunc-sse4_2.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ifunc-strcasecmp.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
ifunc-strcpy.h x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
ifunc-wmemset.h x86-64: Use ZMM16-ZMM31 in AVX512 memset family functions 2021-03-29 07:40:17 -07:00
Makefile x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
memchr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
memchr-avx2.S x86: Optimize memchr-avx2.S 2021-05-03 21:17:21 -04:00
memchr-evex-rtm.S x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
memchr-evex.S x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
memchr-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memchr.c x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
memcmp-avx2-movbe-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
memcmp-avx2-movbe.S x86: Optimize memcmp-avx2-movbe.S 2021-05-18 22:57:44 -04:00
memcmp-evex-movbe.S x86: Optimize memcmp-evex-movbe.S 2021-05-18 22:57:51 -04:00
memcmp-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcmp-sse4.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcmp-ssse3.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcmp.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy_chk.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy-ssse3-back.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy-ssse3.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memcpy.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memmove_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memmove_chk.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memmove-avx512-no-vzeroupper.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memmove-avx512-unaligned-erms.S x86-64: Use ZMM16-ZMM31 in AVX512 memmove family functions 2021-03-29 07:40:17 -07:00
memmove-avx-unaligned-erms-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
memmove-avx-unaligned-erms.S
memmove-evex-unaligned-erms.S x86-64: Add memmove family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
memmove-sse2-unaligned-erms.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memmove-ssse3-back.S
memmove-ssse3.S
memmove-vec-unaligned-erms.S x86: Update large memcpy case in memmove-vec-unaligned-erms.S 2021-04-16 10:06:56 -07:00
memmove.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mempcpy_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mempcpy_chk.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
mempcpy.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memrchr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
memrchr-avx2.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
memrchr-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
memrchr-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memrchr.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset_chk.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset-avx2-unaligned-erms-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
memset-avx2-unaligned-erms.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
memset-avx512-no-vzeroupper.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset-avx512-unaligned-erms.S x86: Optimize less_vec evex and avx512 memset-vec-unaligned-erms.S 2021-04-19 15:08:04 -07:00
memset-evex-unaligned-erms.S x86: Optimize less_vec evex and avx512 memset-vec-unaligned-erms.S 2021-04-19 15:08:04 -07:00
memset-sse2-unaligned-erms.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
memset-vec-unaligned-erms.S x86: Improve memset-vec-unaligned-erms.S 2021-05-20 17:28:33 -04:00
memset.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
rawmemchr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
rawmemchr-avx2.S
rawmemchr-evex-rtm.S x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
rawmemchr-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
rawmemchr-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
rawmemchr.c x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
stpcpy-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
stpcpy-avx2.S
stpcpy-evex.S x86-64: Add strcpy family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
stpcpy-sse2-unaligned.S
stpcpy-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
stpcpy-ssse3.S
stpcpy.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
stpncpy-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
stpncpy-avx2.S
stpncpy-c.c
stpncpy-evex.S x86-64: Add strcpy family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
stpncpy-sse2-unaligned.S
stpncpy-ssse3.S
stpncpy.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcasecmp_l-avx.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcasecmp_l-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcasecmp_l-sse4_2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcasecmp_l-ssse3.S
strcasecmp_l.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcasecmp.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcat-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strcat-avx2.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strcat-evex.S x86-64: Add strcpy family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strcat-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcat-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcat-ssse3.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcat.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strchr-avx2.S x86: Optimize strchr-avx2.S 2021-04-25 10:04:31 -07:00
strchr-evex.S x86: Optimize strchr-evex.S 2021-04-25 10:04:39 -07:00
strchr-sse2-no-bsf.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchr-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchr.c x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strchrnul-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strchrnul-avx2.S
strchrnul-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strchrnul-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strchrnul.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcmp-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strcmp-avx2.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strcmp-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strcmp-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcmp-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcmp-sse4_2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcmp-sse42.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcmp-ssse3.S
strcmp.c x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strcpy-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strcpy-avx2.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strcpy-evex.S x86-64: Add strcpy family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strcpy-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcpy-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcpy-ssse3.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcpy.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcspn-c.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcspn-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strcspn.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strlen-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strlen-avx2.S x86: Optimize strlen-avx2.S 2021-04-19 18:03:49 -07:00
strlen-evex.S x86: Optimize strlen-evex.S 2021-04-19 18:03:49 -07:00
strlen-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strlen.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncase_l-avx.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncase_l-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncase_l-sse4_2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncase_l-ssse3.S
strncase_l.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncase.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncat-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strncat-avx2.S
strncat-c.c
strncat-evex.S x86-64: Add strcpy family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strncat-sse2-unaligned.S
strncat-ssse3.S
strncat.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncmp-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strncmp-avx2.S
strncmp-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strncmp-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncmp-sse4_2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncmp-ssse3.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strncmp.c x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strncpy-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strncpy-avx2.S
strncpy-c.c
strncpy-evex.S x86-64: Add strcpy family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strncpy-sse2-unaligned.S
strncpy-ssse3.S
strncpy.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strnlen-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strnlen-avx2.S
strnlen-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strnlen-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strnlen.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strpbrk-c.c
strpbrk-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strpbrk.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strrchr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strrchr-avx2.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
strrchr-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
strrchr-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strrchr.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strspn-c.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strspn-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strspn.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strstr-sse2-unaligned.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
strstr.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
varshift.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
varshift.h Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcschr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wcschr-avx2.S
wcschr-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wcschr-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcschr.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcscmp-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wcscmp-avx2.S
wcscmp-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wcscmp-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcscmp.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcscpy-c.c
wcscpy-ssse3.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcscpy.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcslen-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wcslen-avx2.S
wcslen-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wcslen-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcslen.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcsncmp-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wcsncmp-avx2.S
wcsncmp-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wcsncmp-sse2.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcsncmp.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcsnlen-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wcsnlen-avx2.S
wcsnlen-c.c
wcsnlen-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wcsnlen-sse4_1.S
wcsnlen.c x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wcsrchr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wcsrchr-avx2.S
wcsrchr-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wcsrchr-sse2.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wcsrchr.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wmemchr-avx2-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wmemchr-avx2.S
wmemchr-evex-rtm.S x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
wmemchr-evex.S x86-64: Add ifunc-avx2.h functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wmemchr-sse2.S
wmemchr.c x86: Add EVEX optimized memchr family not safe for RTM 2021-05-08 16:26:30 -04:00
wmemcmp-avx2-movbe-rtm.S x86-64: Add AVX optimized string/memory functions for RTM 2021-03-29 07:40:17 -07:00
wmemcmp-avx2-movbe.S
wmemcmp-c.c
wmemcmp-evex-movbe.S x86-64: Add memcmp family functions with 256-bit EVEX 2021-03-29 07:40:17 -07:00
wmemcmp-sse4.S
wmemcmp-ssse3.S
wmemcmp.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wmemset_chk-nonshared.S Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wmemset_chk.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
wmemset.c Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00