glibc/sysdeps
Noah Goldstein 475b63702e x86: Double size of ERMS rep_movsb_threshold in dl-cacheinfo.h
No bug.

This patch doubles the rep_movsb_threshold when using ERMS. Based on
benchmarks the vector copy loop, especially now that it handles 4k
aliasing, is better for these medium-range sizes.
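
As a rough illustration of what doubling the threshold means for dispatch, here is a minimal sketch. It is not the code in dl-cacheinfo.h: the base value SKETCH_OLD_THRESHOLD, the helper names, and the ">=" comparison are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pre-patch threshold in bytes.  Illustrative only; this
   value is an assumption and is not taken from dl-cacheinfo.h.  */
#define SKETCH_OLD_THRESHOLD 2048u

/* The two copy paths the commit message compares.  */
enum copy_strategy { COPY_VECTOR_LOOP, COPY_REP_MOVSB };

/* Double a threshold, which is the change the commit message describes.  */
static size_t doubled_threshold(size_t old_threshold)
{
    assert(old_threshold > 0);
    return old_threshold * 2;
}

/* Assumed dispatch rule: copies at or above the threshold use
   rep movsb, smaller copies stay on the vector copy loop.  */
static enum copy_strategy pick_strategy(size_t len, size_t threshold)
{
    return len >= threshold ? COPY_REP_MOVSB : COPY_VECTOR_LOOP;
}
```

Under these assumptions, a 3000-byte copy selects COPY_REP_MOVSB with the old threshold and COPY_VECTOR_LOOP with the doubled one, so more medium-sized copies stay on the vector loop.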

On Skylake with ERMS:

Size,   Align1, Align2, dst>src, (rep movsb) / (vec copy)
4096,   0,      0,      0,      0.975
4096,   0,      0,      1,      0.953
4096,   12,     0,      0,      0.969
4096,   12,     0,      1,      0.872
4096,   44,     0,      0,      0.979
4096,   44,     0,      1,      0.83
4096,   0,      12,     0,      1.006
4096,   0,      12,     1,      0.989
4096,   0,      44,     0,      0.739
4096,   0,      44,     1,      0.942
4096,   12,     12,     0,      1.009
4096,   12,     12,     1,      0.973
4096,   44,     44,     0,      0.791
4096,   44,     44,     1,      0.961
4096,   2048,   0,      0,      0.978
4096,   2048,   0,      1,      0.951
4096,   2060,   0,      0,      0.986
4096,   2060,   0,      1,      0.963
4096,   2048,   12,     0,      0.971
4096,   2048,   12,     1,      0.941
4096,   2060,   12,     0,      0.977
4096,   2060,   12,     1,      0.949
8192,   0,      0,      0,      0.85
8192,   0,      0,      1,      0.845
8192,   13,     0,      0,      0.937
8192,   13,     0,      1,      0.939
8192,   45,     0,      0,      0.932
8192,   45,     0,      1,      0.927
8192,   0,      13,     0,      0.621
8192,   0,      13,     1,      0.62
8192,   0,      45,     0,      0.53
8192,   0,      45,     1,      0.516
8192,   13,     13,     0,      0.664
8192,   13,     13,     1,      0.659
8192,   45,     45,     0,      0.593
8192,   45,     45,     1,      0.575
8192,   2048,   0,      0,      0.854
8192,   2048,   0,      1,      0.834
8192,   2061,   0,      0,      0.863
8192,   2061,   0,      1,      0.857
8192,   2048,   13,     0,      0.63
8192,   2048,   13,     1,      0.629
8192,   2061,   13,     0,      0.627
8192,   2061,   13,     1,      0.62

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
2021-11-06 16:18:08 -05:00
aarch64 String: Add hidden defs for __memcmpeq() to enable internal usage 2021-10-26 16:51:29 -05:00
alpha elf: Fix dynamic-link.h usage on rtld.c 2021-10-14 14:52:07 -03:00
arc elf: Fix dynamic-link.h usage on rtld.c 2021-10-14 14:52:07 -03:00
arm arm: Use have-mtls-dialect-gnu2 to check for ARM TLS descriptors support 2021-11-01 16:23:15 -03:00
csky String: Add hidden defs for __memcmpeq() to enable internal usage 2021-10-26 16:51:29 -05:00
generic x86_64: Add support for __memcmpeq using sse2, avx2, and evex 2021-10-27 13:03:46 -05:00
gnu Remove "Contributed by" lines 2021-09-03 22:06:44 +05:30
hppa elf: Fix dynamic-link.h usage on rtld.c 2021-10-14 14:52:07 -03:00
htl htl: Reimplement GSCOPE 2021-09-16 01:04:17 +02:00
hurd Update copyright dates with scripts/update-copyrights 2021-01-02 12:17:34 -08:00
i386 String: Add hidden defs for __memcmpeq() to enable internal usage 2021-10-26 16:51:29 -05:00
ia64 String: Add hidden defs for __memcmpeq() to enable internal usage 2021-10-26 16:51:29 -05:00
ieee754 Fixed inaccuracy of j0f (BZ #28185) 2021-10-05 13:45:37 +02:00
m68k elf: Fix dynamic-link.h usage on rtld.c 2021-10-14 14:52:07 -03:00
mach String: Add support for __memcmpeq() ABI on all targets 2021-10-26 16:51:29 -05:00
microblaze elf: Fix dynamic-link.h usage on rtld.c 2021-10-14 14:52:07 -03:00
mips ld.so: Initialize bootstrap_map.l_ld_readonly [BZ #28340] 2021-10-19 06:40:38 -07:00
nios2 elf: Fix dynamic-link.h usage on rtld.c 2021-10-14 14:52:07 -03:00
nptl nptl: Use FUTEX_LOCK_PI2 when available 2021-10-01 08:09:13 -03:00
posix posix: Remove spawni.c 2021-09-27 12:44:25 -03:00
powerpc [powerpc] Tighten contraints for asm constant parameters 2021-11-03 09:17:28 -05:00
pthread elf: Avoid deadlock between pthread_create and ctors [BZ #28357] 2021-10-04 15:07:05 +01:00
riscv riscv: Build with -mno-relax if linker does not support R_RISCV_ALIGN 2021-11-03 09:25:06 -03:00
s390 String: Add hidden defs for __memcmpeq() to enable internal usage 2021-10-26 16:51:29 -05:00
sh elf: Fix dynamic-link.h usage on rtld.c 2021-10-14 14:52:07 -03:00
sparc String: Add hidden defs for __memcmpeq() to enable internal usage 2021-10-26 16:51:29 -05:00
unix Fix compiler issue with mmap_internal 2021-10-29 09:21:37 -03:00
wordsize-32 Disable symbol hack in libc_nonshared.a 2021-09-27 07:46:25 -07:00
wordsize-64 Remove "Contributed by" lines 2021-09-03 22:06:44 +05:30
x86 x86: Double size of ERMS rep_movsb_threshold in dl-cacheinfo.h 2021-11-06 16:18:08 -05:00
x86_64 x86: Optimize memmove-vec-unaligned-erms.S 2021-11-06 16:18:03 -05:00