Mirror of git://sourceware.org/git/glibc.git (synced 2024-12-09 04:11:27 +08:00)

Commit 922369032c
This is an optimized memcmp for AArch64, a complete rewrite using a different algorithm. The previous version split into separate cases: both inputs aligned, inputs mutually aligned, and unaligned inputs handled with a byte loop. The new version combines all of these cases, with small inputs of less than 8 bytes handled separately. This allows the main code to be sped up using unaligned loads, since there are now at least 8 bytes to compare. After the first 8 bytes, the first input is aligned; this ensures each iteration does at most one unaligned access, and mutually aligned inputs behave as aligned. After the main loop, the last 8 bytes are processed using unaligned accesses.

This improves performance of (mutually) aligned cases by 25% and unaligned cases by >500% (yes, more than 6 times faster) on large inputs.

	* sysdeps/aarch64/memcmp.S (memcmp): Rewrite of optimized memcmp.
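The actual implementation is hand-written AArch64 assembly in sysdeps/aarch64/memcmp.S; as a rough illustration of the algorithm the commit message describes, here is a minimal C sketch. It assumes a little-endian target that tolerates unaligned 64-bit loads (true on AArch64) and a GCC/Clang compiler for `__builtin_ctzll`; the names `load64`, `diff64`, and `sketch_memcmp` are invented for this sketch and do not appear in the real code.

```c
/* Sketch of the memcmp algorithm from the commit message.
   Assumes little-endian byte order and unaligned-load support.  */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static inline uint64_t load64 (const unsigned char *p)
{
  uint64_t v;
  memcpy (&v, p, sizeof v);   /* compiles to one (possibly unaligned) load */
  return v;
}

/* Given a != b, return <0/0/>0 based on the first differing byte
   (lowest address == lowest bits on a little-endian target).  */
static inline int diff64 (uint64_t a, uint64_t b)
{
  unsigned shift = __builtin_ctzll (a ^ b) & ~7u;  /* round down to byte */
  return (int) ((a >> shift) & 0xff) - (int) ((b >> shift) & 0xff);
}

int sketch_memcmp (const void *s1, const void *s2, size_t n)
{
  const unsigned char *p1 = s1, *p2 = s2;

  /* Small inputs of less than 8 bytes are handled separately.  */
  if (n < 8)
    {
      for (size_t i = 0; i < n; i++)
        if (p1[i] != p2[i])
          return (int) p1[i] - (int) p2[i];
      return 0;
    }

  /* First 8 bytes: a single unaligned compare.  */
  uint64_t a = load64 (p1), b = load64 (p2);
  if (a != b)
    return diff64 (a, b);

  /* Align the first input.  Each iteration then does at most one
     unaligned access, and mutually aligned inputs behave as aligned.
     Bytes skipped here were already covered by the compare above.  */
  size_t skip = 8 - ((uintptr_t) p1 & 7);
  p1 += skip; p2 += skip; n -= skip;

  while (n >= 8)
    {
      a = load64 (p1);   /* aligned */
      b = load64 (p2);   /* possibly unaligned */
      if (a != b)
        return diff64 (a, b);
      p1 += 8; p2 += 8; n -= 8;
    }

  /* Last 0-7 bytes: re-read the final 8 bytes with unaligned loads.
     The overlap with already-compared (equal) bytes is harmless.  */
  if (n > 0)
    {
      a = load64 (p1 + n - 8);
      b = load64 (p2 + n - 8);
      if (a != b)
        return diff64 (a, b);
    }
  return 0;
}
```

Aligning only the first input is the key trick: if the inputs started out mutually aligned, the second input becomes aligned as a side effect, so the main loop never performs more than one unaligned access per iteration in any case.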
Contents of sysdeps/aarch64 at this commit:

bits
fpu
multiarch
nptl
soft-fp
__longjmp.S
abort-instr.h
atomic-machine.h
backtrace.c
bsd-_setjmp.S
bsd-setjmp.S
configure
configure.ac
crti.S
crtn.S
dl-irel.h
dl-link.sym
dl-machine.h
dl-sysdep.h
dl-tls.h
dl-tlsdesc.h
dl-tlsdesc.S
dl-trampoline.S
dl-tunables.list
Implies
jmpbuf-offsets.h
jmpbuf-unwind.h
ldsodefs.h
libc-tls.c
libm-test-ulps
libm-test-ulps-name
linkmap.h
machine-gmon.h
Makefile
math-tests.h
mcount.c
memchr.S
memcmp.S
memcpy.S
memmove.S
memset.S
memusage.h
preconfigure
rawmemchr.S
setjmp.S
sotruss-lib.c
stackinfo.h
start.S
stpcpy.S
strchr.S
strchrnul.S
strcmp.S
strcpy.S
string_private.h
strlen.S
strncmp.S
strnlen.S
strrchr.S
sysdep.h
tls-macros.h
tlsdesc.c
tlsdesc.sym
tst-audit.h
Versions