glibc/sysdeps
Siddhesh Poyarekar db725a458e aarch64,falkor: Ignore prefetcher tagging for smaller copies
For small and medium-sized copies, the effect of hardware
prefetching is not as dominant as instruction-level parallelism.
Hence it makes more sense to load data into multiple registers than to
try and route them to the same prefetch unit.  This is also the case
for the loop exit where we are unable to latch on to the same prefetch
unit anyway so it makes more sense to have data loaded in parallel.
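
As a rough illustration of the idea above (not the actual Falkor assembly), the following C sketch copies a 64-byte tail by issuing four independent 16-byte loads before any store, so that no load waits on a register that a previous store still needs. The names `chunk16`, `load16`, `store16` and `copy_tail64` are hypothetical and exist only for this example; `chunk16` stands in for one 16-byte register. A C compiler is free to reschedule these operations, so this only shows the dependency structure the commit aims for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte value standing in for one 128-bit register. */
typedef struct { uint64_t lo, hi; } chunk16;

/* Unaligned-safe 16-byte load and store helpers. */
static chunk16 load16(const unsigned char *p)
{
    chunk16 c;
    memcpy(&c, p, sizeof c);
    return c;
}

static void store16(unsigned char *p, chunk16 c)
{
    memcpy(p, &c, sizeof c);
}

/* Copy the last 64 bytes of an n-byte buffer (n >= 64).
   All four loads target distinct temporaries, so they are
   independent of each other and of the stores that follow. */
void copy_tail64(unsigned char *dst, const unsigned char *src, size_t n)
{
    assert(n >= 64);
    const unsigned char *s = src + n - 64;
    unsigned char *d = dst + n - 64;

    chunk16 a = load16(s);
    chunk16 b = load16(s + 16);
    chunk16 c = load16(s + 32);
    chunk16 e = load16(s + 48);

    store16(d, a);
    store16(d + 16, b);
    store16(d + 32, c);
    store16(d + 48, e);
}
```

Reusing a single temporary for every load would instead force each load to wait for the preceding store, which is the serialization the commit message says is not worth paying when prefetcher tagging brings little benefit.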

The performance results are a bit mixed with memcpy-random, with
numbers jumping between -1% and +3%, i.e. the numbers don't seem
repeatable.  memcpy-walk sees a 70% improvement (i.e. > 2x) for 128
bytes, and that improvement diminishes as the impact of the tail copy
decreases in comparison to the loop.

	* sysdeps/aarch64/multiarch/memcpy_falkor.S (__memcpy_falkor):
	Use multiple registers to copy data in loop tail.
2018-05-11 00:11:52 +05:30
aarch64 aarch64,falkor: Ignore prefetcher tagging for smaller copies 2018-05-11 00:11:52 +05:30
alpha Move math_opt_barrier, math_force_eval to separate math-barriers.h. 2018-05-09 19:45:47 +00:00
arm
generic Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
gnu Increase robustness of internal dlopen() by using RTLD_NOW [BZ #22766] 2018-04-26 10:41:43 -03:00
hppa Update hppa libm-test-ulps 2018-04-20 15:36:41 -03:00
htl
hurd
i386 Move math_opt_barrier, math_force_eval to separate math-barriers.h. 2018-05-09 19:45:47 +00:00
ia64
ieee754 Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
init_array
m68k Move math_opt_barrier, math_force_eval to separate math-barriers.h. 2018-05-09 19:45:47 +00:00
mach Ignore absolute symbols in ABI tests. 2018-05-04 15:46:32 +00:00
microblaze
mips
nios2
nptl Fix comment typo 2018-05-08 14:59:13 +02:00
posix
powerpc Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
pthread
riscv
s390
sh
sparc
unix Ignore absolute symbols in ABI tests. 2018-05-04 15:46:32 +00:00
wordsize-32
wordsize-64
x86 Move math_check_force_underflow macros to separate math-underflow.h. 2018-05-10 00:53:04 +00:00
x86_64 x86-64/memset: Mark the debugger symbol as hidden 2018-05-07 11:01:48 -07:00