mirror of
git://sourceware.org/git/glibc.git
synced 2025-01-30 12:31:53 +08:00
ae8372d7e4
This patch adds SSE4.1 versions of trunc and truncf, using the roundsd / roundss instructions, similar to the versions of ceil, floor, rint and nearbyint functions we already have. In my testing with the glibc benchtests these are about 30% faster than the C versions for double, 20% faster for float.

Tested for x86_64.

[BZ #20142]
* sysdeps/x86_64/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_trunc-c, s_truncf-c, s_trunc-sse4_1 and s_truncf-sse4_1.
* sysdeps/x86_64/fpu/multiarch/s_trunc-c.c: New file.
* sysdeps/x86_64/fpu/multiarch/s_trunc-sse4_1.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_truncf-c.c: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_truncf-sse4_1.S: Likewise.
* sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise.

||
---|---|---|
aarch64 | ||
alpha | ||
arm | ||
generic | ||
gnu | ||
hppa | ||
i386 | ||
ia64 | ||
ieee754 | ||
init_array | ||
m68k | ||
mach | ||
microblaze | ||
mips | ||
nios2 | ||
nptl | ||
posix | ||
powerpc | ||
pthread | ||
s390 | ||
sh | ||
sparc | ||
tile | ||
unix | ||
wordsize-32 | ||
wordsize-64 | ||
x86 | ||
x86_64 |
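
The commit message above describes truncation done with the SSE4.1 `roundsd` instruction and selected at run time alongside a plain C version. The following is a minimal sketch of that idea using compiler intrinsics rather than the assembly files the commit adds. The names `trunc_sse41`, `trunc_generic` and `select_trunc` are hypothetical, the fallback is a simplified stand-in for the C version (it does not preserve the sign of zero), and the run-time check uses the GCC/Clang builtin `__builtin_cpu_supports` instead of glibc's own multiarch selection machinery.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <smmintrin.h>  /* SSE4.1 intrinsics, including _mm_round_sd */

/* Hypothetical helper: truncate toward zero with roundsd.
   _MM_FROUND_TO_ZERO selects round-toward-zero and _MM_FROUND_NO_EXC
   suppresses the inexact exception. */
__attribute__((target("sse4.1")))
static double trunc_sse41(double x)
{
    __m128d v = _mm_set_sd(x);
    v = _mm_round_sd(v, v, _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC);
    return _mm_cvtsd_f64(v);
}
#endif

/* Simplified stand-in for the generic C version (an assumption, not
   glibc's code). NaN, infinities and values with magnitude >= 2^52 have
   no fractional part to drop, so they are returned unchanged. */
static double trunc_generic(double x)
{
    if (!(x > -0x1p52 && x < 0x1p52))
        return x;
    return (double)(long long)x;  /* C conversion truncates toward zero */
}

typedef double (*trunc_fn)(double);

/* Pick an implementation once, based on what the CPU supports. */
static trunc_fn select_trunc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("sse4.1"))
        return trunc_sse41;
#endif
    return trunc_generic;
}
```

A caller would store the result of `select_trunc()` in a function pointer once and call through it afterwards, for example `trunc_fn f = select_trunc(); f(-2.7);`. On machines without SSE4.1, or on other architectures, the sketch falls back to the portable path.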