eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-30 17:40:05 +08:00

Rasmus Munk Larsen b47c777993 Block transposeInPlace() when the matrix is real and square.		2020-04-28 16:08:16 +00:00

This yields a large speedup because we transpose in registers (or L1 if we spill), instead of one packet at a time, which in the worst case makes the code write to the same cache line PacketSize times instead of once.

rmlarsen@rmlarsen4:.../eigen_bench/google3$ benchy --benchmarks=.TransposeInPlace.float.* --reference=srcfs experimental/users/rmlarsen/bench:matmul_bench
(Generated by http://go/benchy. Settings: --runs 5 --benchtime 1s --reference "srcfs" --benchmarks ".TransposeInPlace.float.*" experimental/users/rmlarsen/bench:matmul_bench)

name                             old time/op  new time/op  delta
BM_TransposeInPlace<float>/4     9.84ns ± 0%  6.51ns ± 0%  -33.80%  (p=0.008 n=5+5)
BM_TransposeInPlace<float>/8     23.6ns ± 1%  17.6ns ± 0%  -25.26%  (p=0.016 n=5+4)
BM_TransposeInPlace<float>/16    78.8ns ± 0%  60.3ns ± 0%  -23.50%  (p=0.029 n=4+4)
BM_TransposeInPlace<float>/32     302ns ± 0%   229ns ± 0%  -24.40%  (p=0.008 n=5+5)
BM_TransposeInPlace<float>/59    1.03µs ± 0%  0.84µs ± 1%  -17.87%  (p=0.016 n=5+4)
BM_TransposeInPlace<float>/64    1.20µs ± 0%  0.89µs ± 1%  -25.81%  (p=0.008 n=5+5)
BM_TransposeInPlace<float>/128   8.96µs ± 0%  3.82µs ± 2%  -57.33%  (p=0.008 n=5+5)
BM_TransposeInPlace<float>/256    152µs ± 3%    17µs ± 2%  -89.06%  (p=0.008 n=5+5)
BM_TransposeInPlace<float>/512    837µs ± 1%   208µs ± 0%  -75.15%  (p=0.008 n=5+5)
BM_TransposeInPlace<float>/1k    4.28ms ± 2%  1.08ms ± 2%  -74.72%  (p=0.008 n=5+5)
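
The commit message above attributes the speedup to transposing a small block at a time rather than one packet at a time. As a rough illustration of that blocking idea only, here is a minimal scalar sketch in plain C++. It is not Eigen's implementation: the function name `blocked_transpose_in_place`, the row-major `std::vector<float>` storage, and the default tile size of 8 are all assumptions made for this example, and it uses no SIMD packets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (NOT Eigen's code): transposes an n x n row-major
// matrix in place, one block x block tile at a time, so that each tile and
// its mirror tile are touched together while they are likely still cached.
void blocked_transpose_in_place(std::vector<float>& a, std::size_t n,
                                std::size_t block = 8) {
  assert(a.size() == n * n);  // the matrix must be square
  assert(block > 0);
  for (std::size_t bi = 0; bi < n; bi += block) {
    const std::size_t i_end = std::min(bi + block, n);

    // Diagonal tile: swap each element above the diagonal with its mirror.
    for (std::size_t i = bi; i < i_end; ++i)
      for (std::size_t j = i + 1; j < i_end; ++j)
        std::swap(a[i * n + j], a[j * n + i]);

    // Off-diagonal tiles: swap tile (bi, bj) with its mirror tile (bj, bi).
    for (std::size_t bj = bi + block; bj < n; bj += block) {
      const std::size_t j_end = std::min(bj + block, n);
      for (std::size_t i = bi; i < i_end; ++i)
        for (std::size_t j = bj; j < j_end; ++j)
          std::swap(a[i * n + j], a[j * n + i]);
    }
  }
}
```

Sizes that are not a multiple of the tile size (such as the 59 case in the benchmark) are handled here by clamping the tile bounds with `std::min`. How Eigen itself handles the remainder is not stated in the commit message.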
bench	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
blas	STYLE: Remove CMake-language block-end command arguments	2019-10-31 11:36:27 -05:00
cmake	[SYCL] Rebasing the SYCL support branch on top of the Einge upstream master branch.	2019-11-28 10:08:54 +00:00
debug	MIsc. source and comment typos	2018-03-11 10:01:44 -04:00
demos	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
doc	Update PreprocessorDirectives.dox - Added line for the new VectorwiseOp plugin directive (and re-alphabatized the plugin section)	2020-04-17 21:43:37 +00:00
Eigen	Block transposeInPlace() when the matrix is real and square. This yields a large speedup because we transpose in registers (or L1 if we spill), instead of one packet at a time, which in the worst case makes the code write to the same cache line PacketSize times instead of once.	2020-04-28 16:08:16 +00:00
failtest	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
lapack	STYLE: Convert CMake-language commands to lower case	2019-10-31 11:36:37 -05:00
scripts	Replace calls to "hg" by calls to "git"	2019-12-04 11:24:06 +01:00
test	Block transposeInPlace() when the matrix is real and square. This yields a large speedup because we transpose in registers (or L1 if we spill), instead of one packet at a time, which in the worst case makes the code write to the same cache line PacketSize times instead of once.	2020-04-28 16:08:16 +00:00
unsupported	Add async evaluation support to TensorSlicingOp.	2020-04-22 19:55:01 +00:00
.gitignore	Renamed .hgignore to .gitignore (removing hg-specific "syntax" line)	2019-12-13 19:40:57 +01:00
.hgeol	Added a pattern which forces LF line endings for *.sh files.	2013-07-31 18:20:58 +02:00
CMakeLists.txt	Don't restrict CMAKE_BUILD_TYPE	2020-02-28 20:46:53 +00:00
COPYING.BSD	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
COPYING.GPL
COPYING.LGPL	Replace COPYING.LGPL by a copy of the LGPL 2.1 (instead of LGPL 3).	2012-09-10 13:27:44 -04:00
COPYING.MINPACK	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
COPYING.MPL2	add COPYING.MPL2	2012-07-15 10:20:59 -04:00
COPYING.README	Replace COPYING.LGPL by a copy of the LGPL 2.1 (instead of LGPL 3).	2012-09-10 13:27:44 -04:00
CTestConfig.cmake	STYLE: Convert CMake-language commands to lower case	2019-10-31 11:36:37 -05:00
CTestCustom.cmake.in	Allow to filter out build-error messages	2018-07-24 20:12:49 +02:00
eigen3.pc.in	Further fixes for CMAKE_INSTALL_PREFIX correctness	2015-11-07 21:29:24 -05:00
INSTALL
README.md	Update old links to bitbucket to point to gitlab.com	2019-12-04 10:57:07 +01:00
signature_of_eigen3_matrix_library

README.md

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.

For pull requests, bug reports, and feature requests, go to https://gitlab.com/libeigen/eigen.