eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-03-01 18:26:24 +08:00


Rasmus Munk Larsen 7b76c85daf Vectorize and parallelize TensorScanOp.		2020-05-05 00:19:43 +00:00

TensorScanOp is used in TensorFlow for a number of operations, such as cumulative logexp reduction and cumulative sum and product reductions.

The benchmarks numbers below are for cumulative row- and column reductions of NxN matrices.

name                                             old time/op    new time/op    delta
BM_cumSumRowReduction_1T/4   [using 1 threads ]  25.1ns ± 1%    35.2ns ± 1%    +40.45%
BM_cumSumRowReduction_1T/8   [using 1 threads ]  73.4ns ± 0%    82.7ns ± 3%    +12.74%
BM_cumSumRowReduction_1T/32  [using 1 threads ]  988ns ± 0%     832ns ± 0%     -15.77%
BM_cumSumRowReduction_1T/64  [using 1 threads ]  4.07µs ± 2%    3.47µs ± 0%    -14.70%
BM_cumSumRowReduction_1T/128 [using 1 threads ]  18.0µs ± 0%    16.8µs ± 0%    -6.58%
BM_cumSumRowReduction_1T/512 [using 1 threads ]  287µs ± 0%     281µs ± 0%     -2.22%
BM_cumSumRowReduction_1T/2k  [using 1 threads ]  4.78ms ± 1%    4.78ms ± 2%    ~
BM_cumSumRowReduction_1T/10k [using 1 threads ]  117ms ± 1%     117ms ± 1%     ~
BM_cumSumRowReduction_8T/4   [using 8 threads ]  25.0ns ± 0%    35.2ns ± 0%    +40.82%
BM_cumSumRowReduction_8T/8   [using 8 threads ]  77.2ns ±16%    81.3ns ± 0%    ~
BM_cumSumRowReduction_8T/32  [using 8 threads ]  988ns ± 0%     833ns ± 0%     -15.67%
BM_cumSumRowReduction_8T/64  [using 8 threads ]  4.08µs ± 2%    3.47µs ± 0%    -14.95%
BM_cumSumRowReduction_8T/128 [using 8 threads ]  18.0µs ± 0%    17.3µs ±10%    ~
BM_cumSumRowReduction_8T/512 [using 8 threads ]  287µs ± 0%     58µs ± 6%      -79.92%
BM_cumSumRowReduction_8T/2k  [using 8 threads ]  4.79ms ± 1%    0.64ms ± 1%    -86.58%
BM_cumSumRowReduction_8T/10k [using 8 threads ]  117ms ± 1%     18ms ± 6%      -84.50%
BM_cumSumColReduction_1T/4   [using 1 threads ]  23.9ns ± 0%    33.4ns ± 1%    +39.68%
BM_cumSumColReduction_1T/8   [using 1 threads ]  71.6ns ± 1%    49.1ns ± 3%    -31.40%
BM_cumSumColReduction_1T/32  [using 1 threads ]  973ns ± 0%     165ns ± 2%     -83.10%
BM_cumSumColReduction_1T/64  [using 1 threads ]  4.06µs ± 1%    0.57µs ± 1%    -85.94%
BM_cumSumColReduction_1T/128 [using 1 threads ]  33.4µs ± 1%    4.1µs ± 1%     -87.67%
BM_cumSumColReduction_1T/512 [using 1 threads ]  1.72ms ± 4%    0.21ms ± 5%    -87.91%
BM_cumSumColReduction_1T/2k  [using 1 threads ]  119ms ±53%     11ms ±35%      -90.42%
BM_cumSumColReduction_1T/10k [using 1 threads ]  1.59s ±67%     0.35s ±49%     -77.96%
BM_cumSumColReduction_8T/4   [using 8 threads ]  23.8ns ± 0%    33.3ns ± 0%    +40.06%
BM_cumSumColReduction_8T/8   [using 8 threads ]  71.6ns ± 1%    49.2ns ± 5%    -31.33%
BM_cumSumColReduction_8T/32  [using 8 threads ]  1.01µs ±12%    0.17µs ± 3%    -82.93%
BM_cumSumColReduction_8T/64  [using 8 threads ]  4.15µs ± 4%    0.58µs ± 1%    -86.09%
BM_cumSumColReduction_8T/128 [using 8 threads ]  33.5µs ± 0%    4.1µs ± 4%     -87.65%
BM_cumSumColReduction_8T/512 [using 8 threads ]  1.71ms ± 3%    0.06ms ±16%    -96.21%
BM_cumSumColReduction_8T/2k  [using 8 threads ]  97.1ms ±14%    3.0ms ±23%     -96.88%
BM_cumSumColReduction_8T/10k [using 8 threads ]  1.97s ± 8%     0.06s ± 2%     -96.74%
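
For readers unfamiliar with the operation being benchmarked, here is a minimal sketch of a cumulative sum (inclusive scan) over an NxN matrix, once along each row and once along each column. It uses only the C++ standard library and is not Eigen's TensorScanOp implementation. The function names `cumsum_rows` and `cumsum_cols` and the row-major `std::vector<double>` layout are assumptions made for illustration, and which of the two corresponds to the "Row" and "Col" benchmark names above is not stated in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Eigen): inclusive cumulative sum along
// each row of an n x n matrix stored row-major in a flat vector.
// out(i, j) = in(i, 0) + in(i, 1) + ... + in(i, j)
std::vector<double> cumsum_rows(const std::vector<double>& in, std::size_t n) {
  assert(in.size() == n * n);
  std::vector<double> out(in);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 1; j < n; ++j) {
      // Each element depends on its left neighbour, so a row is a serial
      // chain, but different rows are independent of each other.
      out[i * n + j] += out[i * n + j - 1];
    }
  }
  return out;
}

// Hypothetical helper (not part of Eigen): inclusive cumulative sum along
// each column of the same row-major matrix.
// out(i, j) = in(0, j) + in(1, j) + ... + in(i, j)
std::vector<double> cumsum_cols(const std::vector<double>& in, std::size_t n) {
  assert(in.size() == n * n);
  std::vector<double> out(in);
  for (std::size_t i = 1; i < n; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      // Row i is updated from row i - 1; the inner loop walks contiguous
      // memory and its iterations are independent of each other.
      out[i * n + j] += out[(i - 1) * n + j];
    }
  }
  return out;
}
```

For the 2x2 matrix with rows (1, 2) and (3, 4), `cumsum_rows` yields rows (1, 3) and (3, 7), and `cumsum_cols` yields rows (1, 2) and (4, 6). In this sketch the independent rows (or the independent entries within a row step) are the natural units to spread across threads or SIMD lanes; how the commit itself partitions the work is not described on this page.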
bench	Fix perf monitoring merge function	2020-04-28 17:02:59 +00:00
blas	STYLE: Remove CMake-language block-end command arguments	2019-10-31 11:36:27 -05:00
cmake	[SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch.	2019-11-28 10:08:54 +00:00
debug	Misc. source and comment typos	2018-03-11 10:01:44 -04:00
demos	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
doc	Update PreprocessorDirectives.dox - Added line for the new VectorwiseOp plugin directive (and re-alphabetized the plugin section)	2020-04-17 21:43:37 +00:00
Eigen	Fix confusing template param name for Stride fwd decl.	2020-04-30 01:43:05 +00:00
failtest	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
lapack	STYLE: Convert CMake-language commands to lower case	2019-10-31 11:36:37 -05:00
scripts	Replace calls to "hg" by calls to "git"	2019-12-04 11:24:06 +01:00
test	Extend support for Packet16b:	2020-04-28 16:12:47 +00:00
unsupported	Vectorize and parallelize TensorScanOp.	2020-05-05 00:19:43 +00:00
.gitignore	Renamed .hgignore to .gitignore (removing hg-specific "syntax" line)	2019-12-13 19:40:57 +01:00
.hgeol
CMakeLists.txt	Don't restrict CMAKE_BUILD_TYPE	2020-02-28 20:46:53 +00:00
COPYING.BSD	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
COPYING.GPL
COPYING.LGPL
COPYING.MINPACK	Make file formatting comply with POSIX and Unix standards	2020-03-23 18:09:02 +00:00
COPYING.MPL2
COPYING.README
CTestConfig.cmake	STYLE: Convert CMake-language commands to lower case	2019-10-31 11:36:37 -05:00
CTestCustom.cmake.in	Allow to filter out build-error messages	2018-07-24 20:12:49 +02:00
eigen3.pc.in
INSTALL
README.md	Update old links to bitbucket to point to gitlab.com	2019-12-04 10:57:07 +01:00
signature_of_eigen3_matrix_library

README.md

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.

For pull requests, bug reports, and feature requests, go to https://gitlab.com/libeigen/eigen.