eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-03-07 18:27:40 +08:00


Rasmus Munk Larsen 4c0fa6ce0f	2019-01-31 14:24:08 -08:00

Speed up Eigen matrix*vector and vector*matrix multiplication.

This change speeds up Eigen matrix * vector and vector * matrix multiplication for dynamic matrices when it is known at runtime that one of the factors is a vector.

The benchmarks below test

    c.noalias() = n_by_n_matrix * n_by_1_matrix;
    c.noalias() = 1_by_n_matrix * n_by_n_matrix;

respectively.

Benchmark measurements:

SSE:

    Run on * (72 X 2992 MHz CPUs); 2019-01-28T17:51:44.452697457-08:00
    CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB
    Benchmark        Base (ns)  New (ns)  Improvement
    ------------------------------------------------------------------
    BM_MatVec/64          1096       312       +71.5%
    BM_MatVec/128         4581      1464       +68.0%
    BM_MatVec/256        18534      5710       +69.2%
    BM_MatVec/512       118083     24162       +79.5%
    BM_MatVec/1k        704106    173346       +75.4%
    BM_MatVec/2k       3080828    742728       +75.9%
    BM_MatVec/4k      25421512   4530117       +82.2%
    BM_VecMat/32           352       130       +63.1%
    BM_VecMat/64          1213       425       +65.0%
    BM_VecMat/128         4640      1564       +66.3%
    BM_VecMat/256        17902      5884       +67.1%
    BM_VecMat/512        70466     24000       +65.9%
    BM_VecMat/1k        340150    161263       +52.6%
    BM_VecMat/2k       1420590    645576       +54.6%
    BM_VecMat/4k       8083859   4364327       +46.0%

AVX2:

    Run on * (72 X 2993 MHz CPUs); 2019-01-28T17:45:11.508545307-08:00
    CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB
    Benchmark        Base (ns)  New (ns)  Improvement
    ------------------------------------------------------------------
    BM_MatVec/64           619       120       +80.6%
    BM_MatVec/128         9693       752       +92.2%
    BM_MatVec/256        38356      2773       +92.8%
    BM_MatVec/512        69006     12803       +81.4%
    BM_MatVec/1k        443810    160378       +63.9%
    BM_MatVec/2k       2633553    646594       +75.4%
    BM_MatVec/4k      16211095   4327148       +73.3%
    BM_VecMat/64           925       227       +75.5%
    BM_VecMat/128         3438       830       +75.9%
    BM_VecMat/256        13427      2936       +78.1%
    BM_VecMat/512        53944     12473       +76.9%
    BM_VecMat/1k        302264    157076       +48.0%
    BM_VecMat/2k       1396811    675778       +51.6%
    BM_VecMat/4k       8962246   4459010       +50.2%

AVX512:

    Run on *** (72 X 2993 MHz CPUs); 2019-01-28T17:35:17.239329863-08:00
    CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB
    Benchmark        Base (ns)  New (ns)  Improvement
    ------------------------------------------------------------------
    BM_MatVec/64           401       111       +72.3%
    BM_MatVec/128         1846       513       +72.2%
    BM_MatVec/256        36739      1927       +94.8%
    BM_MatVec/512        54490      9227       +83.1%
    BM_MatVec/1k        487374    161457       +66.9%
    BM_MatVec/2k       2016270    643824       +68.1%
    BM_MatVec/4k      13204300   4077412       +69.1%
    BM_VecMat/32           324       106       +67.3%
    BM_VecMat/64          1034       246       +76.2%
    BM_VecMat/128         3576       802       +77.6%
    BM_VecMat/256        13411      2561       +80.9%
    BM_VecMat/512        58686     10037       +82.9%
    BM_VecMat/1k        320862    163750       +49.0%
    BM_VecMat/2k       1406719    651397       +53.7%
    BM_VecMat/4k       7785179   4124677       +47.0%

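The commit message does not say how the Improvement column is derived. The rows spot-checked here are consistent with the relative reduction in run time, (Base - New) / Base, expressed as a percentage. The helper below is a minimal sketch of that arithmetic. The name `improvement_percent` is hypothetical and is not part of Eigen or of the benchmark code.

```cpp
#include <cassert>

// Hypothetical helper, not part of Eigen: relative run-time reduction in
// percent, assuming Improvement = (Base - New) / Base * 100.
double improvement_percent(double base_ns, double new_ns) {
    assert(base_ns > 0.0);  // a zero or negative baseline is not meaningful
    return (base_ns - new_ns) / base_ns * 100.0;
}
```

For instance, `improvement_percent(1096, 312)` evaluates to roughly 71.5, which agrees with the SSE `BM_MatVec/64` row.
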
bench	Add recent gemm related changesets and various cleanups in perf-monitoring	2019-01-29 11:53:47 +01:00
blas	Fix numerous shadow-warnings for GCC<=4.8	2018-08-28 18:32:39 +02:00
cmake	Simplify handling of tests that must fail to compile.	2018-12-12 15:48:36 +01:00
debug	MIsc. source and comment typos	2018-03-11 10:01:44 -04:00
demos
doc	Slightly extend discussions on auto and move the content of the Pit falls wiki page here.	2019-01-30 13:09:21 +01:00
Eigen	Speed up Eigen matrix*vector and vector*matrix multiplication.	2019-01-31 14:24:08 -08:00
failtest	PR 572: Add initializer list constructors to Matrix and Array (include unit tests and doc)	2019-01-21 16:25:57 +01:00
lapack	Enable "old" CMP0026 policy (not perfect, but better than dozens of warning)	2018-12-08 18:59:51 +01:00
scripts	Simplify handling and non-splitted tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folder.	2018-07-16 18:55:40 +02:00
test	bug #1669 : fix PartialPivLU/inverse with zero-sized matrices.	2019-01-29 10:27:13 +01:00
unsupported	Workaround lack of support for arbitrary packet-type in Tensor by manually loading half/quarter packets in tensor contraction mapper.	2019-01-30 16:48:01 +01:00
.hgeol
.hgignore	ignore all build sub directories	2017-12-14 14:22:14 +01:00
CMakeLists.txt	Bypass inline asm for non compatible compilers.	2019-01-23 23:43:13 +01:00
COPYING.BSD
COPYING.GPL
COPYING.LGPL
COPYING.MINPACK
COPYING.MPL2
COPYING.README
CTestConfig.cmake	Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.	2018-07-11 17:16:50 +02:00
CTestCustom.cmake.in	Allow to filter out build-error messages	2018-07-24 20:12:49 +02:00
eigen3.pc.in
INSTALL
README.md	Add links where to make PRs and report bugs into README.md	2018-04-13 21:05:28 +00:00
signature_of_eigen3_matrix_library

README.md

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.

For pull requests, please use only the official repository at https://bitbucket.org/eigen/eigen.

For bug reports and feature requests go to http://eigen.tuxfamily.org/bz.