eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-15 07:10:37 +08:00

Benoit Steiner 83ef39e055 Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions.	2016-05-16 08:55:21 -07:00

Before:
BM_colReduction_12T/10    1000000     1949     51.29 MFlops/s
BM_colReduction_12T/80    100000      15636    409.29 MFlops/s
BM_colReduction_12T/640   20000       95100    4307.01 MFlops/s
BM_colReduction_12T/4K    500         4573423  5466.36 MFlops/s
BM_colReduction_4T/10     1000000     1867     53.56 MFlops/s
BM_colReduction_4T/80     500000      5288     1210.11 MFlops/s
BM_colReduction_4T/640    10000       106924   3830.75 MFlops/s
BM_colReduction_4T/4K     500         9946374  2513.48 MFlops/s
BM_colReduction_8T/10     1000000     1912     52.30 MFlops/s
BM_colReduction_8T/80     200000      8354     766.09 MFlops/s
BM_colReduction_8T/640    20000       85063    4815.22 MFlops/s
BM_colReduction_8T/4K     500         5445216  4591.19 MFlops/s
BM_rowReduction_12T/10    1000000     2041     48.99 MFlops/s
BM_rowReduction_12T/80    100000      15426    414.87 MFlops/s
BM_rowReduction_12T/640   50000       39117    10470.98 MFlops/s
BM_rowReduction_12T/4K    500         3034298  8239.14 MFlops/s
BM_rowReduction_4T/10     1000000     1834     54.51 MFlops/s
BM_rowReduction_4T/80     500000      5406     1183.81 MFlops/s
BM_rowReduction_4T/640    50000       35017    11697.16 MFlops/s
BM_rowReduction_4T/4K     500         3428527  7291.76 MFlops/s
BM_rowReduction_8T/10     1000000     1925     51.95 MFlops/s
BM_rowReduction_8T/80     200000      8519     751.23 MFlops/s
BM_rowReduction_8T/640    50000       33441    12248.42 MFlops/s
BM_rowReduction_8T/4K     1000        2852841  8763.19 MFlops/s

After:
BM_colReduction_12T/10    50000000    59       1678.30 MFlops/s
BM_colReduction_12T/80    5000000     725      8822.71 MFlops/s
BM_colReduction_12T/640   20000       90882    4506.93 MFlops/s
BM_colReduction_12T/4K    500         4668855  5354.63 MFlops/s
BM_colReduction_4T/10     50000000    59       1687.37 MFlops/s
BM_colReduction_4T/80     5000000     737      8681.24 MFlops/s
BM_colReduction_4T/640    50000       108637   3770.34 MFlops/s
BM_colReduction_4T/4K     500         7912954  3159.38 MFlops/s
BM_colReduction_8T/10     50000000    60       1657.21 MFlops/s
BM_colReduction_8T/80     5000000     726      8812.48 MFlops/s
BM_colReduction_8T/640    20000       91451    4478.90 MFlops/s
BM_colReduction_8T/4K     500         5441692  4594.16 MFlops/s
BM_rowReduction_12T/10    20000000    93       1065.28 MFlops/s
BM_rowReduction_12T/80    2000000     950      6730.96 MFlops/s
BM_rowReduction_12T/640   50000       38196    10723.48 MFlops/s
BM_rowReduction_12T/4K    500         3019217  8280.29 MFlops/s
BM_rowReduction_4T/10     20000000    93       1064.30 MFlops/s
BM_rowReduction_4T/80     2000000     959      6667.71 MFlops/s
BM_rowReduction_4T/640    50000       37433    10941.96 MFlops/s
BM_rowReduction_4T/4K     500         3036476  8233.23 MFlops/s
BM_rowReduction_8T/10     20000000    93       1072.47 MFlops/s
BM_rowReduction_8T/80     2000000     959      6670.04 MFlops/s
BM_rowReduction_8T/640    50000       38069    10759.37 MFlops/s
BM_rowReduction_8T/4K     1000        2758988  9061.29 MFlops/s
bench	Added benchmarks for contraction on CPU.	2016-05-13 14:32:17 -07:00
blas	Enable and fix -Wdouble-conversion warnings	2016-05-05 13:35:45 +02:00
cmake	Created the new EIGEN_TEST_CUDA_CLANG option to compile the CUDA tests using clang instead of nvcc	2016-04-08 13:16:08 -07:00
debug	Make gdb pretty printer Python3-compatible (bug #800).	2014-04-28 14:10:22 +01:00
demos	Fixed compilation error due to obsolete internal::abs and internal::sqrt function calls	2014-03-26 22:02:48 -04:00
doc	Update doc regarding the genericity of EIGEN_USE_BLAS	2016-04-11 17:16:07 +02:00
Eigen	Fixed a couple of bugs related to the Pascal family of GPUs	2016-05-11 23:02:26 -07:00
failtest	Add unit tests for bug #981: valid and invalid usage of ternary operator	2015-09-09 11:38:25 +02:00
lapack	Workaround "misleading-indentation" warnings	2016-05-11 08:41:36 +02:00
scripts	Fix help output of buildtests and check scripts	2016-05-11 19:39:09 +02:00
test	Split unit test	2016-05-11 19:41:53 +02:00
unsupported	Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions.	2016-05-16 08:55:21 -07:00
.hgeol	Added a pattern which forces LF line endings for *.sh files.	2013-07-31 18:20:58 +02:00
.hgignore	Ignore automatically imported lapack source files	2014-10-17 15:34:39 +02:00
CMakeLists.txt	bug #1207: Add and fix logical-op warnings	2016-05-11 19:36:34 +02:00
COPYING.BSD
COPYING.GPL
COPYING.LGPL
COPYING.MINPACK
COPYING.MPL2
COPYING.README
CTestConfig.cmake	swap 3.2 <-> default CTestConfig.cmake file	2014-03-05 10:07:44 +01:00
CTestCustom.cmake.in	Reduce maximum number of warnings/errors. (they took GBs even for limited period of time)	2013-06-20 17:39:15 +02:00
eigen3.pc.in	Further fixes for CMAKE_INSTALL_PREFIX correctness	2015-11-07 21:29:24 -05:00
INSTALL
README.md	Reverted the README	2015-02-27 13:09:49 -08:00
signature_of_eigen3_matrix_library

README.md

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.