Benoit Steiner 83ef39e055 Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions.
Before:
BM_colReduction_12T/10       1000000       1949    51.29 MFlops/s
BM_colReduction_12T/80        100000      15636   409.29 MFlops/s
BM_colReduction_12T/640        20000      95100  4307.01 MFlops/s
BM_colReduction_12T/4K           500    4573423  5466.36 MFlops/s
BM_colReduction_4T/10        1000000       1867    53.56 MFlops/s
BM_colReduction_4T/80         500000       5288  1210.11 MFlops/s
BM_colReduction_4T/640         10000     106924  3830.75 MFlops/s
BM_colReduction_4T/4K            500    9946374  2513.48 MFlops/s
BM_colReduction_8T/10        1000000       1912    52.30 MFlops/s
BM_colReduction_8T/80         200000       8354   766.09 MFlops/s
BM_colReduction_8T/640         20000      85063  4815.22 MFlops/s
BM_colReduction_8T/4K            500    5445216  4591.19 MFlops/s
BM_rowReduction_12T/10       1000000       2041    48.99 MFlops/s
BM_rowReduction_12T/80        100000      15426   414.87 MFlops/s
BM_rowReduction_12T/640        50000      39117 10470.98 MFlops/s
BM_rowReduction_12T/4K           500    3034298  8239.14 MFlops/s
BM_rowReduction_4T/10        1000000       1834    54.51 MFlops/s
BM_rowReduction_4T/80         500000       5406  1183.81 MFlops/s
BM_rowReduction_4T/640         50000      35017 11697.16 MFlops/s
BM_rowReduction_4T/4K            500    3428527  7291.76 MFlops/s
BM_rowReduction_8T/10        1000000       1925    51.95 MFlops/s
BM_rowReduction_8T/80         200000       8519   751.23 MFlops/s
BM_rowReduction_8T/640         50000      33441 12248.42 MFlops/s
BM_rowReduction_8T/4K           1000    2852841  8763.19 MFlops/s


After:
BM_colReduction_12T/10      50000000         59  1678.30 MFlops/s
BM_colReduction_12T/80       5000000        725  8822.71 MFlops/s
BM_colReduction_12T/640        20000      90882  4506.93 MFlops/s
BM_colReduction_12T/4K           500    4668855  5354.63 MFlops/s
BM_colReduction_4T/10       50000000         59  1687.37 MFlops/s
BM_colReduction_4T/80        5000000        737  8681.24 MFlops/s
BM_colReduction_4T/640         50000     108637  3770.34 MFlops/s
BM_colReduction_4T/4K            500    7912954  3159.38 MFlops/s
BM_colReduction_8T/10       50000000         60  1657.21 MFlops/s
BM_colReduction_8T/80        5000000        726  8812.48 MFlops/s
BM_colReduction_8T/640         20000      91451  4478.90 MFlops/s
BM_colReduction_8T/4K            500    5441692  4594.16 MFlops/s
BM_rowReduction_12T/10      20000000         93  1065.28 MFlops/s
BM_rowReduction_12T/80       2000000        950  6730.96 MFlops/s
BM_rowReduction_12T/640        50000      38196 10723.48 MFlops/s
BM_rowReduction_12T/4K           500    3019217  8280.29 MFlops/s
BM_rowReduction_4T/10       20000000         93  1064.30 MFlops/s
BM_rowReduction_4T/80        2000000        959  6667.71 MFlops/s
BM_rowReduction_4T/640         50000      37433 10941.96 MFlops/s
BM_rowReduction_4T/4K            500    3036476  8233.23 MFlops/s
BM_rowReduction_8T/10       20000000         93  1072.47 MFlops/s
BM_rowReduction_8T/80        2000000        959  6670.04 MFlops/s
BM_rowReduction_8T/640         50000      38069 10759.37 MFlops/s
BM_rowReduction_8T/4K           1000    2758988  9061.29 MFlops/s
2016-05-16 08:55:21 -07:00
bench Added benchmarks for contraction on CPU. 2016-05-13 14:32:17 -07:00
blas Enable and fix -Wdouble-conversion warnings 2016-05-05 13:35:45 +02:00
cmake Created the new EIGEN_TEST_CUDA_CLANG option to compile the CUDA tests using clang instead of nvcc 2016-04-08 13:16:08 -07:00
debug Make gdb pretty printer Python3-compatible (bug #800). 2014-04-28 14:10:22 +01:00
demos Fixed compilation error due to obsolete internal::abs and internal::sqrt function calls 2014-03-26 22:02:48 -04:00
doc Update doc regarding the genericity of EIGEN_USE_BLAS 2016-04-11 17:16:07 +02:00
Eigen Fixed a couple of bugs related to the Pascal family of GPUs 2016-05-11 23:02:26 -07:00
failtest Add unit tests for bug #981: valid and invalid usage of ternary operator 2015-09-09 11:38:25 +02:00
lapack Workaround "misleading-indentation" warnings 2016-05-11 08:41:36 +02:00
scripts Fix help output of buildtests and check scripts 2016-05-11 19:39:09 +02:00
test Split unit test 2016-05-11 19:41:53 +02:00
unsupported Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions. 2016-05-16 08:55:21 -07:00
.hgeol Added a pattern which forces LF line endings for *.sh files. 2013-07-31 18:20:58 +02:00
.hgignore Ignore automatically imported lapack source files 2014-10-17 15:34:39 +02:00
CMakeLists.txt bug #1207: Add and fix logical-op warnings 2016-05-11 19:36:34 +02:00
COPYING.BSD
COPYING.GPL
COPYING.LGPL
COPYING.MINPACK
COPYING.MPL2
COPYING.README
CTestConfig.cmake swap 3.2 <-> default CTestConfig.cmake file 2014-03-05 10:07:44 +01:00
CTestCustom.cmake.in Reduce maximum number of warnings/errors. (they took GBs even for limited period of time) 2013-06-20 17:39:15 +02:00
eigen3.pc.in Further fixes for CMAKE_INSTALL_PREFIX correctness 2015-11-07 21:29:24 -05:00
INSTALL
README.md Reverted the README 2015-02-27 13:09:49 -08:00
signature_of_eigen3_matrix_library

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.
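
The latest commit above turns on a cost model for tensor operations and reports large speedups for small reductions (for example, `BM_colReduction_12T/10` drops from 1949 to 59 in the time column). One plausible reading is that small jobs no longer pay thread-dispatch overhead. The sketch below illustrates that general idea only; it is not Eigen's implementation. The function name `numThreadsFor`, its parameters, and the cycle constants used in the comments are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical cost model (NOT Eigen's actual code): pick a thread count
// for a job of `n` coefficients so that each thread gets enough work to
// amortize its startup overhead.
//
// cyclesPerCoeff      assumed cost of processing one coefficient
// threadStartupCycles assumed overhead of dispatching work to one thread
// maxThreads          size of the thread pool (must be >= 1)
int numThreadsFor(std::size_t n, double cyclesPerCoeff,
                  double threadStartupCycles, int maxThreads) {
  assert(maxThreads >= 1);
  assert(threadStartupCycles > 0.0);

  const double totalCycles = static_cast<double>(n) * cyclesPerCoeff;

  // Number of threads that would each receive at least one
  // "startup overhead" worth of useful work.
  const double useful = totalCycles / threadStartupCycles;

  // Clamp in floating point first so the cast to int cannot overflow.
  const double clamped = std::min(useful, static_cast<double>(maxThreads));
  return std::max(1, static_cast<int>(clamped));
}
```

With an assumed cost of 1 cycle per coefficient and 100000 cycles of startup overhead, a 100-coefficient job stays on a single thread, while a 16000000-coefficient job uses the whole 12-thread pool.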