Gael Guennebaud
|
998f2efc58
|
Add an EIGEN_MAX_CPP_VER option to limit the C++ version to be used.
|
2016-05-20 14:44:28 +02:00 |
|
Gael Guennebaud
|
c028d96089
|
Improve doc of special math functions
|
2016-05-20 14:18:48 +02:00 |
|
Gael Guennebaud
|
0ba32f99bd
|
Rename UniformRandom to UnitRandom.
|
2016-05-20 13:21:34 +02:00 |
|
Gael Guennebaud
|
7a9d9cde94
|
Fix coding practice in Quaternion::UniformRandom
|
2016-05-20 13:19:52 +02:00 |
|
Joseph Mirabel
|
eb0cc2573a
|
bug #823: add static method to Quaternion for uniform random rotations.
|
2016-05-20 13:15:40 +02:00 |
|
Gael Guennebaud
|
2f656ce447
|
Remove std:: to enable custom scalar types.
|
2016-05-19 23:13:47 +02:00 |
|
Rasmus Larsen
|
b1e080c752
|
Merged eigen/eigen into default
|
2016-05-18 15:21:50 -07:00 |
|
Rasmus Munk Larsen
|
5624219b6b
|
Merge.
|
2016-05-18 15:16:06 -07:00 |
|
Rasmus Munk Larsen
|
7df811cfe5
|
Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.
|
2016-05-18 15:09:48 -07:00 |
|
Benoit Steiner
|
bb3ff8e9d9
|
Advertise the packet api of the tensor reducers iff the corresponding packet primitives are available.
|
2016-05-18 14:52:49 -07:00 |
|
Gael Guennebaud
|
84df9142e7
|
bug #1231: fix compilation regression regarding complex_array/=real_array and add respective unit tests
|
2016-05-18 23:00:13 +02:00 |
|
Gael Guennebaud
|
21d692d054
|
Use coeff(i,j) instead of operator().
|
2016-05-18 17:09:20 +02:00 |
|
Gael Guennebaud
|
8456bbbadb
|
bug #1224: fix regression in (dense*dense).sparseView() by specializing evaluator<SparseView<Product>> for sparse products only.
|
2016-05-18 16:53:28 +02:00 |
|
Gael Guennebaud
|
b507b82326
|
Use default sorting strategy for square products.
|
2016-05-18 16:51:54 +02:00 |
|
Gael Guennebaud
|
1fa15ceee6
|
Extend sparse*sparse product unit test to check that the expected implementation is used (conservative vs auto pruning).
|
2016-05-18 16:50:54 +02:00 |
|
Gael Guennebaud
|
548a487800
|
bug #1229: bypass usage of Derived::Options which is available for plain matrix types only. Better use column-major storage anyway.
|
2016-05-18 16:44:05 +02:00 |
|
Gael Guennebaud
|
43790e009b
|
Pass argument by const ref instead of by value in pow(AutoDiffScalar...)
|
2016-05-18 16:28:02 +02:00 |
|
Gael Guennebaud
|
1fbfab27a9
|
bug #1223: fix compilation of AutoDiffScalar's min/max operators, and add regression unit test.
|
2016-05-18 16:26:26 +02:00 |
|
Gael Guennebaud
|
448d9d943c
|
bug #1222: fix compilation in AutoDiffScalar and add respective unit test
|
2016-05-18 16:00:11 +02:00 |
|
Gael Guennebaud
|
5a71eb5985
|
bug #1213: add regression unit test.
|
2016-05-18 14:03:03 +02:00 |
|
Gael Guennebaud
|
747e3290c0
|
bug #1213: rename some enum types for consistency.
|
2016-05-18 13:26:56 +02:00 |
|
Rasmus Munk Larsen
|
f519fca72b
|
Reduce overhead for small tensors and cheap ops by short-circuiting the const computation and block size calculation in parallelFor.
|
2016-05-17 16:06:00 -07:00 |
|
Benoit Steiner
|
86ae94462e
|
#if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if !defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non blocking thread pool is the default since it's more scalable, and one needs to request the old thread pool explicitly.
|
2016-05-17 14:06:15 -07:00 |
|
Benoit Steiner
|
997c335970
|
Fixed compilation error
|
2016-05-17 12:54:18 -07:00 |
|
Benoit Steiner
|
ebf6ada5ee
|
Fixed compilation error in the tensor thread pool
|
2016-05-17 12:33:46 -07:00 |
|
Rasmus Munk Larsen
|
0bb61b04ca
|
Merge upstream.
|
2016-05-17 10:26:10 -07:00 |
|
Rasmus Munk Larsen
|
0dbd68145f
|
Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h.
|
2016-05-17 10:25:19 -07:00 |
|
Rasmus Larsen
|
00228f2506
|
Merged eigen/eigen into default
|
2016-05-17 09:49:31 -07:00 |
|
Benoit Steiner
|
e7e64c3277
|
Enable the use of the packet api to evaluate tensor broadcasts. This speeds things up quite a bit:
Before:
BM_broadcasting/10 500000 3690 27.10 MFlops/s
BM_broadcasting/80 500000 4014 1594.24 MFlops/s
BM_broadcasting/640 100000 14770 27731.35 MFlops/s
BM_broadcasting/4K 5000 632711 39512.48 MFlops/s
After:
BM_broadcasting/10 500000 4287 23.33 MFlops/s
BM_broadcasting/80 500000 4455 1436.41 MFlops/s
BM_broadcasting/640 200000 10195 40173.01 MFlops/s
BM_broadcasting/4K 5000 423746 58997.57 MFlops/s
|
2016-05-17 09:24:35 -07:00 |
|
Benoit Steiner
|
5fa27574dd
|
Allow vectorized padding on GPU. This helps speed things up a little.
Before:
BM_padding/10 5000000 460 217.03 MFlops/s
BM_padding/80 5000000 460 13899.40 MFlops/s
BM_padding/640 5000000 461 888421.17 MFlops/s
BM_padding/4K 5000000 460 54316322.55 MFlops/s
After:
BM_padding/10 5000000 454 220.20 MFlops/s
BM_padding/80 5000000 455 14039.86 MFlops/s
BM_padding/640 5000000 452 904968.83 MFlops/s
BM_padding/4K 5000000 411 60750049.21 MFlops/s
|
2016-05-17 09:17:26 -07:00 |
|
Benoit Steiner
|
a910bcee43
|
Merged latest updates from trunk
|
2016-05-17 09:14:22 -07:00 |
|
Benoit Steiner
|
8d06c02ffd
|
Allow vectorized padding on GPU. This helps speed things up a little.
Before:
BM_padding/10 5000000 460 217.03 MFlops/s
BM_padding/80 5000000 460 13899.40 MFlops/s
BM_padding/640 5000000 461 888421.17 MFlops/s
BM_padding/4K 5000000 460 54316322.55 MFlops/s
After:
BM_padding/10 5000000 454 220.20 MFlops/s
BM_padding/80 5000000 455 14039.86 MFlops/s
BM_padding/640 5000000 452 904968.83 MFlops/s
BM_padding/4K 5000000 411 60750049.21 MFlops/s
|
2016-05-17 09:13:27 -07:00 |
|
Benoit Steiner
|
86da77cb9b
|
Pulled latest updates from trunk.
|
2016-05-17 07:21:48 -07:00 |
|
Benoit Steiner
|
92fc6add43
|
Don't rely on c++11 extension when we don't have to.
|
2016-05-17 07:21:22 -07:00 |
|
Benoit Steiner
|
2d74ef9682
|
Avoid float to double conversion
|
2016-05-17 07:20:11 -07:00 |
|
David Dement
|
ccc7563ac5
|
Fix the GMRES solver so that it correctly reports the error achieved in the solution process.
|
2016-05-16 14:26:41 -04:00 |
|
Gael Guennebaud
|
575bc44c3f
|
Fix unit test.
|
2016-05-19 22:48:16 +02:00 |
|
Gael Guennebaud
|
ccb408ee6a
|
Improve unit tests of zeta, polygamma, and digamma
|
2016-05-19 18:34:41 +02:00 |
|
Gael Guennebaud
|
6761c64d60
|
zeta and polygamma are not unary functions, but binary ones.
|
2016-05-19 18:34:16 +02:00 |
|
Gael Guennebaud
|
7a54032408
|
zeta and digamma do not require C++11/C99
|
2016-05-19 17:36:47 +02:00 |
|
Gael Guennebaud
|
ce12562710
|
Add some c++11 flags in documentation
|
2016-05-19 17:35:30 +02:00 |
|
Gael Guennebaud
|
b6ed8244b4
|
bug #1201: optimize affine*vector products
|
2016-05-19 16:09:15 +02:00 |
|
Gael Guennebaud
|
73693b5de6
|
bug #1221: disable gcc 6 warning: ignoring attributes on template argument
|
2016-05-19 15:21:53 +02:00 |
|
Gael Guennebaud
|
df9a5e13c6
|
Fix SelfAdjointEigenSolver for some input expression types, and add new regression unit tests for sparse and selfadjointview inputs.
|
2016-05-19 13:07:33 +02:00 |
|
Gael Guennebaud
|
6a2916df80
|
DiagonalWrapper is a vector, so it must expose the LinearAccessBit flag.
|
2016-05-19 13:06:21 +02:00 |
|
Gael Guennebaud
|
a226f6af6b
|
Add support for SelfAdjointView::diagonal()
|
2016-05-19 13:05:33 +02:00 |
|
Gael Guennebaud
|
ee7da3c7c5
|
Fix SelfAdjointView::triangularView for complexes.
|
2016-05-19 13:01:51 +02:00 |
|
Gael Guennebaud
|
b6b8578a67
|
bug #1230: add support for SelfAdjointView::triangularView.
|
2016-05-19 11:36:38 +02:00 |
|
Benoit Steiner
|
a80d875916
|
Added missing costPerCoeff method
|
2016-05-16 09:31:10 -07:00 |
|
Benoit Steiner
|
83ef39e055
|
Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions.
Before:
BM_colReduction_12T/10 1000000 1949 51.29 MFlops/s
BM_colReduction_12T/80 100000 15636 409.29 MFlops/s
BM_colReduction_12T/640 20000 95100 4307.01 MFlops/s
BM_colReduction_12T/4K 500 4573423 5466.36 MFlops/s
BM_colReduction_4T/10 1000000 1867 53.56 MFlops/s
BM_colReduction_4T/80 500000 5288 1210.11 MFlops/s
BM_colReduction_4T/640 10000 106924 3830.75 MFlops/s
BM_colReduction_4T/4K 500 9946374 2513.48 MFlops/s
BM_colReduction_8T/10 1000000 1912 52.30 MFlops/s
BM_colReduction_8T/80 200000 8354 766.09 MFlops/s
BM_colReduction_8T/640 20000 85063 4815.22 MFlops/s
BM_colReduction_8T/4K 500 5445216 4591.19 MFlops/s
BM_rowReduction_12T/10 1000000 2041 48.99 MFlops/s
BM_rowReduction_12T/80 100000 15426 414.87 MFlops/s
BM_rowReduction_12T/640 50000 39117 10470.98 MFlops/s
BM_rowReduction_12T/4K 500 3034298 8239.14 MFlops/s
BM_rowReduction_4T/10 1000000 1834 54.51 MFlops/s
BM_rowReduction_4T/80 500000 5406 1183.81 MFlops/s
BM_rowReduction_4T/640 50000 35017 11697.16 MFlops/s
BM_rowReduction_4T/4K 500 3428527 7291.76 MFlops/s
BM_rowReduction_8T/10 1000000 1925 51.95 MFlops/s
BM_rowReduction_8T/80 200000 8519 751.23 MFlops/s
BM_rowReduction_8T/640 50000 33441 12248.42 MFlops/s
BM_rowReduction_8T/4K 1000 2852841 8763.19 MFlops/s
After:
BM_colReduction_12T/10 50000000 59 1678.30 MFlops/s
BM_colReduction_12T/80 5000000 725 8822.71 MFlops/s
BM_colReduction_12T/640 20000 90882 4506.93 MFlops/s
BM_colReduction_12T/4K 500 4668855 5354.63 MFlops/s
BM_colReduction_4T/10 50000000 59 1687.37 MFlops/s
BM_colReduction_4T/80 5000000 737 8681.24 MFlops/s
BM_colReduction_4T/640 50000 108637 3770.34 MFlops/s
BM_colReduction_4T/4K 500 7912954 3159.38 MFlops/s
BM_colReduction_8T/10 50000000 60 1657.21 MFlops/s
BM_colReduction_8T/80 5000000 726 8812.48 MFlops/s
BM_colReduction_8T/640 20000 91451 4478.90 MFlops/s
BM_colReduction_8T/4K 500 5441692 4594.16 MFlops/s
BM_rowReduction_12T/10 20000000 93 1065.28 MFlops/s
BM_rowReduction_12T/80 2000000 950 6730.96 MFlops/s
BM_rowReduction_12T/640 50000 38196 10723.48 MFlops/s
BM_rowReduction_12T/4K 500 3019217 8280.29 MFlops/s
BM_rowReduction_4T/10 20000000 93 1064.30 MFlops/s
BM_rowReduction_4T/80 2000000 959 6667.71 MFlops/s
BM_rowReduction_4T/640 50000 37433 10941.96 MFlops/s
BM_rowReduction_4T/4K 500 3036476 8233.23 MFlops/s
BM_rowReduction_8T/10 20000000 93 1072.47 MFlops/s
BM_rowReduction_8T/80 2000000 959 6670.04 MFlops/s
BM_rowReduction_8T/640 50000 38069 10759.37 MFlops/s
BM_rowReduction_8T/4K 1000 2758988 9061.29 MFlops/s
|
2016-05-16 08:55:21 -07:00 |
|