Benoit Steiner
|
7d980d74e5
|
Started to vectorize the processing of 16-bit floats on CPU.
|
2016-05-23 15:21:40 -07:00 |
|
Benoit Steiner
|
5d51a7f12c
|
Don't optimize the processing of the last rows of a matrix-matrix product in cases that violate the assumptions made by the optimized code path.
|
2016-05-23 15:13:16 -07:00 |
|
Benoit Steiner
|
7aa5bc9558
|
Fixed a typo in the array.cpp test
|
2016-05-23 14:39:51 -07:00 |
|
Benoit Steiner
|
a09cbf9905
|
Merged in rmlarsen/eigen (pull request PR-188)
Minor cleanups: 1. Get rid of a few unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.
|
2016-05-23 12:55:12 -07:00 |
|
Christoph Hertzberg
|
88654762da
|
Replace multiple constructors of half-type by a generic/templated constructor. This fixes an incompatibility with long double, exposed by the previous commit.
|
2016-05-23 10:03:03 +02:00 |
|
Christoph Hertzberg
|
718521d5cf
|
Silenced several double-promotion warnings
|
2016-05-22 18:17:04 +02:00 |
|
Christoph Hertzberg
|
b5a7603822
|
Fixed macro name
|
2016-05-22 16:49:29 +02:00 |
|
Christoph Hertzberg
|
25a03c02d6
|
Fix some sign-compare warnings
|
2016-05-22 16:42:27 +02:00 |
|
Christoph Hertzberg
|
0851d5d210
|
Identify clang++ even if it is not named llvm-clang++
|
2016-05-22 15:21:14 +02:00 |
|
Gael Guennebaud
|
6a15e14cda
|
Document EIGEN_MAX_CPP_VER and user-controllable compiler features.
|
2016-05-20 15:26:09 +02:00 |
|
Gael Guennebaud
|
ccaace03c9
|
Make EIGEN_HAS_CONSTEXPR user configurable
|
2016-05-20 15:10:08 +02:00 |
|
Gael Guennebaud
|
c3410804cd
|
Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable
|
2016-05-20 15:05:38 +02:00 |
|
Gael Guennebaud
|
abd1c1af7a
|
Make EIGEN_HAS_STD_RESULT_OF user configurable
|
2016-05-20 15:01:27 +02:00 |
|
Gael Guennebaud
|
1395056fc0
|
Make EIGEN_HAS_C99_MATH user configurable
|
2016-05-20 14:58:19 +02:00 |
|
Gael Guennebaud
|
48bf5ec216
|
Make EIGEN_HAS_RVALUE_REFERENCES user configurable
|
2016-05-20 14:54:20 +02:00 |
|
Gael Guennebaud
|
f43ae88892
|
Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES
|
2016-05-20 14:48:51 +02:00 |
|
Gael Guennebaud
|
8d6bd5691b
|
polygamma is C99/C++11 only
|
2016-05-20 14:45:33 +02:00 |
|
Gael Guennebaud
|
998f2efc58
|
Add an EIGEN_MAX_CPP_VER option to limit the C++ version to be used.
|
2016-05-20 14:44:28 +02:00 |
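Taken together, the configuration commits above expose one mechanism: a user can cap the C++ feature level Eigen detects, or override individual feature macros, by defining them before including any Eigen header. A minimal sketch, assuming an Eigen 3.3-era header and the macro names listed in these commits:

    // These defines must appear before the first Eigen include.
    #define EIGEN_MAX_CPP_VER 11      // pretend the compiler offers at most C++11
    #define EIGEN_HAS_CONSTEXPR 0     // and force one feature flag off explicitly
    #include <Eigen/Core>

    int main() {
      Eigen::Matrix3f m = Eigen::Matrix3f::Identity();
      return static_cast<int>(m(0, 0)) - 1;   // returns 0
    }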
|
Gael Guennebaud
|
c028d96089
|
Improve doc of special math functions
|
2016-05-20 14:18:48 +02:00 |
|
Gael Guennebaud
|
0ba32f99bd
|
Rename UniformRandom to UnitRandom.
|
2016-05-20 13:21:34 +02:00 |
|
Gael Guennebaud
|
7a9d9cde94
|
Fix coding practice in Quaternion::UniformRandom
|
2016-05-20 13:19:52 +02:00 |
|
Joseph Mirabel
|
eb0cc2573a
|
bug #823: add static method to Quaternion for uniform random rotations.
|
2016-05-20 13:15:40 +02:00 |
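For reference, drawing a uniformly distributed random rotation with the new static method (shown under the UnitRandom name it carries after the rename above); a minimal sketch:

    #include <Eigen/Geometry>
    #include <iostream>

    int main() {
      // A unit quaternion uniformly distributed over SO(3).
      Eigen::Quaternionf q = Eigen::Quaternionf::UnitRandom();
      std::cout << q.coeffs().transpose() << std::endl;   // x y z w, unit norm
      return 0;
    }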
|
Gael Guennebaud
|
2f656ce447
|
Remove std:: to enable custom scalar types.
|
2016-05-19 23:13:47 +02:00 |
|
Rasmus Larsen
|
b1e080c752
|
Merged eigen/eigen into default
|
2016-05-18 15:21:50 -07:00 |
|
Rasmus Munk Larsen
|
5624219b6b
|
Merge.
|
2016-05-18 15:16:06 -07:00 |
|
Rasmus Munk Larsen
|
7df811cfe5
|
Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.
|
2016-05-18 15:09:48 -07:00 |
|
Benoit Steiner
|
bb3ff8e9d9
|
Advertise the packet API of the tensor reducers iff the corresponding packet primitives are available.
|
2016-05-18 14:52:49 -07:00 |
|
Gael Guennebaud
|
84df9142e7
|
bug #1231: fix compilation regression regarding complex_array/=real_array and add respective unit tests
|
2016-05-18 23:00:13 +02:00 |
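The expression that regressed, for reference; a minimal sketch of the mixed complex/real element-wise division the new unit tests cover:

    #include <Eigen/Core>

    int main() {
      Eigen::ArrayXcf c = Eigen::ArrayXcf::Random(8);
      Eigen::ArrayXf  r = Eigen::ArrayXf::Random(8).abs() + 1.0f;  // keep the divisor away from zero
      c /= r;   // complex array divided element-wise by a real array
      return 0;
    }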
|
Gael Guennebaud
|
21d692d054
|
Use coeff(i,j) instead of operator().
|
2016-05-18 17:09:20 +02:00 |
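The difference in a nutshell (a generic sketch, not the code touched by the commit): operator() carries a bounds assertion in debug builds, while coeff() assumes the indices are already known to be valid, which is why library-internal code prefers it.

    #include <Eigen/Dense>

    int main() {
      Eigen::MatrixXd m = Eigen::MatrixXd::Identity(3, 3);
      double a = m(1, 2);        // range-checked via eigen_assert in debug builds
      double b = m.coeff(1, 2);  // no user-level assertion; the caller guarantees validity
      return static_cast<int>(a + b);   // both coefficients are 0 here
    }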
|
Gael Guennebaud
|
8456bbbadb
|
bug #1224: fix regression in (dense*dense).sparseView() by specializing evaluator<SparseView<Product>> for sparse products only.
|
2016-05-18 16:53:28 +02:00 |
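The expression from bug #1224, for reference; a minimal sketch:

    #include <Eigen/Dense>
    #include <Eigen/SparseCore>

    int main() {
      Eigen::MatrixXd A = Eigen::MatrixXd::Random(4, 4);
      Eigen::MatrixXd B = Eigen::MatrixXd::Random(4, 4);
      // Taking a sparse view of a dense*dense product is the case that regressed;
      // the fix restricts the evaluator specialization to sparse products.
      Eigen::SparseMatrix<double> S = (A * B).sparseView();
      return 0;
    }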
|
Gael Guennebaud
|
b507b82326
|
Use default sorting strategy for square products.
|
2016-05-18 16:51:54 +02:00 |
|
Gael Guennebaud
|
1fa15ceee6
|
Extend sparse*sparse product unit test to check that the expected implementation is used (conservative vs auto pruning).
|
2016-05-18 16:50:54 +02:00 |
|
Gael Guennebaud
|
548a487800
|
bug #1229: bypass usage of Derived::Options which is available for plain matrix types only. Better use column-major storage anyway.
|
2016-05-18 16:44:05 +02:00 |
|
Gael Guennebaud
|
43790e009b
|
Pass argument by const ref instead of by value in pow(AutoDiffScalar...)
|
2016-05-18 16:28:02 +02:00 |
|
Gael Guennebaud
|
1fbfab27a9
|
bug #1223: fix compilation of AutoDiffScalar's min/max operators, and add regression unit test.
|
2016-05-18 16:26:26 +02:00 |
|
Gael Guennebaud
|
448d9d943c
|
bug #1222: fix compilation in AutoDiffScalar and add respective unit test
|
2016-05-18 16:00:11 +02:00 |
|
Gael Guennebaud
|
5a71eb5985
|
Bug #1213: add regression unit test.
|
2016-05-18 14:03:03 +02:00 |
|
Gael Guennebaud
|
747e3290c0
|
bug #1213: rename some enum types for consistency.
|
2016-05-18 13:26:56 +02:00 |
|
Rasmus Munk Larsen
|
f519fca72b
|
Reduce overhead for small tensors and cheap ops by short-circuiting the cost computation and block size calculation in parallelFor.
|
2016-05-17 16:06:00 -07:00 |
|
Benoit Steiner
|
86ae94462e
|
#if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if !defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non-blocking thread pool is now the default since it is more scalable, and the old thread pool has to be requested explicitly.
|
2016-05-17 14:06:15 -07:00 |
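In practice this flips the default: code that still wants the old pool now has to opt in. A minimal sketch, assuming the unsupported Tensor module and its ThreadPoolDevice API together with the macros quoted in the commit:

    #define EIGEN_USE_SIMPLE_THREAD_POOL   // opt back into the legacy pool
    #define EIGEN_USE_THREADS
    #include <unsupported/Eigen/CXX11/Tensor>

    int main() {
      Eigen::ThreadPool pool(4);                    // 4 worker threads
      Eigen::ThreadPoolDevice device(&pool, 4);
      Eigen::Tensor<float, 1> a(1024), b(1024);
      a.setRandom();
      b.device(device) = a * 2.0f;                  // expression evaluated on the pool
      return 0;
    }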
|
Benoit Steiner
|
997c335970
|
Fixed compilation error
|
2016-05-17 12:54:18 -07:00 |
|
Benoit Steiner
|
ebf6ada5ee
|
Fixed compilation error in the tensor thread pool
|
2016-05-17 12:33:46 -07:00 |
|
Rasmus Munk Larsen
|
0bb61b04ca
|
Merge upstream.
|
2016-05-17 10:26:10 -07:00 |
|
Rasmus Munk Larsen
|
0dbd68145f
|
Roll back changes to core. Move include of TensorFunctors.h up to satisfy a dependency in TensorCostModel.h.
|
2016-05-17 10:25:19 -07:00 |
|
Rasmus Larsen
|
00228f2506
|
Merged eigen/eigen into default
|
2016-05-17 09:49:31 -07:00 |
|
Benoit Steiner
|
e7e64c3277
|
Enable the use of the packet API to evaluate tensor broadcasts. This speeds things up quite a bit:
Before:
BM_broadcasting/10 500000 3690 27.10 MFlops/s
BM_broadcasting/80 500000 4014 1594.24 MFlops/s
BM_broadcasting/640 100000 14770 27731.35 MFlops/s
BM_broadcasting/4K 5000 632711 39512.48 MFlops/s
After:
BM_broadcasting/10 500000 4287 23.33 MFlops/s
BM_broadcasting/80 500000 4455 1436.41 MFlops/s
BM_broadcasting/640 200000 10195 40173.01 MFlops/s
BM_broadcasting/4K 5000 423746 58997.57 MFlops/s
|
2016-05-17 09:24:35 -07:00 |
|
Benoit Steiner
|
5fa27574dd
|
Allow vectorized padding on GPU. This helps speed things up a little:
Before:
BM_padding/10 5000000 460 217.03 MFlops/s
BM_padding/80 5000000 460 13899.40 MFlops/s
BM_padding/640 5000000 461 888421.17 MFlops/s
BM_padding/4K 5000000 460 54316322.55 MFlops/s
After:
BM_padding/10 5000000 454 220.20 MFlops/s
BM_padding/80 5000000 455 14039.86 MFlops/s
BM_padding/640 5000000 452 904968.83 MFlops/s
BM_padding/4K 5000000 411 60750049.21 MFlops/s
|
2016-05-17 09:17:26 -07:00 |
|
Benoit Steiner
|
86da77cb9b
|
Pulled latest updates from trunk.
|
2016-05-17 07:21:48 -07:00 |
|
Benoit Steiner
|
92fc6add43
|
Don't rely on C++11 extensions when we don't have to.
|
2016-05-17 07:21:22 -07:00 |
|
Benoit Steiner
|
2d74ef9682
|
Avoid float to double conversion
|
2016-05-17 07:20:11 -07:00 |
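The pattern behind this fix and the earlier double-promotion warnings, as a generic sketch (not the actual code touched):

    float scale_bad(float x)  { return x * 0.5; }   // 0.5 is a double: x is promoted,
                                                    // the product is computed in double,
                                                    // then narrowed back to float
    float scale_good(float x) { return x * 0.5f; }  // stays in single precision throughout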
|