Rasmus Munk Larsen
|
235e83aba6
|
Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions.
|
2016-04-14 13:57:35 -07:00 |
|
Gael Guennebaud
|
3551dea887
|
Cleaning pass on rcond estimator.
|
2016-04-14 16:45:41 +02:00 |
|
Gael Guennebaud
|
d8a3bdaa24
|
remove useless include
|
2016-04-14 15:18:56 +02:00 |
|
Gael Guennebaud
|
d402adc3d7
|
Better use .data() than &coeffRef(0)
|
2016-04-14 15:18:08 +02:00 |
|
Gael Guennebaud
|
ea7087ef31
|
Merged in rmlarsen/eigen (pull request PR-174)
Add matrix condition number estimation module.
|
2016-04-14 15:11:33 +02:00 |
|
Benoit Steiner
|
36f5a10198
|
Properly gate the definition of the error and gamma functions for fp16
|
2016-04-13 18:44:48 -07:00 |
|
Benoit Steiner
|
10b69810d1
|
Improved support for trigonometric functions on GPU
|
2016-04-13 16:00:51 -07:00 |
|
Benoit Steiner
|
d6105b53b8
|
Added basic implementations of the lgamma, digamma, igamma, igammac, polygamma, and zeta functions for fp16
|
2016-04-13 15:26:02 -07:00 |
|
Gael Guennebaud
|
703251f10f
|
merge
|
2016-04-13 23:45:10 +02:00 |
|
Gael Guennebaud
|
39211ba46b
|
Fix JacobiSVD for complex when the complex-to-real update already gives a diagonal 2x2 block.
|
2016-04-13 23:43:26 +02:00 |
|
Benoit Steiner
|
2986253259
|
Cleaned up the implementation of digamma
|
2016-04-13 14:24:06 -07:00 |
|
Benoit Steiner
|
d5de1a8220
|
Pulled latest updates from trunk
|
2016-04-13 14:17:11 -07:00 |
|
Benoit Steiner
|
87ca15c4e8
|
Added support for sin, cos, tan, and tanh on fp16
|
2016-04-13 14:12:38 -07:00 |
|
Gael Guennebaud
|
2c9e4fa417
|
Add debug output for random unit test
|
2016-04-13 22:56:12 +02:00 |
|
Gael Guennebaud
|
7d1391d049
|
Turn a convergence check into a warning
|
2016-04-13 22:50:54 +02:00 |
|
Gael Guennebaud
|
feef39e2d1
|
Fix underflow in JacobiSVD's complex-to-real preconditioner
|
2016-04-13 22:49:51 +02:00 |
|
Gael Guennebaud
|
f4e12272f1
|
Fix corner case in unit test.
|
2016-04-13 22:18:02 +02:00 |
|
Gael Guennebaud
|
a95e1a273e
|
Fix warning in unit tests
|
2016-04-13 22:00:38 +02:00 |
|
Benoit Steiner
|
bf3f6688f0
|
Added support for computing cos, sin, tan, and tanh on GPU.
|
2016-04-13 11:55:08 -07:00 |
|
Benoit Steiner
|
473c8380ea
|
Added constructors to convert unsigned integers into fp16
|
2016-04-13 11:03:37 -07:00 |
|
Gael Guennebaud
|
42a3352a3b
|
Workaround a division by zero when outerstride==0
|
2016-04-13 19:02:02 +02:00 |
|
Gael Guennebaud
|
6f960b83ff
|
Make use of the is_same_dense helper instead of extract_data to detect whether input and output are the same.
|
2016-04-13 18:47:12 +02:00 |
|
Gael Guennebaud
|
b7716c0328
|
Fix incomplete previous patch on matrix comparison.
|
2016-04-13 18:32:56 +02:00 |
|
Gael Guennebaud
|
2630d97c62
|
Fix detection of same matrices when both matrices are not handled by extract_data.
|
2016-04-13 18:26:08 +02:00 |
|
Gael Guennebaud
|
512ba0ac76
|
Add regression unit tests for half-packet vectorization
|
2016-04-13 18:16:35 +02:00 |
|
Gael Guennebaud
|
06447e0a39
|
Improve half-packet vectorization logic to distinguish linear versus inner traversal modes.
|
2016-04-13 18:15:49 +02:00 |
|
Gael Guennebaud
|
bbb8854bf7
|
Enable half-packet in reductions.
|
2016-04-13 13:02:34 +02:00 |
|
Benoit Steiner
|
e9b12cc1f7
|
Fixed compilation warnings generated by clang
|
2016-04-12 20:53:18 -07:00 |
|
Benoit Steiner
|
eaeb6ca93a
|
Enable the benchmarks for algebraic and transcendental functions on fp16.
|
2016-04-12 16:29:00 -07:00 |
|
Benoit Steiner
|
aa1ba8bbd2
|
Don't put a comma at the end of an enumerator list
|
2016-04-12 16:28:11 -07:00 |
|
Benoit Steiner
|
e49945ced4
|
Pulled latest update from trunk
|
2016-04-12 14:13:41 -07:00 |
|
Benoit Steiner
|
25d05c4b8f
|
Fixed the vectorization logic test
|
2016-04-12 14:13:25 -07:00 |
|
Benoit Steiner
|
53121c0119
|
Turned on the contraction benchmarks for fp16
|
2016-04-12 14:11:52 -07:00 |
|
Gael Guennebaud
|
b67c983291
|
Enable the use of half-packet in coeff-based product.
For instance, Matrix4f*Vector4f is now vectorized again when using AVX.
|
2016-04-12 23:03:03 +02:00 |
|
Benoit Steiner
|
e3a184785c
|
Fixed the zeta test
|
2016-04-12 11:12:36 -07:00 |
|
Benoit Steiner
|
3b76df64fc
|
Defer the decision to vectorize tensor CUDA code to the meta kernel. This makes it possible to decide whether to vectorize depending on the capability of the target CUDA architecture. In particular, this enables us to vectorize the processing of fp16 when running on devices of compute capability >= 5.3
|
2016-04-12 10:58:51 -07:00 |
|
Rasmus Larsen
|
6498dadc2f
|
Merged eigen/eigen into default
|
2016-04-11 17:42:05 -07:00 |
|
Benoit Steiner
|
748c4c4599
|
More accurate cost estimates for exp, log, tanh, and sqrt.
|
2016-04-11 13:11:04 -07:00 |
|
Benoit Steiner
|
833efb39bf
|
Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16
|
2016-04-11 11:03:56 -07:00 |
|
Benoit Steiner
|
e939b087fe
|
Pulled latest update from trunk
|
2016-04-11 11:03:02 -07:00 |
|
Gael Guennebaud
|
1744b5b5d2
|
Update doc regarding the genericity of EIGEN_USE_BLAS
|
2016-04-11 17:16:07 +02:00 |
|
Gael Guennebaud
|
91bf925fc1
|
Improve constness of level2 blas API.
|
2016-04-11 17:13:01 +02:00 |
|
Gael Guennebaud
|
0483430283
|
Move LAPACK declarations from blas.h to lapack.h and fix compatibility with EIGEN_USE_MKL
|
2016-04-11 17:12:31 +02:00 |
|
Gael Guennebaud
|
097d1e8823
|
Cleanup obsolete assign_scalar_eig2mkl helper.
|
2016-04-11 16:09:29 +02:00 |
|
Gael Guennebaud
|
fec4c334ba
|
Remove all references to MKL in BLAS wrappers.
|
2016-04-11 16:04:09 +02:00 |
|
Gael Guennebaud
|
ddabc992fa
|
Fix long to int conversion in BLAS API.
|
2016-04-11 15:52:01 +02:00 |
|
Gael Guennebaud
|
8191f373be
|
Silence unused warning.
|
2016-04-11 15:37:16 +02:00 |
|
Gael Guennebaud
|
6a9ca88e7e
|
Relax dependency on MKL for EIGEN_USE_BLAS
|
2016-04-11 15:17:14 +02:00 |
|
Gael Guennebaud
|
4e8e5888d7
|
Improve constness of blas level-3 interface.
|
2016-04-11 15:12:44 +02:00 |
|
Gael Guennebaud
|
675e0a2224
|
Fix static/inline keywords order.
|
2016-04-11 15:06:20 +02:00 |
|