Rasmus Munk Larsen
235e83aba6
Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions.
2016-04-14 13:57:35 -07:00
Gael Guennebaud
68897c52f3
Add extreme values to the imaginary part for SVD unit tests.
2016-04-14 22:47:30 +02:00
Gael Guennebaud
20f387fafa
Improve numerical robustness of JacobiSVD:
...
- avoid noise amplification in complex to real conversion
- compare off-diagonal entries to the current biggest diagonal entry: no need to bother about a 2x2 block containing ridiculously small entries compared to the rest of the matrix.
2016-04-14 22:46:55 +02:00
Benoit Steiner
7718749fee
Force the inlining of the << operator on half floats
2016-04-14 11:51:54 -07:00
Benoit Steiner
5379d2b594
Inline the << operator on half floats
2016-04-14 11:40:48 -07:00
Benoit Steiner
5912ad877c
Silenced a compilation warning
2016-04-14 11:40:14 -07:00
Benoit Steiner
2b6e3de02f
Added tests to validate flooring and ceiling of fp16
2016-04-14 11:39:18 -07:00
Benoit Steiner
6f23e945f6
Added simple test for numext::sqrt and numext::pow on fp16
2016-04-14 10:32:52 -07:00
Benoit Steiner
72510c80e1
Added basic test for trigonometric functions on fp16
2016-04-14 10:27:24 -07:00
Benoit Steiner
7b3d7acebe
Added support for fp16 to test_isApprox, test_isMuchSmallerThan, and test_isApproxOrLessThan
2016-04-14 10:25:50 -07:00
Benoit Steiner
5c13765ee3
Added ability to printf fp16
2016-04-14 10:24:52 -07:00
Benoit Steiner
c7167fee0e
Added support for fp16 to the sigmoid function
2016-04-14 10:08:33 -07:00
Benoit Steiner
f6003f0873
Made the test MSVC-friendly
2016-04-14 09:47:26 -07:00
Gael Guennebaud
3551dea887
Cleaning pass on rcond estimator.
2016-04-14 16:45:41 +02:00
Gael Guennebaud
d8a3bdaa24
remove useless include
2016-04-14 15:18:56 +02:00
Gael Guennebaud
d402adc3d7
Better use .data() than &coeffRef(0)
2016-04-14 15:18:08 +02:00
Gael Guennebaud
ea7087ef31
Merged in rmlarsen/eigen (pull request PR-174)
...
Add matrix condition number estimation module.
2016-04-14 15:11:33 +02:00
Benoit Steiner
36f5a10198
Properly gate the definition of the error and gamma functions for fp16
2016-04-13 18:44:48 -07:00
Benoit Steiner
10b69810d1
Improved support for trigonometric functions on GPU
2016-04-13 16:00:51 -07:00
Benoit Steiner
d6105b53b8
Added basic implementations of the lgamma, digamma, igamma, igammac, polygamma, and zeta functions for fp16
2016-04-13 15:26:02 -07:00
Gael Guennebaud
703251f10f
merge
2016-04-13 23:45:10 +02:00
Gael Guennebaud
39211ba46b
Fix JacobiSVD for complex when the complex-to-real update already gives a diagonal 2x2 block.
2016-04-13 23:43:26 +02:00
Benoit Steiner
2986253259
Cleaned up the implementation of digamma
2016-04-13 14:24:06 -07:00
Benoit Steiner
d5de1a8220
Pulled latest updates from trunk
2016-04-13 14:17:11 -07:00
Benoit Steiner
87ca15c4e8
Added support for sin, cos, tan, and tanh on fp16
2016-04-13 14:12:38 -07:00
Gael Guennebaud
2c9e4fa417
Add debug output for random unit test
2016-04-13 22:56:12 +02:00
Gael Guennebaud
7d1391d049
Turn a convergence check into a warning
2016-04-13 22:50:54 +02:00
Gael Guennebaud
feef39e2d1
Fix underflow in JacobiSVD's complex to real preconditioner
2016-04-13 22:49:51 +02:00
Gael Guennebaud
f4e12272f1
Fix corner case in unit test.
2016-04-13 22:18:02 +02:00
Gael Guennebaud
a95e1a273e
Fix warning in unit tests
2016-04-13 22:00:38 +02:00
Benoit Steiner
bf3f6688f0
Added support for computing cos, sin, tan, and tanh on GPU.
2016-04-13 11:55:08 -07:00
Benoit Steiner
473c8380ea
Added constructors to convert unsigned integers into fp16
2016-04-13 11:03:37 -07:00
Gael Guennebaud
42a3352a3b
Workaround a division by zero when outerstride==0
2016-04-13 19:02:02 +02:00
Gael Guennebaud
6f960b83ff
Make use of the is_same_dense helper instead of extract_data to detect whether inputs and outputs are the same.
2016-04-13 18:47:12 +02:00
Gael Guennebaud
b7716c0328
Fix incomplete previous patch on matrix comparison.
2016-04-13 18:32:56 +02:00
Gael Guennebaud
2630d97c62
Fix detection of same matrices when both matrices are not handled by extract_data.
2016-04-13 18:26:08 +02:00
Gael Guennebaud
512ba0ac76
Add regression unit tests for half-packet vectorization
2016-04-13 18:16:35 +02:00
Gael Guennebaud
06447e0a39
Improve half-packet vectorization logic to distinguish linear versus inner traversal modes.
2016-04-13 18:15:49 +02:00
Gael Guennebaud
bbb8854bf7
Enable half-packet in reductions.
2016-04-13 13:02:34 +02:00
Benoit Steiner
e9b12cc1f7
Fixed compilation warnings generated by clang
2016-04-12 20:53:18 -07:00
Benoit Steiner
eaeb6ca93a
Enable the benchmarks for algebraic and transcendental functions on fp16.
2016-04-12 16:29:00 -07:00
Benoit Steiner
aa1ba8bbd2
Don't put a comma at the end of an enumerator list
2016-04-12 16:28:11 -07:00
Benoit Steiner
e49945ced4
Pulled latest update from trunk
2016-04-12 14:13:41 -07:00
Benoit Steiner
25d05c4b8f
Fixed the vectorization logic test
2016-04-12 14:13:25 -07:00
Benoit Steiner
53121c0119
Turned on the contraction benchmarks for fp16
2016-04-12 14:11:52 -07:00
Gael Guennebaud
b67c983291
Enable the use of half-packet in coeff-based product.
...
For instance, Matrix4f*Vector4f is now vectorized again when using AVX.
2016-04-12 23:03:03 +02:00
Benoit Steiner
e3a184785c
Fixed the zeta test
2016-04-12 11:12:36 -07:00
Benoit Steiner
3b76df64fc
Defer the decision to vectorize tensor CUDA code to the meta kernel. This makes it possible to decide whether to vectorize depending on the capability of the target CUDA architecture. In particular, this enables us to vectorize the processing of fp16 when running on devices of compute capability >= 5.3
2016-04-12 10:58:51 -07:00
Rasmus Larsen
6498dadc2f
Merged eigen/eigen into default
2016-04-11 17:42:05 -07:00
Benoit Steiner
748c4c4599
More accurate cost estimates for exp, log, tanh, and sqrt.
2016-04-11 13:11:04 -07:00