Benoit Steiner
|
a8e8837ba7
|
Added tests for the non-blocking thread pool
|
2016-04-14 15:23:49 -07:00 |
|
Benoit Steiner
|
78a51abc12
|
Added a more scalable non-blocking thread pool
|
2016-04-14 15:23:10 -07:00 |
|
Benoit Steiner
|
7718749fee
|
Force the inlining of the << operator on half floats
|
2016-04-14 11:51:54 -07:00 |
|
Benoit Steiner
|
5379d2b594
|
Inline the << operator on half floats
|
2016-04-14 11:40:48 -07:00 |
|
Benoit Steiner
|
5912ad877c
|
Silenced a compilation warning
|
2016-04-14 11:40:14 -07:00 |
|
Benoit Steiner
|
2b6e3de02f
|
Added tests to validate flooring and ceiling of fp16
|
2016-04-14 11:39:18 -07:00 |
|
Benoit Steiner
|
6f23e945f6
|
Added simple test for numext::sqrt and numext::pow on fp16
|
2016-04-14 10:32:52 -07:00 |
|
Benoit Steiner
|
72510c80e1
|
Added basic test for trigonometric functions on fp16
|
2016-04-14 10:27:24 -07:00 |
|
Benoit Steiner
|
7b3d7acebe
|
Added support for fp16 to test_isApprox, test_isMuchSmallerThan, and test_isApproxOrLessThan
|
2016-04-14 10:25:50 -07:00 |
|
Benoit Steiner
|
5c13765ee3
|
Added ability to printf fp16
|
2016-04-14 10:24:52 -07:00 |
|
Benoit Steiner
|
c7167fee0e
|
Added support for fp16 to the sigmoid function
|
2016-04-14 10:08:33 -07:00 |
|
Benoit Steiner
|
f6003f0873
|
Made the test msvc friendly
|
2016-04-14 09:47:26 -07:00 |
|
Gael Guennebaud
|
3551dea887
|
Cleaning pass on rcond estimator.
|
2016-04-14 16:45:41 +02:00 |
|
Gael Guennebaud
|
d8a3bdaa24
|
remove useless include
|
2016-04-14 15:18:56 +02:00 |
|
Gael Guennebaud
|
d402adc3d7
|
Better use .data() than &coeffRef(0)
|
2016-04-14 15:18:08 +02:00 |
|
Gael Guennebaud
|
ea7087ef31
|
Merged in rmlarsen/eigen (pull request PR-174)
Add matrix condition number estimation module.
|
2016-04-14 15:11:33 +02:00 |
|
Benoit Steiner
|
36f5a10198
|
Properly gate the definition of the error and gamma functions for fp16
|
2016-04-13 18:44:48 -07:00 |
|
Benoit Steiner
|
10b69810d1
|
Improved support for trigonometric functions on GPU
|
2016-04-13 16:00:51 -07:00 |
|
Benoit Steiner
|
d6105b53b8
|
Added basic implementations of the lgamma, digamma, igamma, igammac, polygamma, and zeta functions for fp16
|
2016-04-13 15:26:02 -07:00 |
|
Gael Guennebaud
|
703251f10f
|
merge
|
2016-04-13 23:45:10 +02:00 |
|
Gael Guennebaud
|
39211ba46b
|
Fix JacobiSVD for complex when the complex-to-real update already gives a diagonal 2x2 block.
|
2016-04-13 23:43:26 +02:00 |
|
Benoit Steiner
|
2986253259
|
Cleaned up the implementation of digamma
|
2016-04-13 14:24:06 -07:00 |
|
Benoit Steiner
|
d5de1a8220
|
Pulled latest updates from trunk
|
2016-04-13 14:17:11 -07:00 |
|
Benoit Steiner
|
87ca15c4e8
|
Added support for sin, cos, tan, and tanh on fp16
|
2016-04-13 14:12:38 -07:00 |
|
Gael Guennebaud
|
2c9e4fa417
|
Add debug output for random unit test
|
2016-04-13 22:56:12 +02:00 |
|
Gael Guennebaud
|
7d1391d049
|
Turn a convergence check into a warning
|
2016-04-13 22:50:54 +02:00 |
|
Gael Guennebaud
|
feef39e2d1
|
Fix underflow in JacobiSVD's complex-to-real preconditioner
|
2016-04-13 22:49:51 +02:00 |
|
Gael Guennebaud
|
f4e12272f1
|
Fix corner case in unit test.
|
2016-04-13 22:18:02 +02:00 |
|
Gael Guennebaud
|
a95e1a273e
|
Fix warning in unit tests
|
2016-04-13 22:00:38 +02:00 |
|
Benoit Steiner
|
bf3f6688f0
|
Added support for computing cos, sin, tan, and tanh on GPU.
|
2016-04-13 11:55:08 -07:00 |
|
Benoit Steiner
|
473c8380ea
|
Added constructors to convert unsigned integers into fp16
|
2016-04-13 11:03:37 -07:00 |
|
Gael Guennebaud
|
42a3352a3b
|
Work around a division by zero when outerstride==0
|
2016-04-13 19:02:02 +02:00 |
|
Gael Guennebaud
|
6f960b83ff
|
Make use of the is_same_dense helper instead of extract_data to detect whether the input and output are the same.
|
2016-04-13 18:47:12 +02:00 |
|
Gael Guennebaud
|
b7716c0328
|
Fix incomplete previous patch on matrix comparison.
|
2016-04-13 18:32:56 +02:00 |
|
Gael Guennebaud
|
2630d97c62
|
Fix detection of same matrices when both matrices are not handled by extract_data.
|
2016-04-13 18:26:08 +02:00 |
|
Gael Guennebaud
|
512ba0ac76
|
Add regression unit tests for half-packet vectorization
|
2016-04-13 18:16:35 +02:00 |
|
Gael Guennebaud
|
06447e0a39
|
Improve half-packet vectorization logic to distinguish linear versus inner traversal modes.
|
2016-04-13 18:15:49 +02:00 |
|
Gael Guennebaud
|
bbb8854bf7
|
Enable half-packet in reductions.
|
2016-04-13 13:02:34 +02:00 |
|
Benoit Steiner
|
e9b12cc1f7
|
Fixed compilation warnings generated by clang
|
2016-04-12 20:53:18 -07:00 |
|
Benoit Steiner
|
eaeb6ca93a
|
Enable the benchmarks for algebraic and transcendental functions on fp16.
|
2016-04-12 16:29:00 -07:00 |
|
Benoit Steiner
|
aa1ba8bbd2
|
Don't put a comma at the end of an enumerator list
|
2016-04-12 16:28:11 -07:00 |
|
Benoit Steiner
|
e49945ced4
|
Pulled latest update from trunk
|
2016-04-12 14:13:41 -07:00 |
|
Benoit Steiner
|
25d05c4b8f
|
Fixed the vectorization logic test
|
2016-04-12 14:13:25 -07:00 |
|
Benoit Steiner
|
53121c0119
|
Turned on the contraction benchmarks for fp16
|
2016-04-12 14:11:52 -07:00 |
|
Gael Guennebaud
|
b67c983291
|
Enable the use of half-packet in coeff-based product.
For instance, Matrix4f*Vector4f is now vectorized again when using AVX.
|
2016-04-12 23:03:03 +02:00 |
|
Benoit Steiner
|
e3a184785c
|
Fixed the zeta test
|
2016-04-12 11:12:36 -07:00 |
|
Benoit Steiner
|
3b76df64fc
|
Defer the decision to vectorize tensor CUDA code to the meta kernel. This makes it possible to decide whether to vectorize depending on the capability of the target CUDA architecture. In particular, this enables us to vectorize the processing of fp16 when running on devices of capability >= 5.3
|
2016-04-12 10:58:51 -07:00 |
|
Rasmus Larsen
|
6498dadc2f
|
Merged eigen/eigen into default
|
2016-04-11 17:42:05 -07:00 |
|
Benoit Steiner
|
748c4c4599
|
More accurate cost estimates for exp, log, tanh, and sqrt.
|
2016-04-11 13:11:04 -07:00 |
|
Benoit Steiner
|
833efb39bf
|
Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16
|
2016-04-11 11:03:56 -07:00 |
|