Commit Graph

7728 Commits

Author SHA1 Message Date
Gael Guennebaud
b67c983291 Enable the use of half-packet in coeff-based product.
For instance, Matrix4f*Vector4f is now vectorized again when using AVX.
2016-04-12 23:03:03 +02:00
Benoit Steiner
e3a184785c Fixed the zeta test 2016-04-12 11:12:36 -07:00
Benoit Steiner
3b76df64fc Defer the decision to vectorize tensor CUDA code to the meta kernel. This makes it possible to decide to vectorize or not depending on the capability of the target CUDA architecture. In particular, this enables us to vectorize the processing of fp16 when running on devices of compute capability >= 5.3 2016-04-12 10:58:51 -07:00
Benoit Steiner
748c4c4599 More accurate cost estimates for exp, log, tanh, and sqrt. 2016-04-11 13:11:04 -07:00
Benoit Steiner
833efb39bf Added epsilon, dummy_precision, infinity and quiet_NaN NumTraits for fp16 2016-04-11 11:03:56 -07:00
Benoit Steiner
e939b087fe Pulled latest update from trunk 2016-04-11 11:03:02 -07:00
Gael Guennebaud
1744b5b5d2 Update doc regarding the genericity of EIGEN_USE_BLAS 2016-04-11 17:16:07 +02:00
Gael Guennebaud
91bf925fc1 Improve constness of level2 blas API. 2016-04-11 17:13:01 +02:00
Gael Guennebaud
0483430283 Move LAPACK declarations from blas.h to lapack.h and fix compatibility with EIGEN_USE_MKL 2016-04-11 17:12:31 +02:00
Gael Guennebaud
097d1e8823 Cleanup obsolete assign_scalar_eig2mkl helper. 2016-04-11 16:09:29 +02:00
Gael Guennebaud
fec4c334ba Remove all references to MKL in BLAS wrappers. 2016-04-11 16:04:09 +02:00
Gael Guennebaud
ddabc992fa Fix long to int conversion in BLAS API. 2016-04-11 15:52:01 +02:00
Gael Guennebaud
8191f373be Silence unused warning. 2016-04-11 15:37:16 +02:00
Gael Guennebaud
6a9ca88e7e Relax dependency on MKL for EIGEN_USE_BLAS 2016-04-11 15:17:14 +02:00
Gael Guennebaud
4e8e5888d7 Improve constness of blas level-3 interface. 2016-04-11 15:12:44 +02:00
Gael Guennebaud
675e0a2224 Fix static/inline keywords order. 2016-04-11 15:06:20 +02:00
Gael Guennebaud
fc6a0ebb1c Typos in doc. 2016-04-11 10:54:58 +02:00
Till Hoffmann
643b697649 Proper handling of domain errors. 2016-04-10 00:37:53 +01:00
Till Hoffmann
7f4826890c Merge upstream 2016-04-09 20:08:07 +01:00
Till Hoffmann
de057ebe54 Added nans to zeta function. 2016-04-09 20:07:36 +01:00
Gael Guennebaud
af2161cdb4 bug #1197: fix/relax some LM unit tests 2016-04-09 11:14:02 +02:00
Gael Guennebaud
a05a683d83 bug #1160: fix and relax some LM unit tests by turning failures into warnings 2016-04-09 10:49:19 +02:00
Benoit Steiner
5da90fc8dd Use numext::abs instead of std::abs in scalar_fuzzy_default_impl to make it usable inside GPU kernels. 2016-04-08 19:40:48 -07:00
Benoit Steiner
01bd577288 Fixed the implementation of Eigen::numext::isfinite, Eigen::numext::isnan, and Eigen::numext::isinf on CUDA devices 2016-04-08 16:40:10 -07:00
Benoit Steiner
89a3dc35a3 Fixed isfinite_impl: NumTraits<T>::highest() and NumTraits<T>::lowest() are finite numbers. 2016-04-08 15:56:16 -07:00
Benoit Steiner
995f202cea Disabled the use of half2 on cuda devices of compute capability < 5.3 2016-04-08 14:43:36 -07:00
Benoit Steiner
8d22967bd9 Initial support for taking the power of fp16 2016-04-08 14:22:39 -07:00
Benoit Steiner
3394379319 Fixed the packet_traits for half floats. 2016-04-08 13:33:59 -07:00
Benoit Steiner
0d2a532fc3 Created the new EIGEN_TEST_CUDA_CLANG option to compile the CUDA tests using clang instead of nvcc 2016-04-08 13:16:08 -07:00
Benoit Steiner
2d072b38c1 Don't test the division by 0 on float16 when compiling with MSVC, since MSVC detects and errors out on divisions by 0. 2016-04-08 12:50:25 -07:00
Benoit Jacob
cd2b667ac8 Add references to filed LLVM bugs 2016-04-08 08:12:47 -04:00
Benoit Steiner
3bd16457e1 Properly handle complex numbers. 2016-04-07 23:28:04 -07:00
Benoit Steiner
63102ee43d Turn on the coeffWise benchmarks on fp16 2016-04-07 23:05:20 -07:00
Benoit Steiner
7c47d3e663 Fixed the type casting benchmarks for fp16 2016-04-07 22:50:25 -07:00
Benoit Steiner
2f2801f096 Merged in parthaEth/eigen (pull request PR-175)
Static-cast scalar types so that the Cholesky module of Eigen works with Ceres
2016-04-07 22:10:14 -07:00
Benoit Steiner
d962fe6a99 Renamed float16 into cxx11_float16 since the test relies on c++11 features 2016-04-07 20:28:32 -07:00
Benoit Steiner
7d5b17087f Added missing EIGEN_DEVICE_FUNC to the tensor conversion code. 2016-04-07 20:01:19 -07:00
Benoit Steiner
a6d08be9b2 Fixed the benchmarking of fp16 coefficient wise operations 2016-04-07 17:13:44 -07:00
parthaEth
2d5bb375b7 Static-cast scalar types so that the Cholesky module of Eigen works with Ceres 2016-04-08 00:14:44 +02:00
Benoit Steiner
a02ec09511 Worked around numerical noise in the test for the zeta function. 2016-04-07 12:11:02 -07:00
Benoit Steiner
c912b1d28c Fixed a typo in the polygamma test. 2016-04-07 11:51:07 -07:00
Benoit Steiner
74f64838c5 Updated the unary functors to use the numext implementation of typical functions instead of the one provided in the standard library. The standard library functions aren't officially supported by CUDA, so we're better off using the numext implementations. 2016-04-07 11:42:14 -07:00
Benoit Steiner
737644366f Move the functions operating on fp16 out of the std namespace and into the Eigen::numext namespace 2016-04-07 11:40:15 -07:00
Benoit Steiner
dc45aaeb93 Added tests for float16 2016-04-07 11:18:05 -07:00
Benoit Steiner
8db269e055 Fixed a typo in a test 2016-04-07 10:41:51 -07:00
Benoit Steiner
b89d3f78b2 Updated the isnan, isinf and isfinite functions to make them compatible with CUDA devices. 2016-04-07 10:08:49 -07:00
Benoit Steiner
48308ed801 Added support for isinf, isnan, and isfinite checks to the tensor api 2016-04-07 09:48:36 -07:00
Benoit Steiner
cfb34d808b Fixed a possible integer overflow. 2016-04-07 08:46:52 -07:00
Benoit Steiner
df838736e2 Fixed compilation warning triggered by MSVC 2016-04-06 20:48:55 -07:00
Benoit Steiner
14ea7c7ec7 Fixed packet_traits<half> 2016-04-06 19:30:21 -07:00