Benoit Steiner
aa33446dac
Improved support for vectorization of 16-bit floats
2016-06-09 08:22:27 -07:00
Benoit Steiner
d6d39c7ddb
Added missing EIGEN_DEVICE_FUNC
2016-06-07 14:35:08 -07:00
Christoph Hertzberg
db0118342c
Fixed compilation of BVH_Example (required for make doc)
2016-06-07 19:17:18 +02:00
Benoit Steiner
84b2060a9e
Fixed compilation error with gcc 4.4
2016-06-06 17:16:19 -07:00
Benoit Steiner
7ef9f47b58
Misc small improvements to the reduction code.
2016-06-06 14:09:46 -07:00
Benoit Steiner
9137f560f0
Moved assertions to the constructor to make the code more portable
2016-06-06 07:26:48 -07:00
Eugene Brevdo
39baff850c
Add TernaryFunctors and the betainc SpecialFunction.
...
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.
Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.
Added unit tests in array.cpp and cxx11_tensor_cuda.cu
Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe. and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Benoit Steiner
02db4e1a82
Disable the tensor tests when using msvc since older versions of the compiler fail to handle this code
2016-06-04 08:21:17 -07:00
Benoit Steiner
c21eaedce6
Use array_prod to compute the number of elements contained in the input tensor expression
2016-06-04 07:47:04 -07:00
Benoit Steiner
36a4500822
Merged in ibab/eigen (pull request PR-192)
...
Add generic scan method
2016-06-03 17:28:33 -07:00
Benoit Steiner
c2a102345f
Improved the performance of full reductions.
...
AFTER:
BM_fullReduction/10 4541 4543 154017 21.0M items/s
BM_fullReduction/64 5191 5193 100000 752.5M items/s
BM_fullReduction/512 9588 9588 71361 25.5G items/s
BM_fullReduction/4k 244314 244281 2863 64.0G items/s
BM_fullReduction/5k 359382 359363 1946 64.8G items/s
BEFORE:
BM_fullReduction/10 9085 9087 74395 10.5M items/s
BM_fullReduction/64 9478 9478 72014 412.1M items/s
BM_fullReduction/512 14643 14646 46902 16.7G items/s
BM_fullReduction/4k 260338 260384 2678 60.0G items/s
BM_fullReduction/5k 385076 385178 1818 60.5G items/s
2016-06-03 17:27:08 -07:00
Igor Babuschkin
dc03b8f3a1
Add generic scan method
2016-06-03 17:37:04 +01:00
Gael Guennebaud
e8b922ca63
Fix MatrixFunctions module.
2016-06-03 09:21:35 +02:00
Benoit Steiner
c3c8ad8046
Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing
2016-06-02 21:17:41 -07:00
Rasmus Munk Larsen
811aadbe00
Add syntactic sugar to Eigen tensors to allow more natural syntax.
...
Specifically, this enables expressions involving:
scalar + tensor
scalar * tensor
scalar / tensor
scalar - tensor
2016-06-02 12:41:28 -07:00
Igor Babuschkin
fbd7ed6ff7
Add tensor scan op
...
This is the initial implementation a generic scan operation.
Based on this, cumsum and cumprod method have been added to TensorBase.
2016-06-02 13:35:47 +01:00
Benoit Steiner
0ed08fd281
Use a single PacketSize variable
2016-06-01 21:19:05 -07:00
Benoit Steiner
8f6fedc55f
Fixed compilation warning
2016-06-01 21:14:46 -07:00
Benoit Steiner
c3cada38e2
Speedup a test
2016-06-01 21:13:00 -07:00
Benoit Steiner
873e6ac54b
Silenced compilation warning generated by nvcc.
2016-06-01 14:20:50 -07:00
Benoit Steiner
d27b0ad4c8
Added support for mean reductions on fp16
2016-06-01 11:12:07 -07:00
Benoit Steiner
5aeb3687c4
Only enable optimized reductions of fp16 if the reduction functor supports them
2016-05-31 10:33:40 -07:00
Benoit Steiner
e2946d962d
Reimplement clamp as a static function.
2016-05-27 12:58:43 -07:00
Benoit Steiner
e96d36d4cd
Use NULL instead of nullptr to preserve the compatibility with cxx03
2016-05-27 12:54:06 -07:00
Benoit Steiner
abc815798b
Added a new operation to enable more powerful tensorindexing.
2016-05-27 12:22:25 -07:00
Benoit Steiner
5707537592
Fixed option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr' warning generated by nvcc 7.5
2016-05-27 10:47:53 -07:00
Gael Guennebaud
22a035db95
Fix compilation when defaulting to row-major
2016-05-27 10:31:11 +02:00
Benoit Steiner
1ae2567861
Fixed some compilation warnings
2016-05-26 15:57:19 -07:00
Benoit Steiner
1a47844529
Preserve the ability to vectorize the evaluation of an expression even when it involves a cast that isn't vectorized (e.g fp16 to float)
2016-05-26 14:37:09 -07:00
Benoit Steiner
36369ab63c
Resolved merge conflicts
2016-05-26 13:39:39 -07:00
Benoit Steiner
28fcb5ca2a
Merged latest reduction improvements
2016-05-26 12:19:33 -07:00
Benoit Steiner
c1c7f06c35
Improved the performance of inner reductions.
2016-05-26 11:53:59 -07:00
Benoit Steiner
22d02c9855
Improved the coverage of the fp16 reduction tests
2016-05-26 11:12:16 -07:00
Benoit Steiner
8288b0aec2
Code cleanup.
2016-05-26 09:00:04 -07:00
Benoit Steiner
2d7ed54ba2
Made the static storage class qualifier come first.
2016-05-25 22:16:15 -07:00
Benoit Steiner
e1fca8866e
Deleted unnecessary explicit qualifiers.
2016-05-25 22:15:26 -07:00
Benoit Steiner
9b0aaf5113
Don't mark inline functions as static since it confuses the ICC compiler
2016-05-25 22:10:11 -07:00
Benoit Steiner
037a463fd5
Marked unused variables as such
2016-05-25 22:07:48 -07:00
Benoit Steiner
3ac4045272
Made the IndexPair code compile in non cxx11 mode
2016-05-25 15:15:12 -07:00
Benoit Steiner
66556d0e05
Made the index pair list code more portable accross various compilers
2016-05-25 14:34:27 -07:00
Benoit Steiner
034aa3b2c0
Improved the performance of tensor padding
2016-05-25 11:43:08 -07:00
Benoit Steiner
58026905ae
Added support for statically known lists of pairs of indices
2016-05-25 11:04:14 -07:00
Benoit Steiner
0835667329
There is no need to make the fp16 full reduction kernel a static function.
2016-05-24 23:11:56 -07:00
Benoit Steiner
b5d6b52a4d
Fixed compilation warning
2016-05-24 23:10:57 -07:00
Benoit Steiner
a09cbf9905
Merged in rmlarsen/eigen (pull request PR-188)
...
Minor cleanups: 1. Get rid of a few unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL.
2016-05-23 12:55:12 -07:00
Christoph Hertzberg
718521d5cf
Silenced several double-promotion warnings
2016-05-22 18:17:04 +02:00
Christoph Hertzberg
b5a7603822
fixed macro name
2016-05-22 16:49:29 +02:00
Christoph Hertzberg
25a03c02d6
Fix some sign-compare warnings
2016-05-22 16:42:27 +02:00
Gael Guennebaud
ccaace03c9
Make EIGEN_HAS_CONSTEXPR user configurable
2016-05-20 15:10:08 +02:00
Gael Guennebaud
c3410804cd
Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable
2016-05-20 15:05:38 +02:00