Gael Guennebaud
360a743a10
bug #1241: does not emit anything for empty tensors
2016-06-23 18:47:31 +02:00
Gael Guennebaud
7c6561485a
merge PR 194
2016-06-23 15:29:57 +02:00
Benoit Steiner
a29a2cb4ff
Silenced a couple of compilation warnings generated by xcode
2016-06-22 16:43:02 -07:00
Benoit Steiner
f8fcd6b32d
Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers
2016-06-22 16:03:11 -07:00
Benoit Steiner
c58df31747
Handle empty tensors in the print functions
2016-06-21 09:22:43 -07:00
Benoit Steiner
de32f8d656
Fixed the printing of rank-0 tensors
2016-06-20 10:46:45 -07:00
Benoit Steiner
7d495d890a
Merged in ibab/eigen (pull request PR-197)
...
Implement exclusive scan option for Tensor library
2016-06-14 17:54:59 -07:00
Benoit Steiner
aedc5be1d6
Avoid generating pseudo-random numbers that are multiples of 5: this helps
...
spread the load over multiple CPUs without having to rely on work stealing.
2016-06-14 17:51:47 -07:00
Igor Babuschkin
c4d10e921f
Implement exclusive scan option
2016-06-14 19:44:07 +01:00
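For readers unfamiliar with the term, the difference between an inclusive and an exclusive scan can be sketched in plain C++ as follows. This is an illustration only: `scan_sum` and `scan_sum_example` are hypothetical names, the code works on a `std::vector` rather than on tensors, and it does not reproduce the Tensor library's implementation or API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Eigen): running sum over `in`.
// exclusive == false: out[i] is the sum of in[0..i]   (inclusive scan).
// exclusive == true:  out[i] is the sum of in[0..i-1] (exclusive scan),
// so out[0] is the identity element of addition, 0.
std::vector<int> scan_sum(const std::vector<int>& in, bool exclusive) {
  std::vector<int> out(in.size());
  int accum = 0;  // identity element for addition
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (exclusive) {
      out[i] = accum;   // write first, then accumulate
      accum += in[i];
    } else {
      accum += in[i];   // accumulate first, then write
      out[i] = accum;
    }
  }
  return out;
}

// Small self-check showing both modes on the same input.
void scan_sum_example() {
  const std::vector<int> v = {1, 2, 3, 4};
  assert((scan_sum(v, false) == std::vector<int>{1, 3, 6, 10}));
  assert((scan_sum(v, true) == std::vector<int>{0, 1, 3, 6}));
}
```

The exclusive result is the inclusive result shifted by one position, with the identity element placed in the first slot.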
Gael Guennebaud
76236cdea4
merge
2016-06-14 15:33:47 +02:00
Gael Guennebaud
62134082aa
Update AutoDiffScalar with respect to scalar-multiple.
2016-06-14 15:06:35 +02:00
Gael Guennebaud
5d38203735
Update Tensor module to use bind1st_op and bind2nd_op
2016-06-14 15:06:03 +02:00
Gael Guennebaud
f925dba3d9
Fix compilation of BVH example
2016-06-14 11:32:09 +02:00
Gael Guennebaud
3c12e24164
Add bind1st_op and bind2nd_op helpers to turn binary functors into unary ones, and implement scalar_multiple2 and scalar_quotient2 on top of them.
2016-06-13 16:18:59 +02:00
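The idea of turning a binary functor into a unary one by fixing one argument can be sketched in standalone C++ as below. `Bind1st`, `Bind2nd`, and `bind_example` are hypothetical stand-ins written for this illustration; they use functors from the standard library and are not the Eigen helpers themselves, whose exact signatures are not shown in this log.

```cpp
#include <cassert>
#include <functional>

// Hypothetical sketch: fixes the FIRST argument of a binary functor,
// so calling it with x evaluates op(value, x).
template <typename BinaryOp, typename T>
struct Bind1st {
  explicit Bind1st(const T& value, const BinaryOp& op = BinaryOp())
      : op_(op), value_(value) {}
  T operator()(const T& x) const { return op_(value_, x); }
  BinaryOp op_;
  T value_;
};

// Hypothetical sketch: fixes the SECOND argument of a binary functor,
// so calling it with x evaluates op(x, value).
template <typename BinaryOp, typename T>
struct Bind2nd {
  explicit Bind2nd(const T& value, const BinaryOp& op = BinaryOp())
      : op_(op), value_(value) {}
  T operator()(const T& x) const { return op_(x, value_); }
  BinaryOp op_;
  T value_;
};

void bind_example() {
  // "Scalar multiple": x -> x * 3, built from a generic binary product.
  Bind2nd<std::multiplies<double>, double> triple(3.0);
  assert(triple(2.0) == 6.0);

  // "Scalar quotient": x -> x / 2, built from a generic binary quotient.
  Bind2nd<std::divides<double>, double> halve(2.0);
  assert(halve(4.0) == 2.0);

  // Binding the first argument instead gives x -> 2 / x.
  Bind1st<std::divides<double>, double> two_over(2.0);
  assert(two_over(4.0) == 0.5);
}
```

Which argument is bound only matters for non-commutative operations such as division and subtraction, which is why the sketch keeps two separate adaptors.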
Benoit Steiner
65d33e5898
Merged in ibab/eigen (pull request PR-195)
...
Add small fixes to TensorScanOp
2016-06-10 19:31:17 -07:00
Benoit Steiner
a05607875a
Don't refer to the half2 type unless it's been defined
2016-06-10 11:53:56 -07:00
Igor Babuschkin
86aedc9282
Add small fixes to TensorScanOp
2016-06-07 20:06:38 +01:00
Christoph Hertzberg
db0118342c
Fixed compilation of BVH_Example (required for make doc)
2016-06-07 19:17:18 +02:00
Benoit Steiner
84b2060a9e
Fixed compilation error with gcc 4.4
2016-06-06 17:16:19 -07:00
Benoit Steiner
7ef9f47b58
Misc small improvements to the reduction code.
2016-06-06 14:09:46 -07:00
Benoit Steiner
9137f560f0
Moved assertions to the constructor to make the code more portable
2016-06-06 07:26:48 -07:00
Gael Guennebaud
66e99ab6a1
Relax mixing-type constraints for binary coefficient-wise operators:
...
- Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP>
- Remove the "functor_is_product_like" helper (was pretty ugly)
- Currently, OP is not used, but it is available to the user for fine grained tuning
- Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-=
- TODO: generalize all other binary operators (comparisons,pow,etc.)
- TODO: handle "scalar op array" operators (currently only * is handled)
- TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits
2016-06-06 15:11:41 +02:00
Benoit Steiner
37638dafd7
Simplified the code that dispatches vectorized reductions on GPU
2016-06-09 10:29:52 -07:00
Benoit Steiner
66796e843d
Fixed definition of some of the reducer_traits
2016-06-09 08:50:01 -07:00
Benoit Steiner
14a112ee15
Use signed integers more consistently to encode the number of threads to use to evaluate a tensor expression.
2016-06-09 08:25:22 -07:00
Benoit Steiner
8f92c26319
Improved code formatting
2016-06-09 08:23:42 -07:00
Benoit Steiner
aa33446dac
Improved support for vectorization of 16-bit floats
2016-06-09 08:22:27 -07:00
Benoit Steiner
d6d39c7ddb
Added missing EIGEN_DEVICE_FUNC
2016-06-07 14:35:08 -07:00
Eugene Brevdo
39baff850c
Add TernaryFunctors and the betainc SpecialFunction.
...
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.
Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.
Added unit tests in array.cpp and cxx11_tensor_cuda.cu
Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe, and more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
Benoit Steiner
02db4e1a82
Disable the tensor tests when using msvc since older versions of the compiler fail to handle this code
2016-06-04 08:21:17 -07:00
Benoit Steiner
c21eaedce6
Use array_prod to compute the number of elements contained in the input tensor expression
2016-06-04 07:47:04 -07:00
Benoit Steiner
36a4500822
Merged in ibab/eigen (pull request PR-192)
...
Add generic scan method
2016-06-03 17:28:33 -07:00
Benoit Steiner
c2a102345f
Improved the performance of full reductions.
...
AFTER:
BM_fullReduction/10        4541     4543   154017    21.0M items/s
BM_fullReduction/64        5191     5193   100000   752.5M items/s
BM_fullReduction/512       9588     9588    71361    25.5G items/s
BM_fullReduction/4k      244314   244281     2863    64.0G items/s
BM_fullReduction/5k      359382   359363     1946    64.8G items/s
BEFORE:
BM_fullReduction/10        9085     9087    74395    10.5M items/s
BM_fullReduction/64        9478     9478    72014   412.1M items/s
BM_fullReduction/512      14643    14646    46902    16.7G items/s
BM_fullReduction/4k      260338   260384     2678    60.0G items/s
BM_fullReduction/5k      385076   385178     1818    60.5G items/s
2016-06-03 17:27:08 -07:00
Igor Babuschkin
dc03b8f3a1
Add generic scan method
2016-06-03 17:37:04 +01:00
Gael Guennebaud
e8b922ca63
Fix MatrixFunctions module.
2016-06-03 09:21:35 +02:00
Benoit Steiner
c3c8ad8046
Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing
2016-06-02 21:17:41 -07:00
Rasmus Munk Larsen
811aadbe00
Add syntactic sugar to Eigen tensors to allow more natural syntax.
...
Specifically, this enables expressions involving:
scalar + tensor
scalar * tensor
scalar / tensor
scalar - tensor
2016-06-02 12:41:28 -07:00
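As a rough illustration of what "scalar op tensor" syntax involves, the sketch below adds free operators with the scalar on the left-hand side to a toy container. `TinyTensor`, `map_elements`, and `scalar_sugar_example` are hypothetical names invented for this example; the code evaluates eagerly on a `std::vector<double>` and makes no claim about how the Tensor module implements these operators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy container standing in for a tensor.
struct TinyTensor {
  std::vector<double> data;
};

// Applies f to every element and returns the result as a new TinyTensor.
template <typename F>
TinyTensor map_elements(const TinyTensor& t, F f) {
  TinyTensor out;
  out.data.reserve(t.data.size());
  for (std::size_t i = 0; i < t.data.size(); ++i) {
    out.data.push_back(f(t.data[i]));
  }
  return out;
}

// Free operators with the scalar on the LEFT. A member operator cannot
// cover this case because the left operand is a built-in type.
inline TinyTensor operator+(double s, const TinyTensor& t) {
  return map_elements(t, [s](double x) { return s + x; });
}
inline TinyTensor operator*(double s, const TinyTensor& t) {
  return map_elements(t, [s](double x) { return s * x; });
}
inline TinyTensor operator-(double s, const TinyTensor& t) {
  return map_elements(t, [s](double x) { return s - x; });
}
inline TinyTensor operator/(double s, const TinyTensor& t) {
  return map_elements(t, [s](double x) { return s / x; });
}

void scalar_sugar_example() {
  const TinyTensor t{{1.0, 2.0, 4.0}};
  assert(((1.0 + t).data == std::vector<double>{2.0, 3.0, 5.0}));
  assert(((2.0 * t).data == std::vector<double>{2.0, 4.0, 8.0}));
  assert(((1.0 - t).data == std::vector<double>{0.0, -1.0, -3.0}));
  assert(((8.0 / t).data == std::vector<double>{8.0, 4.0, 2.0}));
}
```

Subtraction and division are the cases where operand order matters: `1.0 - t` is not the same as `t - 1.0`, so they need their own left-scalar overloads rather than a swap of arguments.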
Igor Babuschkin
fbd7ed6ff7
Add tensor scan op
...
This is the initial implementation of a generic scan operation.
Based on this, cumsum and cumprod methods have been added to TensorBase.
2016-06-02 13:35:47 +01:00
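The relationship described above, where cumulative sum and cumulative product are both instances of one generic scan, can be sketched in plain C++. `generic_scan`, `cumsum_1d`, `cumprod_1d`, and `generic_scan_example` are hypothetical names for this illustration; the sketch is one-dimensional and sequential, and is not the TensorBase implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical generic inclusive scan: out[0] = in[0] and
// out[i] = op(out[i - 1], in[i]) for i > 0.
template <typename T, typename BinaryOp>
std::vector<T> generic_scan(const std::vector<T>& in, BinaryOp op) {
  std::vector<T> out;
  out.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (i == 0) {
      out.push_back(in[0]);
    } else {
      const T next = op(out[i - 1], in[i]);  // combine running result with in[i]
      out.push_back(next);
    }
  }
  return out;
}

// Cumulative sum: the generic scan specialised with addition.
template <typename T>
std::vector<T> cumsum_1d(const std::vector<T>& in) {
  return generic_scan(in, std::plus<T>());
}

// Cumulative product: the generic scan specialised with multiplication.
template <typename T>
std::vector<T> cumprod_1d(const std::vector<T>& in) {
  return generic_scan(in, std::multiplies<T>());
}

void generic_scan_example() {
  const std::vector<int> v = {1, 2, 3, 4};
  assert((cumsum_1d(v) == std::vector<int>{1, 3, 6, 10}));
  assert((cumprod_1d(v) == std::vector<int>{1, 2, 6, 24}));
}
```

Any other associative binary operation, such as a running maximum, can be plugged into the same `generic_scan` skeleton without changing the loop.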
Benoit Steiner
0ed08fd281
Use a single PacketSize variable
2016-06-01 21:19:05 -07:00
Benoit Steiner
8f6fedc55f
Fixed compilation warning
2016-06-01 21:14:46 -07:00
Benoit Steiner
c3cada38e2
Speed up a test
2016-06-01 21:13:00 -07:00
Benoit Steiner
873e6ac54b
Silenced compilation warning generated by nvcc.
2016-06-01 14:20:50 -07:00
Benoit Steiner
d27b0ad4c8
Added support for mean reductions on fp16
2016-06-01 11:12:07 -07:00
Benoit Steiner
5aeb3687c4
Only enable optimized reductions of fp16 if the reduction functor supports them
2016-05-31 10:33:40 -07:00
Benoit Steiner
e2946d962d
Reimplement clamp as a static function.
2016-05-27 12:58:43 -07:00
Benoit Steiner
e96d36d4cd
Use NULL instead of nullptr to preserve the compatibility with cxx03
2016-05-27 12:54:06 -07:00
Benoit Steiner
abc815798b
Added a new operation to enable more powerful tensor indexing.
2016-05-27 12:22:25 -07:00
Benoit Steiner
5707537592
Fixed the warning "option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'" generated by nvcc 7.5
2016-05-27 10:47:53 -07:00
Gael Guennebaud
22a035db95
Fix compilation when defaulting to row-major
2016-05-27 10:31:11 +02:00
Benoit Steiner
1ae2567861
Fixed some compilation warnings
2016-05-26 15:57:19 -07:00