Eugene Brevdo
39baff850c
Add TernaryFunctors and the betainc SpecialFunction.
...
TernaryFunctors and their executors allow operations on 3-tuples of inputs.
API fully implemented for Arrays and Tensors based on binary functors.
Ported the cephes betainc function (regularized incomplete beta
integral) to Eigen, with support for CPU and GPU, floats, doubles, and
half types.
Added unit tests in array.cpp and cxx11_tensor_cuda.cu
Collapsed revision
* Merged helper methods for betainc across floats and doubles.
* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
* betainc: merge incbcf and incbd into incbeta_cfe, plus more cleanup.
* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
2016-06-02 17:04:19 -07:00
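As a rough illustration of what "operations on 3-tuples of inputs" means, the sketch below applies one callable coefficient-wise to three equally sized inputs. It uses plain std::vector rather than Eigen types, and the names `ternary_transform` and `fma_functor` are hypothetical, not part of Eigen's API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ternary functor: computes a * b + c for one coefficient.
struct fma_functor {
  double operator()(double a, double b, double c) const { return a * b + c; }
};

// Applies a ternary functor element-wise to three inputs of equal size.
// This is a standalone sketch, not Eigen's CwiseTernaryOp.
template <typename Functor>
std::vector<double> ternary_transform(const std::vector<double>& a,
                                      const std::vector<double>& b,
                                      const std::vector<double>& c,
                                      Functor f) {
  assert(a.size() == b.size() && b.size() == c.size());
  std::vector<double> out(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[i] = f(a[i], b[i], c[i]);
  }
  return out;
}
```

A function such as betainc(a, b, x) fits this shape: three inputs per coefficient, one output.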
Benoit Steiner
02db4e1a82
Disable the tensor tests when using MSVC, since older versions of the compiler fail to handle this code
2016-06-04 08:21:17 -07:00
Benoit Steiner
c21eaedce6
Use array_prod to compute the number of elements contained in the input tensor expression
2016-06-04 07:47:04 -07:00
Benoit Steiner
36a4500822
Merged in ibab/eigen (pull request PR-192)
...
Add generic scan method
2016-06-03 17:28:33 -07:00
Benoit Steiner
c2a102345f
Improved the performance of full reductions.
...
AFTER:
BM_fullReduction/10 4541 4543 154017 21.0M items/s
BM_fullReduction/64 5191 5193 100000 752.5M items/s
BM_fullReduction/512 9588 9588 71361 25.5G items/s
BM_fullReduction/4k 244314 244281 2863 64.0G items/s
BM_fullReduction/5k 359382 359363 1946 64.8G items/s
BEFORE:
BM_fullReduction/10 9085 9087 74395 10.5M items/s
BM_fullReduction/64 9478 9478 72014 412.1M items/s
BM_fullReduction/512 14643 14646 46902 16.7G items/s
BM_fullReduction/4k 260338 260384 2678 60.0G items/s
BM_fullReduction/5k 385076 385178 1818 60.5G items/s
2016-06-03 17:27:08 -07:00
Igor Babuschkin
dc03b8f3a1
Add generic scan method
2016-06-03 17:37:04 +01:00
Gael Guennebaud
8d97ba6b22
bug #725 : make move ctor/assignment noexcept.
2016-06-03 14:28:25 +02:00
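One reason noexcept matters on move operations: std::vector only moves elements during reallocation when the move constructor cannot throw, and otherwise falls back to copying. The sketch below shows this with a hypothetical `Tracked` type (not an Eigen class) that counts copies.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical type that counts how often it is copied.
struct Tracked {
  static int copies;
  Tracked() {}
  Tracked(const Tracked&) { ++copies; }
  Tracked(Tracked&&) noexcept {}
  Tracked& operator=(const Tracked&) { ++copies; return *this; }
  Tracked& operator=(Tracked&&) noexcept { return *this; }
};
int Tracked::copies = 0;

static_assert(std::is_nothrow_move_constructible<Tracked>::value,
              "move constructor must be noexcept for vector to use it");

// Grows a vector through several reallocations and reports the number of
// copies made. With a noexcept move constructor, elements are moved.
int copies_during_growth() {
  Tracked::copies = 0;
  std::vector<Tracked> v;
  for (int i = 0; i < 100; ++i) v.emplace_back();
  assert(v.size() == 100);
  return Tracked::copies;
}
```

Dropping noexcept from the move constructor makes the same loop copy elements on every reallocation.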
Gael Guennebaud
e8b922ca63
Fix MatrixFunctions module.
2016-06-03 09:21:35 +02:00
Gael Guennebaud
82293f38d6
Fix unit test.
2016-06-03 08:12:14 +02:00
Gael Guennebaud
fe62c06d9b
Fix compilation.
2016-06-03 07:47:38 +02:00
Gael Guennebaud
969b8959a0
Fix compilation: Matrix does not indirectly live in the internal namespace anymore!
2016-06-03 07:44:58 +02:00
Gael Guennebaud
f2c2465acc
Fix function dependencies
2016-06-03 07:44:18 +02:00
Benoit Steiner
c3c8ad8046
Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing
2016-06-02 21:17:41 -07:00
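The trade-off between padding and alignment can be sketched as follows. The struct layouts, field names, and the 64-byte cache line size are assumptions for illustration and do not reproduce the actual Waiter definition.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kCacheLine = 64;  // assumed cache line size

// Padding approach: explicit filler on both sides of the hot fields.
struct PaddedWaiter {
  char pad_front[kCacheLine];
  std::atomic<unsigned> state;
  unsigned epoch;
  char pad_back[kCacheLine];
};

// Alignment approach: aligning the first member aligns the whole struct,
// so each instance starts on its own cache line without explicit filler.
struct AlignedWaiter {
  alignas(kCacheLine) std::atomic<unsigned> state;
  unsigned epoch;
};

static_assert(alignof(AlignedWaiter) == kCacheLine, "struct inherits alignment");
static_assert(sizeof(AlignedWaiter) < sizeof(PaddedWaiter), "smaller footprint");

// Returns true if the object begins on a cache line boundary.
inline bool starts_on_cache_line(const AlignedWaiter* w) {
  return reinterpret_cast<std::uintptr_t>(w) % kCacheLine == 0;
}
```

In an array of AlignedWaiter, consecutive elements sit on distinct cache lines, which is what prevents false sharing.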
Gael Guennebaud
5b77481d58
merge
2016-06-02 22:21:45 +02:00
Gael Guennebaud
53feb73b45
Remove dead code.
2016-06-02 22:19:55 +02:00
Gael Guennebaud
2c00ac0b53
Implement generic scalar*expr and expr*scalar operator based on scalar_product_traits.
...
This is especially useful for custom scalar types, e.g., to enable float*expr<multi_prec> without conversion.
2016-06-02 22:16:37 +02:00
Rasmus Munk Larsen
811aadbe00
Add syntactic sugar to Eigen tensors to allow more natural syntax.
...
Specifically, this enables expressions involving:
scalar + tensor
scalar * tensor
scalar / tensor
scalar - tensor
2016-06-02 12:41:28 -07:00
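The mechanism behind scalar-on-the-left expressions is a set of free operator overloads whose first argument is the scalar. A minimal standalone sketch, using a hypothetical 1-D `Vec` type in place of a tensor:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1-D stand-in for a tensor; not an Eigen type.
struct Vec {
  std::vector<double> v;
};

// Shared helper: applies op(scalar, coefficient) to every coefficient.
template <typename Op>
Vec scalar_left(double s, const Vec& t, Op op) {
  Vec r;
  r.v.reserve(t.v.size());
  for (std::size_t i = 0; i < t.v.size(); ++i) r.v.push_back(op(s, t.v[i]));
  assert(r.v.size() == t.v.size());
  return r;
}

// Free operators with the scalar on the left-hand side.
inline Vec operator+(double s, const Vec& t) {
  return scalar_left(s, t, [](double a, double b) { return a + b; });
}
inline Vec operator*(double s, const Vec& t) {
  return scalar_left(s, t, [](double a, double b) { return a * b; });
}
inline Vec operator/(double s, const Vec& t) {
  return scalar_left(s, t, [](double a, double b) { return a / b; });
}
inline Vec operator-(double s, const Vec& t) {
  return scalar_left(s, t, [](double a, double b) { return a - b; });
}
```

Subtraction and division are not commutative, so they cannot be rewritten as tensor-on-the-left calls with the operands swapped; they need their own scalar-first form.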
Benoit Steiner
6021c90fdf
Merged in ibab/eigen (pull request PR-189)
...
Add scan op to Tensor module
2016-06-02 08:08:11 -07:00
Gael Guennebaud
8b6f53222b
bug #1193 : fix lpNorm<Infinity> for empty input.
2016-06-02 15:29:59 +02:00
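The edge case can be illustrated with a standalone sketch: the infinity norm is the largest absolute coefficient, and a maximum over no coefficients is undefined unless the empty case is handled. The function below assumes the convention that an empty input has norm 0; `linf_norm` is a hypothetical name, not Eigen's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Infinity norm of a vector: max_i |x_i|.
// Starting the running maximum at 0 makes the empty input return 0
// instead of taking a maximum over an empty range.
double linf_norm(const std::vector<double>& x) {
  double m = 0.0;
  for (double c : x) m = std::max(m, std::fabs(c));
  assert(m >= 0.0);
  return m;
}
```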
Gael Guennebaud
d616a81294
Disable MSVC's "decorated name length exceeded, name was truncated" warning in unit tests.
2016-06-02 14:48:38 +02:00
Gael Guennebaud
61a32f2a4c
Fix pointer to long conversion warning.
2016-06-02 14:45:45 +02:00
Igor Babuschkin
fbd7ed6ff7
Add tensor scan op
...
This is the initial implementation of a generic scan operation.
Based on this, the cumsum and cumprod methods have been added to TensorBase.
2016-06-02 13:35:47 +01:00
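For readers unfamiliar with scans, here is a minimal standalone sketch of how cumsum and cumprod fall out of one generic inclusive scan. It works on std::vector along a single dimension; the function names are chosen for illustration and the code is not the Tensor module's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Generic inclusive scan: out[i] = in[0] op in[1] op ... op in[i].
template <typename T, typename Op>
std::vector<T> inclusive_scan_sketch(const std::vector<T>& in, Op op) {
  std::vector<T> out;
  out.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    out.push_back(i == 0 ? in[0] : op(out[i - 1], in[i]));
  }
  assert(out.size() == in.size());
  return out;
}

// Cumulative sum: scan with addition.
template <typename T>
std::vector<T> cumsum(const std::vector<T>& in) {
  return inclusive_scan_sketch(in, std::plus<T>());
}

// Cumulative product: scan with multiplication.
template <typename T>
std::vector<T> cumprod(const std::vector<T>& in) {
  return inclusive_scan_sketch(in, std::multiplies<T>());
}
```

Only the binary operation differs between the two, which is why a single generic scan is enough to support both.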
Benoit Steiner
0ed08fd281
Use a single PacketSize variable
2016-06-01 21:19:05 -07:00
Benoit Steiner
8f6fedc55f
Fixed compilation warning
2016-06-01 21:14:46 -07:00
Benoit Steiner
c3cada38e2
Speed up a test
2016-06-01 21:13:00 -07:00
Gael Guennebaud
360e311b66
Doc: add some cross references (also fix empty macro argument warning)
2016-06-01 23:34:09 +02:00
Benoit Steiner
873e6ac54b
Silenced compilation warning generated by nvcc.
2016-06-01 14:20:50 -07:00
Benoit Steiner
d27b0ad4c8
Added support for mean reductions on fp16
2016-06-01 11:12:07 -07:00
Gael Guennebaud
cd221a62ee
Doc: start of a table summarizing coefficient-wise math functions.
2016-06-01 17:09:48 +02:00
Gael Guennebaud
3c69afca4c
Add missing ArrayBase::log1p
2016-06-01 17:08:47 +02:00
Gael Guennebaud
89099b0cf7
Expose log1p to Array.
2016-06-01 17:00:08 +02:00
Gael Guennebaud
afd33539dd
Doc: make the global unary math functions visible to Doxygen (and document them)
2016-06-01 15:27:13 +02:00
Gael Guennebaud
77e652d8ad
Doc: improve documentation of Map<SparseMatrix>
2016-06-01 10:03:32 +02:00
Gael Guennebaud
da4970ead2
Doc: disable inlining of inherited members, workaround Doxygen's limited C++ parsing abilities, and improve doc of MapBase.
2016-06-01 09:38:49 +02:00
Benoit Steiner
099b354ca7
Pulled latest updates from trunk
2016-05-31 10:34:16 -07:00
Benoit Steiner
5aeb3687c4
Only enable optimized reductions of fp16 if the reduction functor supports them
2016-05-31 10:33:40 -07:00
Benoit Steiner
b6e306f189
Improved support for CUDA 8.0
2016-05-31 09:47:59 -07:00
Gael Guennebaud
1d3b253329
bug #1181 : help MSVC inlining.
2016-05-31 17:23:42 +02:00
Gael Guennebaud
d79eee05ef
Fix compilation with old icc
2016-05-31 17:13:51 +02:00
Gael Guennebaud
2c1b56f4c1
bug #1238 : fix SparseMatrix::sum() overload for un-compressed mode.
2016-05-31 10:56:53 +02:00
Benoit Steiner
c4bd3b1f21
Silenced some compilation warnings triggered by nvcc 8.0
2016-05-27 14:40:49 -07:00
Benoit Steiner
e2946d962d
Reimplement clamp as a static function.
2016-05-27 12:58:43 -07:00
Benoit Steiner
e96d36d4cd
Use NULL instead of nullptr to preserve compatibility with C++03
2016-05-27 12:54:06 -07:00
Benoit Steiner
abc815798b
Added a new operation to enable more powerful tensor indexing.
2016-05-27 12:22:25 -07:00
Benoit Steiner
5707537592
Fixed the "option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'" warning generated by nvcc 7.5
2016-05-27 10:47:53 -07:00
Benoit Steiner
3a5d6a3c38
Disable the use of MMX instructions since the code is broken on many platforms
2016-05-27 09:13:26 -07:00
Christoph Hertzberg
f2c86384f4
Cleaner implementation of dont_over_optimize.
2016-05-27 11:13:38 +02:00
Gael Guennebaud
22a035db95
Fix compilation when defaulting to row-major
2016-05-27 10:31:11 +02:00
Gael Guennebaud
e0cb73b46b
Fix compilation with old ICC version (use C99 types instead of C++11 ones)
2016-05-27 10:28:09 +02:00
Benoit Steiner
1ae2567861
Fixed some compilation warnings
2016-05-26 15:57:19 -07:00