Benoit Steiner
e892524efe
Added support for multi gpu configuration to the GpuDevice class
2015-07-15 12:38:34 -07:00
Benoit Steiner
b80036abec
Enabled the construction of a fixed sized tensor directly from an expression.
2015-07-13 11:16:37 -07:00
Benoit Steiner
3912ca0d53
Fixed a bug in the integer division code that caused some large numerators to be incorrectly handled
2015-07-13 11:14:59 -07:00
Benoit Steiner
e6297741c9
Added support for generation of random complex numbers on CUDA devices
2015-07-07 17:40:49 -07:00
Benoit Steiner
6de6fa9483
Use NumTraits<T>::RequireInitialization instead of internal::is_arithmetic<T>::value to check whether it's possible to bypass the type constructor in the tensor code.
2015-07-07 15:23:56 -07:00
Benoit Steiner
a93af65938
Improved and cleaned up the 2d patch extraction code
2015-07-07 08:52:14 -07:00
Benoit Steiner
3f2101b03b
Use numext::swap instead of std::swap
2015-07-06 17:02:29 -07:00
Benoit Steiner
0485a2468d
use Eigen smart_copy instead of std::copy
2015-07-06 17:01:51 -07:00
Benoit Steiner
ebdacfc5ea
Fixed a compilation warning generated by clang
2015-07-06 15:03:11 -07:00
Benoit Steiner
81f9e968fd
Only attempt to use the texture path on GPUs when it's supported by CUDA
2015-07-06 13:32:38 -07:00
Benoit Steiner
864318e508
Misc small fixes to the tensor slicing code.
2015-07-06 11:45:56 -07:00
Benoit Steiner
8f1d547c92
Added a default value for the cuda stream in the GpuDevice constructor
2015-07-01 18:32:18 -07:00
Benoit Steiner
1e911b276c
Misc improvements and optimizations
2015-07-01 13:59:11 -07:00
Benoit Steiner
4ed213f97b
Improved a previous fix
2015-07-01 13:06:30 -07:00
Benoit Steiner
56e155dd60
Fixed a couple of mistakes in the previous commit.
2015-07-01 12:40:27 -07:00
Benoit Steiner
925d0d375a
Enabled the vectorized evaluation of several tensor expressions that was previously disabled by mistake
2015-07-01 11:32:04 -07:00
Benoit Steiner
6021b68d8b
Silenced a compilation warning
2015-06-30 15:42:25 -07:00
Benoit Steiner
f1f480b116
Added support for user defined custom tensor op.
2015-06-30 15:36:29 -07:00
Benoit Steiner
dc31fcb9ba
Added support for 3D patch extraction
2015-06-30 14:48:26 -07:00
Benoit Steiner
f587075987
Made ThreadPoolDevice inherit from a new pure abstract ThreadPoolInterface class: this enables users to leverage their existing threadpool when using eigen tensors.
2015-06-30 14:21:24 -07:00
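The idea behind the commit above, a device that talks to an abstract pool interface so that an application can plug in its own executor, can be sketched as follows. This is a minimal illustration under stated assumptions, not the Eigen API: the names PoolInterface, InlineExecutor, InlineExecutorAdapter and SumWithPool are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the pure abstract interface described in the
// commit message. It is NOT the actual Eigen class.
class PoolInterface {
 public:
  virtual ~PoolInterface() = default;
  // Submit a unit of work for execution.
  virtual void Schedule(std::function<void()> fn) = 0;
};

// Hypothetical stand-in for an executor the application already owns.
// It runs every task immediately on the calling thread.
class InlineExecutor {
 public:
  void Run(std::function<void()> fn) {
    fn();
    ++tasks_run_;
  }
  int tasks_run() const { return tasks_run_; }

 private:
  int tasks_run_ = 0;
};

// Adapter that exposes the existing executor through the abstract interface.
class InlineExecutorAdapter : public PoolInterface {
 public:
  explicit InlineExecutorAdapter(InlineExecutor* executor)
      : executor_(executor) {}

  void Schedule(std::function<void()> fn) override {
    executor_->Run(std::move(fn));
  }

 private:
  InlineExecutor* executor_;  // not owned
};

// Library-side code depends only on the interface. Returning the total right
// after scheduling is valid here only because the demo executor is synchronous.
int SumWithPool(PoolInterface& pool, const std::vector<int>& values) {
  int total = 0;
  for (int v : values) {
    pool.Schedule([&total, v] { total += v; });
  }
  return total;
}
```

An asynchronous executor would need a completion signal before the result is read; the sketch leaves that out to keep the interface itself in focus.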
Benoit Steiner
28b36632ec
Turned Eigen::array::size into a function to make the code compatible with std::array
2015-06-30 13:23:05 -07:00
Benoit Steiner
109005c6c9
Added a test for multithreaded full reductions
2015-06-30 13:08:12 -07:00
Benoit Steiner
a4aa7c6217
Fixed a few compilation warnings
2015-06-30 10:36:17 -07:00
Benoit Steiner
7d41e97fa9
Silenced a number of compilation warnings
2015-06-29 14:47:40 -07:00
Benoit Steiner
db9dbbda32
Improved performance of full reduction by 2 orders of magnitude on CPU and 3 orders of magnitude on GPU
2015-06-29 14:06:32 -07:00
Benoit Steiner
f0ce85b757
Improved support for fixed size tensors
2015-06-29 14:04:15 -07:00
Benoit Steiner
670c71d906
Express the full reduction operations (such as sum, max, min) using TensorDimensionList
2015-06-29 11:30:36 -07:00
Benoit Steiner
d8098ee7d5
Added support for tanh function to the tensor code
2015-06-29 11:14:42 -07:00
Benoit Steiner
3625734bc8
Moved some utilities to TensorMeta.h to make it easier to reuse them across several tensor operations.
...
Created the TensorDimensionList class to encode the list of all the dimensions of a tensor of rank n. This could be done using TensorIndexList, however TensorIndexList requires cxx11, which isn't yet supported as widely as we'd like.
2015-06-29 10:49:55 -07:00
Gael Guennebaud
84aaef93ba
Merged in vanhoucke/eigen_vanhoucke (pull request PR-118)
...
Fix two small undefined behaviors caught by static analysis.
2015-06-20 13:56:48 +02:00
Gael Guennebaud
846b227bb7
Get rid of class internal::nested<> (still have to update the Tensor module)
2015-06-19 17:56:39 +02:00
vanhoucke
4cc0c961f3
Fix undefined behavior.
2015-06-19 15:46:46 +00:00
Benoit Steiner
ab5db86fe9
Fixed merge conflict
2015-06-16 19:52:20 -07:00
Benoit Steiner
ea160a898c
Pulled latest updates from trunk
2015-06-16 19:46:23 -07:00
Benoit Steiner
367794e668
Fixed compilation warnings triggered by clang
2015-06-16 19:43:49 -07:00
Gael Guennebaud
9ab8ac5c8b
Fix compilation in TensorImagePatch
2015-06-16 14:50:08 +02:00
Gael Guennebaud
38874b1651
Fix shadow warnings in Tensor module
2015-06-16 14:43:46 +02:00
Benoit Steiner
ea1190486f
Fixed a compilation error triggered by nvcc 7
2015-05-28 11:57:51 -07:00
Benoit Steiner
0e5fed74e7
Worked around some constexpr related bugs in nvcc 7
2015-05-28 10:14:38 -07:00
Benoit Steiner
f13b3d4433
Added missing include files
2015-05-28 07:57:28 -07:00
Benoit Steiner
abec18bae0
Fixed potential compilation error
2015-05-26 10:11:15 -07:00
Benoit Steiner
9df186c140
Added a few more missing EIGEN_DEVICE_FUNC statements
2015-05-26 09:47:48 -07:00
Benoit Steiner
466bcc589e
Added a few missing EIGEN_DEVICE_FUNC statements
2015-05-26 09:37:23 -07:00
Benoit Steiner
6b800744ce
Moved away from std::async and std::future as the underlying mechanism for the thread pool device. On several platforms, the functions passed to std::async are not scheduled in the order in which they are given to std::async, which leads to massive performance issues in the contraction code.
...
Instead we now have a custom thread pool that ensures that the functions are picked up by the threads in the pool in the order in which they are enqueued in the pool.
2015-05-20 13:52:07 -07:00
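The ordering guarantee described in the commit above (workers pick up tasks in the order in which they were enqueued) can be illustrated with a small queue-based pool. This is a sketch under stated assumptions and not the code from the commit: the class name FifoThreadPool, its Schedule method and the RunInOrderDemo helper are hypothetical. Note that with several workers only the dequeue order is FIFO; tasks may still finish in a different order.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical illustration of a FIFO thread pool; not Eigen's implementation.
class FifoThreadPool {
 public:
  explicit FifoThreadPool(std::size_t num_threads) {
    for (std::size_t i = 0; i < num_threads; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  // Drains every task that is still queued, then joins the workers.
  ~FifoThreadPool() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stopping_ = true;
    }
    cv_.notify_all();
    for (std::thread& t : workers_) t.join();
  }

  FifoThreadPool(const FifoThreadPool&) = delete;
  FifoThreadPool& operator=(const FifoThreadPool&) = delete;

  // Appends the task to the back of the queue.
  void Schedule(std::function<void()> fn) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      pending_.push(std::move(fn));
    }
    cv_.notify_one();
  }

 private:
  void WorkerLoop() {
    for (;;) {
      std::function<void()> fn;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return stopping_ || !pending_.empty(); });
        if (pending_.empty()) return;  // stopping and nothing left to run
        fn = std::move(pending_.front());  // always take the oldest task
        pending_.pop();
      }
      fn();  // run outside the lock so other workers can dequeue
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> pending_;
  bool stopping_ = false;
  std::vector<std::thread> workers_;
};

// Usage: with a single worker, tasks execute in exactly the enqueue order.
std::vector<int> RunInOrderDemo(int num_tasks) {
  std::vector<int> order;
  {
    FifoThreadPool pool(1);
    for (int i = 0; i < num_tasks; ++i) {
      pool.Schedule([&order, i] { order.push_back(i); });
    }
  }  // the destructor drains the queue and joins the worker
  return order;
}
```

A single mutex-protected queue is the simplest way to get this guarantee; it trades some contention for predictable ordering, which is the property the commit message says std::async lacked on several platforms.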
Benoit Steiner
2451679951
Avoid using the cuda memcpy for small tensor slices since the memcpy kernel is very expensive to launch
2015-05-19 15:19:01 -07:00
Benoit Steiner
a81d17b73a
Added new version of the TensorIntDiv class optimized for 32 bit signed integers. It saves 1 register on CPU and 2 on GPU.
2015-05-19 13:59:52 -07:00
Benoit Steiner
fd1d4bd86c
Silenced a few compilation warnings
2015-04-22 16:16:15 -07:00
Benoit Steiner
91359e1d0a
Added the ability to generate a tensor from a custom user defined 'generator'. This simplifies the creation of constant tensors initialized using specific regular patterns.
...
Created a gaussian window generator as a first use case.
2015-04-22 11:14:58 -07:00
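To make the notion of a 'generator' concrete: it can be thought of as a functor that maps the coordinates of an element to its value, which is then evaluated once per element. The sketch below is a standalone illustration that does not use Eigen; the names GaussianWindow and Generate2D, and the exact formula exp(-sum((x_i - mean_i)^2 / (2 * sigma_i^2))), are assumptions made for this example rather than details taken from the commit.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical generator functor: returns the value of an unnormalized
// Gaussian window at the given coordinates.
template <std::size_t Rank>
class GaussianWindow {
 public:
  GaussianWindow(const std::array<double, Rank>& means,
                 const std::array<double, Rank>& std_devs)
      : means_(means) {
    for (std::size_t i = 0; i < Rank; ++i) {
      two_sigma2_[i] = 2.0 * std_devs[i] * std_devs[i];
    }
  }

  double operator()(const std::array<std::ptrdiff_t, Rank>& coords) const {
    double exponent = 0.0;
    for (std::size_t i = 0; i < Rank; ++i) {
      const double offset = static_cast<double>(coords[i]) - means_[i];
      exponent += offset * offset / two_sigma2_[i];
    }
    return std::exp(-exponent);
  }

 private:
  std::array<double, Rank> means_;
  std::array<double, Rank> two_sigma2_;
};

// Fills a row-major rows x cols buffer by calling the generator on the
// coordinates of every element.
template <typename Generator>
std::vector<double> Generate2D(std::ptrdiff_t rows, std::ptrdiff_t cols,
                               const Generator& gen) {
  std::vector<double> out;
  out.reserve(static_cast<std::size_t>(rows * cols));
  for (std::ptrdiff_t r = 0; r < rows; ++r) {
    for (std::ptrdiff_t c = 0; c < cols; ++c) {
      out.push_back(gen(std::array<std::ptrdiff_t, 2>{{r, c}}));
    }
  }
  return out;
}
```

Because the pattern lives in the functor, a different regular initialization only requires a different functor; the filling loop stays the same.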
Benoit Steiner
8838ed39f4
Added support for non-deterministic random number generation on GPU
2015-04-22 09:14:38 -07:00
Benoit Steiner
dfa991cbae
Make sure that the copy constructor of the evaluator is always called before launching the evaluation of a tensor expression on a cuda device.
2015-04-21 16:15:45 -07:00