Benoit Steiner
|
e256acec7c
|
Avoid unnecessary object copies
|
2016-08-01 17:03:39 -07:00 |
|
Benoit Steiner
|
2693fd54bf
|
bug #1266: half implementation has been moved to half_impl namespace
|
2016-07-29 13:45:56 -07:00 |
|
Benoit Steiner
|
3d3d34e442
|
Deleted dead code.
|
2016-07-25 08:53:37 -07:00 |
|
Gael Guennebaud
|
6d5daf32f5
|
bug #1255: comment out broken and unused line.
|
2016-07-25 14:48:30 +02:00 |
|
Gael Guennebaud
|
9908020d36
|
Add minimal support for Array<string>, and fix Tensor<string>
|
2016-07-25 14:25:56 +02:00 |
|
Benoit Steiner
|
c6b0de2c21
|
Improved partial reductions in more cases
|
2016-07-22 17:18:20 -07:00 |
|
Gael Guennebaud
|
0f350a8b7e
|
Fix CUDA compilation
|
2016-07-21 18:47:07 +02:00 |
|
Benoit Steiner
|
20f7ef2f89
|
An evalTo expression is aligned iff both the lhs and the rhs are aligned.
|
2016-07-12 10:56:42 -07:00 |
|
Benoit Steiner
|
3a2dd352ae
|
Improved the contraction mapper to properly support tensor products
|
2016-07-11 13:43:41 -07:00 |
|
Benoit Steiner
|
0bc020be9d
|
Improved the detection of packet size in the tensor scan evaluator.
|
2016-07-11 12:14:56 -07:00 |
|
Gael Guennebaud
|
fd60966310
|
merge
|
2016-07-11 18:11:47 +02:00 |
|
Gael Guennebaud
|
194daa3048
|
Fix assertion (it did not make sense for static_val types)
|
2016-07-11 11:39:27 +02:00 |
|
Gael Guennebaud
|
18c35747ce
|
Emulate _BitScanReverse64 for 32-bit builds
|
2016-07-11 11:38:04 +02:00 |
|
Gael Guennebaud
|
599f8ba617
|
Change runtime to compile-time conditional.
|
2016-07-08 11:39:43 +02:00 |
|
Gael Guennebaud
|
544935101a
|
Fix warnings
|
2016-07-08 11:38:52 +02:00 |
|
Gael Guennebaud
|
2f7e2614e7
|
bug #1232: refactor special functions as a new SpecialFunctions module, currently in unsupported/.
|
2016-07-08 11:13:55 +02:00 |
|
Gael Guennebaud
|
179ebb88f9
|
Fix warning
|
2016-07-07 09:16:40 +02:00 |
|
Gael Guennebaud
|
ce9fc0ce14
|
fix clang compilation
|
2016-07-04 12:59:02 +02:00 |
|
Gael Guennebaud
|
440020474c
|
Work around compilation issue with MSVC
|
2016-07-04 12:49:19 +02:00 |
|
Benoit Steiner
|
cb2d8b8fa6
|
Made it possible to compile reductions for an old CUDA architecture and run them on a recent GPU.
|
2016-06-29 15:42:01 -07:00 |
|
Benoit Steiner
|
b2a47641ce
|
Made the code compile when using CUDA architecture < 300
|
2016-06-29 15:32:47 -07:00 |
|
Igor Babuschkin
|
85699850d9
|
Add missing CUDA kernel to tensor scan op
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
|
2016-06-29 11:54:35 +01:00 |
|
Benoit Steiner
|
75c333f94c
|
Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
Also avoid taking references to values that may become stale after a copy construction.
|
2016-06-27 10:32:38 -07:00 |
|
Rasmus Munk Larsen
|
a9c1e4d7b7
|
Return -1 from CurrentThreadId when called by thread outside the pool.
|
2016-06-23 16:40:07 -07:00 |
|
Rasmus Munk Larsen
|
d39df320d2
|
Resolve merge.
|
2016-06-23 15:08:03 -07:00 |
|
Gael Guennebaud
|
360a743a10
|
bug #1241: does not emit anything for empty tensors
|
2016-06-23 18:47:31 +02:00 |
|
Gael Guennebaud
|
7c6561485a
|
merge PR 194
|
2016-06-23 15:29:57 +02:00 |
|
Benoit Steiner
|
a29a2cb4ff
|
Silenced a couple of compilation warnings generated by Xcode
|
2016-06-22 16:43:02 -07:00 |
|
Benoit Steiner
|
f8fcd6b32d
|
Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers
|
2016-06-22 16:03:11 -07:00 |
|
Benoit Steiner
|
c58df31747
|
Handle empty tensors in the print functions
|
2016-06-21 09:22:43 -07:00 |
|
Benoit Steiner
|
de32f8d656
|
Fixed the printing of rank-0 tensors
|
2016-06-20 10:46:45 -07:00 |
|
Benoit Steiner
|
7d495d890a
|
Merged in ibab/eigen (pull request PR-197)
Implement exclusive scan option for Tensor library
|
2016-06-14 17:54:59 -07:00 |
|
Benoit Steiner
|
aedc5be1d6
|
Avoid generating pseudo random numbers that are multiples of 5: this helps
spread the load over multiple CPUs without having to rely on work stealing.
|
2016-06-14 17:51:47 -07:00 |
|
Igor Babuschkin
|
c4d10e921f
|
Implement exclusive scan option
|
2016-06-14 19:44:07 +01:00 |
|
Gael Guennebaud
|
76236cdea4
|
merge
|
2016-06-14 15:33:47 +02:00 |
|
Gael Guennebaud
|
5d38203735
|
Update Tensor module to use bind1st_op and bind2nd_op
|
2016-06-14 15:06:03 +02:00 |
|
Benoit Steiner
|
65d33e5898
|
Merged in ibab/eigen (pull request PR-195)
Add small fixes to TensorScanOp
|
2016-06-10 19:31:17 -07:00 |
|
Benoit Steiner
|
a05607875a
|
Don't refer to the half2 type unless it's been defined
|
2016-06-10 11:53:56 -07:00 |
|
Igor Babuschkin
|
86aedc9282
|
Add small fixes to TensorScanOp
|
2016-06-07 20:06:38 +01:00 |
|
Benoit Steiner
|
84b2060a9e
|
Fixed compilation error with gcc 4.4
|
2016-06-06 17:16:19 -07:00 |
|
Benoit Steiner
|
7ef9f47b58
|
Misc small improvements to the reduction code.
|
2016-06-06 14:09:46 -07:00 |
|
Benoit Steiner
|
9137f560f0
|
Moved assertions to the constructor to make the code more portable
|
2016-06-06 07:26:48 -07:00 |
|
Rasmus Munk Larsen
|
f1f2ff8208
|
size_t -> int
|
2016-06-03 18:06:37 -07:00 |
|
Rasmus Munk Larsen
|
76308e7fd2
|
Add CurrentThreadId and NumThreads methods to Eigen threadpools and TensorDeviceThreadPool.
|
2016-06-03 16:28:58 -07:00 |
|
Benoit Steiner
|
37638dafd7
|
Simplified the code that dispatches vectorized reductions on GPU
|
2016-06-09 10:29:52 -07:00 |
|
Benoit Steiner
|
66796e843d
|
Fixed definition of some of the reducer_traits
|
2016-06-09 08:50:01 -07:00 |
|
Benoit Steiner
|
14a112ee15
|
Use signed integers more consistently to encode the number of threads to use to evaluate a tensor expression.
|
2016-06-09 08:25:22 -07:00 |
|
Benoit Steiner
|
8f92c26319
|
Improved code formatting
|
2016-06-09 08:23:42 -07:00 |
|
Benoit Steiner
|
aa33446dac
|
Improved support for vectorization of 16-bit floats
|
2016-06-09 08:22:27 -07:00 |
|
Benoit Steiner
|
d6d39c7ddb
|
Added missing EIGEN_DEVICE_FUNC
|
2016-06-07 14:35:08 -07:00 |
|