Mehdi Goli
|
7318daf887
|
Fixing LLVM error on TensorMorphingSycl.h on GPU; fixing int64_t crash for tensor_broadcast_sycl on GPU; adding get_sycl_supported_devices() on syclDevice.h.
|
2016-11-25 16:19:07 +00:00 |
|
Benoit Steiner
|
7ad37606dd
|
Fixed the documentation of Scalar Tensors
|
2016-11-24 12:31:43 -08:00 |
|
Mehdi Goli
|
b8cc5635d5
|
Removing unsupported device from test case; cleaning the tensor device sycl.
|
2016-11-23 16:30:41 +00:00 |
|
Benoit Steiner
|
f11da1d83b
|
Made the QueueInterface thread safe
|
2016-11-20 13:17:08 -08:00 |
|
Benoit Steiner
|
6d781e3e52
|
Merged eigen/eigen into default
|
2016-11-20 10:12:54 -08:00 |
|
Benoit Steiner
|
79a07b891b
|
Fixed a typo
|
2016-11-20 07:07:41 -08:00 |
|
Benoit Steiner
|
81151bd474
|
Fixed merge conflicts
|
2016-11-19 19:12:59 -08:00 |
|
Benoit Steiner
|
9265ca707e
|
Made it possible to check the state of a sycl device without synchronization
|
2016-11-19 10:56:24 -08:00 |
|
Benoit Steiner
|
2d1aec15a7
|
Added missing include
|
2016-11-19 08:09:54 -08:00 |
|
Benoit Steiner
|
1bdf1b9ce0
|
Merged in benoitsteiner/opencl (pull request PR-253)
OpenCL improvements
|
2016-11-19 04:44:43 +00:00 |
|
Benoit Steiner
|
dc601d79d1
|
Added the ability to run tests exclusively on OpenCL devices that are listed by sycl::device::get_devices().
|
2016-11-18 16:26:50 -08:00 |
|
Benoit Steiner
|
110b7f8d9f
|
Deleted unnecessary semicolons
|
2016-11-18 14:06:17 -08:00 |
|
Benoit Steiner
|
37c2c516a6
|
Cleaned up the sycl device code
|
2016-11-18 12:38:06 -08:00 |
|
Mehdi Goli
|
15e226d7d3
|
Adding Benoit's changes to TensorDeviceSycl.h
|
2016-11-18 16:34:54 +00:00 |
|
Mehdi Goli
|
622805a0c5
|
Modifying TensorDeviceSycl.h to always create buffers of type uint8_t and convert them to the actual type at execution on the device; adding the queue interface class to separate the lifespan of the sycl queue, and the buffers created for that queue, from Eigen::SyclDevice; modifying sycl tests to support the evaluation of the results for both row major and column major data layout on all different devices that are supported by Sycl {CPU, GPU, and Host}.
|
2016-11-18 16:20:42 +00:00 |
|
Benoit Steiner
|
7c30078b9f
|
Merged eigen/eigen into default
|
2016-11-17 22:53:37 -08:00 |
|
Benoit Steiner
|
553f50b246
|
Added a way to detect errors generated by the opencl device from the host
|
2016-11-17 21:51:48 -08:00 |
|
Benoit Steiner
|
72a45d32e9
|
Cleanup
|
2016-11-17 21:29:15 -08:00 |
|
Benoit Steiner
|
4349fc640e
|
Created a test to check that the sycl runtime can successfully report errors (like division by 0).
Small cleanup
|
2016-11-17 20:27:54 -08:00 |
|
Benoit Steiner
|
a6a3fd0703
|
Made TensorDeviceCuda.h compile on windows
|
2016-11-17 16:15:27 -08:00 |
|
Luke Iwanski
|
c5130dedbe
|
Specialised basic math functions for SYCL device.
|
2016-11-17 11:47:13 +00:00 |
|
Benoit Steiner
|
b5c75351e3
|
Merged eigen/eigen into default
|
2016-11-14 15:54:44 -08:00 |
|
Rasmus Munk Larsen
|
32df1b1046
|
Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the schedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics).
|
2016-11-14 14:18:16 -08:00 |
|
Mehdi Goli
|
05e8c2a1d9
|
Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl.
|
2016-11-14 18:13:53 +00:00 |
|
Mehdi Goli
|
f8ca893976
|
Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing.
|
2016-11-14 17:51:57 +00:00 |
|
Mehdi Goli
|
a5c3f15682
|
Adding comment to TensorDeviceSycl.h and cleaning the code.
|
2016-11-11 19:06:34 +00:00 |
|
Mehdi Goli
|
3be3963021
|
Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor.
|
2016-11-10 19:16:31 +00:00 |
|
Mehdi Goli
|
12387abad5
|
adding the missing in eigen_assert!
|
2016-11-10 18:58:08 +00:00 |
|
Mehdi Goli
|
2e704d4257
|
Adding Memset; optimising memcpyDeviceToHost by removing double copying.
|
2016-11-10 18:45:12 +00:00 |
|
Benoit Steiner
|
dcc14bee64
|
Fixed the formatting of the code
|
2016-11-08 14:24:46 -08:00 |
|
Luke Iwanski
|
912cb3d660
|
#if EIGEN_EXCEPTION -> #ifdef EIGEN_EXCEPTIONS.
|
2016-11-08 22:01:14 +00:00 |
|
Luke Iwanski
|
1b345b0895
|
Fix for SYCL queue initialisation.
|
2016-11-08 21:56:31 +00:00 |
|
Luke Iwanski
|
1b95717358
|
Use try/catch only when exceptions are enabled.
|
2016-11-08 21:08:53 +00:00 |
|
Mehdi Goli
|
d57430dd73
|
Converting all sycl buffers to uninitialised device only buffers; adding memcpyHostToDevice and memcpyDeviceToHost on syclDevice; modifying all examples to obey the new rules; moving sycl queue creating to the device based on Benoit's suggestion; removing the sycl specific condition for returning m_result in TensorReduction.h according to Benoit's suggestion.
|
2016-11-08 17:08:02 +00:00 |
|
Benoit Steiner
|
dad177be01
|
Added missing includes
|
2016-11-05 10:04:42 -07:00 |
|
Mehdi Goli
|
0ebe3808ca
|
Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;
|
2016-11-04 18:18:19 +00:00 |
|
Benoit Steiner
|
0585b2965d
|
Disable vectorization on device only when compiling for sycl
|
2016-11-02 11:44:27 -07:00 |
|
Mehdi Goli
|
51af6ae971
|
Fixed the ambiguity in calling make_tuple for sycl backend.
|
2016-10-31 16:35:51 +00:00 |
|
Benoit Steiner
|
0a9ad6fc72
|
Worked around Visual Studio compilation errors
|
2016-10-28 07:54:27 -07:00 |
|
Benoit Steiner
|
b0c5bfdf78
|
Added missing template parameters
|
2016-10-28 03:43:41 +00:00 |
|
Gael Guennebaud
|
530f20c21a
|
Workaround MSVC issue.
|
2016-10-27 21:51:37 +02:00 |
|
Benoit Steiner
|
0a4c4d40b4
|
Removed a template parameter for fixed sized tensors
|
2016-10-26 18:47:37 -07:00 |
|
Benoit Steiner
|
5f2dd503ff
|
Replaced tabs with spaces
|
2016-10-25 20:40:58 -07:00 |
|
Benoit Steiner
|
1644bafe29
|
Code cleanup
|
2016-10-25 20:36:14 -07:00 |
|
Benoit Steiner
|
cf20b30d65
|
Merge latest updates from trunk
|
2016-10-20 09:42:05 -07:00 |
|
Luke Iwanski
|
03b63e182c
|
Added SYCL include in Tensor.
|
2016-10-20 15:32:44 +01:00 |
|
Benoit Steiner
|
d3943cd50c
|
Fixed a few typos in the ternary tensor expressions types
|
2016-10-19 12:56:12 -07:00 |
|
Mehdi Goli
|
e36cb91c99
|
Fixing the code indentation in the TensorReduction.h file.
|
2016-10-14 18:03:00 +01:00 |
|
Luke Iwanski
|
e742da8b28
|
Merged ComputeCpp into default.
|
2016-10-14 13:36:51 +01:00 |
|
Mehdi Goli
|
524fa4c46f
|
Reducing the code by generalising sycl backend functions/structs.
|
2016-10-14 12:09:55 +01:00 |
|
Benoit Steiner
|
7e4a6754b2
|
Merged eigen/eigen into default
|
2016-10-12 22:42:33 -07:00 |
|
Benoit Steiner
|
7f0599b6eb
|
Manually define int16_t and uint16_t when compiling with Visual Studio
|
2016-10-08 22:56:32 -07:00 |
|
Benoit Steiner
|
5c68051cd7
|
Merge the content of the ComputeCpp branch into the default branch
|
2016-10-07 11:04:16 -07:00 |
|
RJ Ryan
|
e2e9cdd169
|
Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*.
|
2016-10-06 10:49:48 -07:00 |
|
Benoit Steiner
|
ae1385c7e4
|
Pull the latest updates from trunk
|
2016-10-05 14:54:36 -07:00 |
|
Benoit Steiner
|
c84084c0c0
|
Fixed compilation warning
|
2016-10-05 14:15:41 -07:00 |
|
Benoit Steiner
|
8b69d5d730
|
::rand() returns a signed integer on win32
|
2016-10-05 08:55:02 -07:00 |
|
Benoit Steiner
|
ed7a220b04
|
Fixed a typo that impacts windows builds
|
2016-10-05 08:51:31 -07:00 |
|
Benoit Steiner
|
ceee1c008b
|
Silenced compilation warning
|
2016-10-04 18:47:53 -07:00 |
|
Benoit Steiner
|
6af5ac7e27
|
Cleanup the cuda executor code.
|
2016-10-04 08:52:13 -07:00 |
|
Benoit Steiner
|
2f6d1607c8
|
Cleaned up the random number generation code.
|
2016-10-04 08:38:23 -07:00 |
|
Benoit Steiner
|
2bda1b0d93
|
Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus.
|
2016-09-28 17:08:41 -07:00 |
|
Mehdi Goli
|
dd602e62c8
|
Converting alias template to nested struct in order to be compatible with CXX-03
|
2016-09-27 16:21:19 +01:00 |
|
Benoit Steiner
|
6565f8d60f
|
Made the initialization of a CUDA device thread safe.
|
2016-09-26 11:00:32 -07:00 |
|
Benoit Steiner
|
f6ac51a054
|
Made TensorEvalTo compatible with c++0x again.
|
2016-09-23 16:45:17 -07:00 |
|
Benoit Steiner
|
00d4e65f00
|
Deleted unused TensorMap data member
|
2016-09-23 16:44:45 -07:00 |
|
Benoit Steiner
|
1301d744f8
|
Made the gaussian generator usable on GPU
|
2016-09-22 19:04:44 -07:00 |
|
Benoit Steiner
|
c3ca9b1e76
|
Deleted some unnecessary and confusing EIGEN_DEVICE_FUNC
|
2016-09-19 11:33:39 -07:00 |
|
Luke Iwanski
|
b91e021172
|
Merged with default.
|
2016-09-19 14:03:54 +01:00 |
|
Luke Iwanski
|
cb81975714
|
Partial OpenCL support via SYCL compatible with ComputeCpp CE.
|
2016-09-19 12:44:13 +01:00 |
|
Gael Guennebaud
|
18f6e47815
|
Fix order of "static inline".
|
2016-09-16 11:32:54 +02:00 |
|
Benoit Steiner
|
488ad7dd1b
|
Added missing EIGEN_DEVICE_FUNC qualifiers
|
2016-09-14 13:35:00 -07:00 |
|
Benoit Steiner
|
028e299577
|
Fixed a bug impacting some outer reductions on GPU
|
2016-09-12 18:36:52 -07:00 |
|
Benoit Steiner
|
8321dcce76
|
Merged latest updates from trunk
|
2016-09-12 10:33:05 -07:00 |
|
Benoit Steiner
|
eb6ba00cc8
|
Properly size the list of waiters
|
2016-09-12 10:31:55 -07:00 |
|
Benoit Steiner
|
a618094b62
|
Added a resize method to MaxSizeVector
|
2016-09-12 10:30:53 -07:00 |
|
Gael Guennebaud
|
471eac5399
|
bug #1195: move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX)
|
2016-09-08 08:36:27 +02:00 |
|
Benoit Steiner
|
13df3441ae
|
Use MaxSizeVector instead of std::vector: xcode sometimes assumes that std::vector allocates aligned memory and therefore issues aligned instruction to initialize it. This can result in random crashes when compiling with AVX instructions enabled.
|
2016-09-02 19:25:47 -07:00 |
|
Benoit Steiner
|
cadd124d73
|
Pulled latest update from trunk
|
2016-09-02 15:30:02 -07:00 |
|
Benoit Steiner
|
05b0518077
|
Made the index type an explicit template parameter to help some compilers compile the code.
|
2016-09-02 15:29:34 -07:00 |
|
Benoit Steiner
|
adf864fec0
|
Merged in rmlarsen/eigen (pull request PR-222)
Fix CUDA build broken by changes to min and max reduction.
|
2016-09-02 14:11:20 -07:00 |
|
Rasmus Munk Larsen
|
13e93ca8b7
|
Fix CUDA build broken by changes to min and max reduction.
|
2016-09-02 13:41:36 -07:00 |
|
Benoit Steiner
|
c53f783705
|
Updated the contraction code to support constant inputs.
|
2016-09-01 11:41:27 -07:00 |
|
Gael Guennebaud
|
46475eff9a
|
Adjust Tensor module wrt recent change in nullary functor
|
2016-09-01 13:40:45 +02:00 |
|
Rasmus Munk Larsen
|
a1e092d1e8
|
Fix bugs to make min and max reducers work correctly with IEEE infinities.
|
2016-08-31 15:04:16 -07:00 |
|
Gael Guennebaud
|
35a8e94577
|
bug #1167: simplify installation of header files using cmake's install(DIRECTORY ...) command.
|
2016-08-29 10:59:37 +02:00 |
|
Gael Guennebaud
|
965e595f02
|
Add missing log1p method
|
2016-08-26 14:55:00 +02:00 |
|
Benoit Steiner
|
34ae80179a
|
Use array_prod instead of calling TotalSize since TotalSize is only available on DSize.
|
2016-08-15 10:29:14 -07:00 |
|
Benoit Steiner
|
fe73648c98
|
Fixed a bug in the documentation.
|
2016-08-12 10:00:43 -07:00 |
|
Benoit Steiner
|
64e68cbe87
|
Don't attempt to optimize partial reductions when the optimized implementation doesn't buy anything.
|
2016-08-08 19:29:59 -07:00 |
|
Benoit Steiner
|
ca2cee2739
|
Merged in ibab/eigen (pull request PR-206)
Expose real and imag methods on Tensors
|
2016-08-03 11:53:04 -07:00 |
|
Benoit Steiner
|
a20b58845f
|
CUDA_ARCH isn't always defined, so avoid relying on it too much when figuring out which implementation to use for reductions. Instead rely on the device to tell us on which hardware version we're running.
|
2016-08-03 10:00:43 -07:00 |
|
Benoit Steiner
|
fd220dd8b0
|
Use numext::conj instead of std::conj
|
2016-08-01 18:16:16 -07:00 |
|
Benoit Steiner
|
e256acec7c
|
Avoid unnecessary object copies
|
2016-08-01 17:03:39 -07:00 |
|
Benoit Steiner
|
2693fd54bf
|
bug #1266: half implementation has been moved to half_impl namespace
|
2016-07-29 13:45:56 -07:00 |
|
Benoit Steiner
|
3d3d34e442
|
Deleted dead code.
|
2016-07-25 08:53:37 -07:00 |
|
Gael Guennebaud
|
6d5daf32f5
|
bug #1255: comment out broken and unused line.
|
2016-07-25 14:48:30 +02:00 |
|
Gael Guennebaud
|
9908020d36
|
Add minimal support for Array<string>, and fix Tensor<string>
|
2016-07-25 14:25:56 +02:00 |
|
Benoit Steiner
|
c6b0de2c21
|
Improved partial reductions in more cases
|
2016-07-22 17:18:20 -07:00 |
|
Gael Guennebaud
|
0f350a8b7e
|
Fix CUDA compilation
|
2016-07-21 18:47:07 +02:00 |
|
Benoit Steiner
|
20f7ef2f89
|
An evalTo expression is aligned iff both the lhs and the rhs are aligned.
|
2016-07-12 10:56:42 -07:00 |
|
Benoit Steiner
|
3a2dd352ae
|
Improved the contraction mapper to properly support tensor products
|
2016-07-11 13:43:41 -07:00 |
|
Benoit Steiner
|
0bc020be9d
|
Improved the detection of packet size in the tensor scan evaluator.
|
2016-07-11 12:14:56 -07:00 |
|
Gael Guennebaud
|
fd60966310
|
merge
|
2016-07-11 18:11:47 +02:00 |
|
Gael Guennebaud
|
194daa3048
|
Fix assertion (it did not make sense for static_val types)
|
2016-07-11 11:39:27 +02:00 |
|
Gael Guennebaud
|
18c35747ce
|
Emulate _BitScanReverse64 for 32 bits builds
|
2016-07-11 11:38:04 +02:00 |
|
Gael Guennebaud
|
599f8ba617
|
Change runtime to compile-time conditional.
|
2016-07-08 11:39:43 +02:00 |
|
Gael Guennebaud
|
544935101a
|
Fix warnings
|
2016-07-08 11:38:52 +02:00 |
|
Gael Guennebaud
|
2f7e2614e7
|
bug #1232: refactor special functions as a new SpecialFunctions module, currently in unsupported/.
|
2016-07-08 11:13:55 +02:00 |
|
Gael Guennebaud
|
179ebb88f9
|
Fix warning
|
2016-07-07 09:16:40 +02:00 |
|
Gael Guennebaud
|
ce9fc0ce14
|
fix clang compilation
|
2016-07-04 12:59:02 +02:00 |
|
Gael Guennebaud
|
440020474c
|
Workaround compilation issue with msvc
|
2016-07-04 12:49:19 +02:00 |
|
Igor Babuschkin
|
78f37ca03c
|
Expose real and imag methods on Tensors
|
2016-07-01 17:34:31 +01:00 |
|
Benoit Steiner
|
cb2d8b8fa6
|
Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu.
|
2016-06-29 15:42:01 -07:00 |
|
Benoit Steiner
|
b2a47641ce
|
Made the code compile when using CUDA architecture < 300
|
2016-06-29 15:32:47 -07:00 |
|
Igor Babuschkin
|
85699850d9
|
Add missing CUDA kernel to tensor scan op
The TensorScanOp implementation was missing a CUDA kernel launch.
This adds a simple placeholder implementation.
|
2016-06-29 11:54:35 +01:00 |
|
Benoit Steiner
|
75c333f94c
|
Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.
Also avoid taking references to values that may become stale after a copy construction.
|
2016-06-27 10:32:38 -07:00 |
|
Benoit Steiner
|
7944d4431f
|
Made the cost model cwiseMax and cwiseMin methods const to help the PowerPC cuda compiler compile this code.
|
2016-08-18 13:46:36 -07:00 |
|
Benoit Steiner
|
647a51b426
|
Force the inlining of a simple accessor.
|
2016-08-18 12:31:02 -07:00 |
|
Benoit Steiner
|
a452dedb4f
|
Merged in ibab/eigen/double-tensor-reduction (pull request PR-216)
Enable efficient Tensor reduction for doubles on the GPU (continued)
|
2016-08-18 12:29:54 -07:00 |
|
Igor Babuschkin
|
18c67df31c
|
Fix remaining CUDA >= 300 checks
|
2016-08-18 17:18:30 +01:00 |
|
Igor Babuschkin
|
1569a7d7ab
|
Add the necessary CUDA >= 300 checks back
|
2016-08-18 17:15:12 +01:00 |
|
Benoit Steiner
|
2b17f34574
|
Properly detect the type of the result of a contraction.
|
2016-08-16 16:00:30 -07:00 |
|
Igor Babuschkin
|
841e075154
|
Remove CUDA >= 300 checks and enable outer reduction for doubles
|
2016-08-06 18:07:50 +01:00 |
|
Igor Babuschkin
|
0425118e2a
|
Merge upstream changes
|
2016-08-05 14:34:57 +01:00 |
|
Igor Babuschkin
|
9537e8b118
|
Make use of atomicExch for atomicExchCustom
|
2016-08-05 14:29:58 +01:00 |
|
Igor Babuschkin
|
eeb0d880ee
|
Enable efficient Tensor reduction for doubles
|
2016-07-01 19:08:26 +01:00 |
|
Rasmus Munk Larsen
|
a9c1e4d7b7
|
Return -1 from CurrentThreadId when called by thread outside the pool.
|
2016-06-23 16:40:07 -07:00 |
|
Rasmus Munk Larsen
|
d39df320d2
|
Resolve merge.
|
2016-06-23 15:08:03 -07:00 |
|
Gael Guennebaud
|
360a743a10
|
bug #1241: does not emit anything for empty tensors
|
2016-06-23 18:47:31 +02:00 |
|
Gael Guennebaud
|
7c6561485a
|
merge PR 194
|
2016-06-23 15:29:57 +02:00 |
|
Benoit Steiner
|
a29a2cb4ff
|
Silenced a couple of compilation warnings generated by xcode
|
2016-06-22 16:43:02 -07:00 |
|
Benoit Steiner
|
f8fcd6b32d
|
Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers
|
2016-06-22 16:03:11 -07:00 |
|
Benoit Steiner
|
c58df31747
|
Handle empty tensors in the print functions
|
2016-06-21 09:22:43 -07:00 |
|
Benoit Steiner
|
de32f8d656
|
Fixed the printing of rank-0 tensors
|
2016-06-20 10:46:45 -07:00 |
|
Benoit Steiner
|
7d495d890a
|
Merged in ibab/eigen (pull request PR-197)
Implement exclusive scan option for Tensor library
|
2016-06-14 17:54:59 -07:00 |
|
Benoit Steiner
|
aedc5be1d6
|
Avoid generating pseudo random numbers that are multiples of 5: this helps
spread the load over multiple cpus without having to rely on work stealing.
|
2016-06-14 17:51:47 -07:00 |
|
Igor Babuschkin
|
c4d10e921f
|
Implement exclusive scan option
|
2016-06-14 19:44:07 +01:00 |
|
Gael Guennebaud
|
76236cdea4
|
merge
|
2016-06-14 15:33:47 +02:00 |
|
Gael Guennebaud
|
5d38203735
|
Update Tensor module to use bind1st_op and bind2nd_op
|
2016-06-14 15:06:03 +02:00 |
|
Benoit Steiner
|
65d33e5898
|
Merged in ibab/eigen (pull request PR-195)
Add small fixes to TensorScanOp
|
2016-06-10 19:31:17 -07:00 |
|
Benoit Steiner
|
a05607875a
|
Don't refer to the half2 type unless it's been defined
|
2016-06-10 11:53:56 -07:00 |
|
Igor Babuschkin
|
86aedc9282
|
Add small fixes to TensorScanOp
|
2016-06-07 20:06:38 +01:00 |
|
Benoit Steiner
|
84b2060a9e
|
Fixed compilation error with gcc 4.4
|
2016-06-06 17:16:19 -07:00 |
|
Benoit Steiner
|
7ef9f47b58
|
Misc small improvements to the reduction code.
|
2016-06-06 14:09:46 -07:00 |
|
Benoit Steiner
|
9137f560f0
|
Moved assertions to the constructor to make the code more portable
|
2016-06-06 07:26:48 -07:00 |
|
Rasmus Munk Larsen
|
f1f2ff8208
|
size_t -> int
|
2016-06-03 18:06:37 -07:00 |
|
Rasmus Munk Larsen
|
76308e7fd2
|
Add CurrentThreadId and NumThreads methods to Eigen threadpools and TensorDeviceThreadPool.
|
2016-06-03 16:28:58 -07:00 |
|
Benoit Steiner
|
37638dafd7
|
Simplified the code that dispatches vectorized reductions on GPU
|
2016-06-09 10:29:52 -07:00 |
|
Benoit Steiner
|
66796e843d
|
Fixed definition of some of the reducer_traits
|
2016-06-09 08:50:01 -07:00 |
|