Benoit Steiner
5ac27d5b51
Avoid relying on cxx11 features when possible.
2017-07-08 21:58:44 -07:00
Benoit Steiner
c5a241ab9b
Merged in benoitsteiner/opencl (pull request PR-323)
...
Improved support for OpenCL
2017-07-07 16:27:33 +00:00
Benoit Steiner
b7ae4dd9ef
Merged in hughperkins/eigen/add-endif-labels-TensorReductionCuda.h (pull request PR-315)
...
Add labels to #ifdef, in TensorReductionCuda.h
2017-07-07 04:23:52 +00:00
Benoit Steiner
9daed67952
Merged in tntnatbry/eigen (pull request PR-319)
...
Tensor Trace op
2017-07-07 04:18:03 +00:00
Benoit Steiner
6795512e59
Improved the randomness of the tensor random generator
2017-07-06 21:12:45 -07:00
Benoit Steiner
dc524ac716
Fixed compilation warning
2017-07-06 21:11:15 -07:00
Benoit Steiner
62b4634ebe
Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14)
...
Applying Benoit's comment for Fixing ImageVolumePatch.
* Applying Benoit's comment for Fixing ImageVolumePatch. Fixing conflict on cmake file.
* Fixing dealocation of the memory in ImagePatch test for SYCL.
* Fixing the automerge issue.
2017-07-06 05:08:13 +00:00
Benoit Steiner
53725c10b8
Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10)
...
DataDependancy
* Wrapping data type to the pointer class for sycl in non-terminal nodes; not having that breaks Tensorflow Conv2d code.
* Applying Ronnan's Comments.
* Applying benoit's comments
2017-06-28 17:55:23 +00:00
Benoit Steiner
b8e805497e
Merged in benoitsteiner/opencl (pull request PR-318)
...
Improved support for OpenCL
2017-06-13 05:01:10 +00:00
Hugh Perkins
9341f258d4
Add labels to #ifdef, in TensorReductionCuda.h
2017-06-06 15:51:06 +01:00
Benoit Steiner
1e736b9ead
Merged in mehdi_goli/opencl/SYCLAlignAllocator (pull request PR-7)
...
Fixing SYCL alignment issue required by TensorFlow.
2017-05-26 17:23:00 +00:00
Benoit Steiner
9dee55ec33
Merged eigen/eigen into default
2017-05-26 09:01:04 -07:00
Mehdi Goli
0370d3576e
Applying Ronnan's comments.
2017-05-26 16:01:48 +01:00
Mehdi Goli
e3f964ed55
Applying Benoit's comment;removing dead code.
2017-05-25 11:17:26 +01:00
a-doumoulakis
fb853a857a
Restore misplaced comment
2017-05-24 17:50:15 +01:00
a-doumoulakis
7a8ba565f8
Merge changed from upstream
2017-05-24 17:45:29 +01:00
Mmanu Chaturvedi
2971503fed
Specializing numeric_limits For AutoDiffScalar
2017-05-23 17:12:36 -04:00
Gael Guennebaud
26e8f9171e
Fix compilation of matrix log with Map as input
2017-06-07 10:51:23 +02:00
Mehdi Goli
76c0fc1f95
Fixing SYCL alignment issue required by TensorFlow.
2017-05-22 16:49:32 +01:00
Mehdi Goli
2d17128d6f
Fixing suported device list.
2017-05-22 16:40:33 +01:00
a-doumoulakis
052426b824
Add support for triSYCL
...
Eigen is now able to use triSYCL with EIGEN_SYCL_TRISYCL and TRISYCL_INCLUDE_DIR options
Fix contraction kernel with correct nd_item dimension
2017-05-05 19:26:27 +01:00
RJ Ryan
949a2da38c
Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer.
...
Improves support for std::complex types when compiling for CUDA.
Expands on e2e9cdd169
and 2bda1b0d93
.
2017-04-14 13:23:35 -07:00
Benoit Steiner
0d08165a7f
Merged in benoitsteiner/opencl (pull request PR-309)
...
OpenCL improvements
2017-04-05 14:28:08 +00:00
Benoit Steiner
c302ea7bc4
Deleted empty line of code
2017-04-04 10:05:16 -07:00
Benoit Steiner
a5a0c8fac1
Guard sycl specific code under a EIGEN_USE_SYCL ifdef
2017-04-04 10:03:21 -07:00
Benoit Steiner
a1304b95b7
Code cleanup
2017-04-04 10:00:46 -07:00
Benoit Steiner
66c63826bd
Guard the sycl specific code with EIGEN_USE_SYCL
2017-04-04 09:59:09 -07:00
Benoit Steiner
e3e343390a
Guard the sycl specific code with a #ifdef EIGEN_USE_SYCL
2017-04-04 09:56:33 -07:00
Benoit Steiner
63840d4666
iGate the sycl specific code under a EIGEN_USE_SYCL define
2017-04-04 09:54:31 -07:00
Benoit Steiner
bc050ea9f0
Fixed compilation error when sycl is enabled.
2017-04-04 09:47:04 -07:00
Gagan Goel
4910630c96
fix typos in the Tensor readme
2017-03-31 20:32:16 -04:00
Benoit Steiner
c1b3d5ecb6
Restored code compatibility with compilers that dont support c++11
...
Gated more sycl code under #ifdef sycl
2017-03-31 08:31:28 -07:00
Benoit Steiner
e2d5d4e7b3
Restore the old constructors to retain compatibility with non c++11 compilers.
2017-03-31 08:26:13 -07:00
Benoit Steiner
73fcaa319f
Gate the sycl specific code under #ifdef sycl
2017-03-31 08:22:25 -07:00
Mehdi Goli
bd64ee8555
Fixing TensorArgMaxSycl.h; Removing warning related to the hardcoded type of dims to be int in Argmax.
2017-03-28 16:50:34 +01:00
Luke Iwanski
a91417a7a5
Introduces align allocator for SYCL buffer
2017-03-20 14:48:54 +00:00
Benoit Steiner
f8a622ef3c
Merged eigen/eigen into default
2017-03-15 20:06:19 -07:00
Benoit Steiner
fd7db52f9b
Silenced compilation warning
2017-03-15 20:02:39 -07:00
Luke Iwanski
c06861d15e
Fixes bug in get_sycl_supported_devices() that was reporting unsupported Intel CPU on AMD platform - causing timeouts in that configuration
2017-03-15 19:26:08 +00:00
Benoit Steiner
f0f3591118
Made the reduction code compile with cuda-clang
2017-03-14 14:16:53 -07:00
Mehdi Goli
f499fe9496
Adding synchronisation to convolution kernel for sycl backend.
2017-03-13 09:18:37 +00:00
Rasmus Munk Larsen
bfd7bf9c5b
Get rid of Init().
2017-03-10 08:48:20 -08:00
Rasmus Munk Larsen
d56ab01094
Use C++11 ctor forwarding to simplify code a bit.
2017-03-10 08:30:22 -08:00
Rasmus Munk Larsen
344c2694a6
Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases.
...
* Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work stealing loop before yielding. This is wasteful for such ops that process I/O.
* This also changes the number of iterations through the steal loop to be inversely proportional to the number of threads. Since the time of each iteration is proportional to the number of threads, this yields roughly a constant spin time.
* Implement a separate worker loop for the num_threads == 1 case since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues it might reverse the order in which ops are executed, compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads the single thread pools tend to be used for.
* Store num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size().
2017-03-09 15:41:03 -08:00
Luke Iwanski
1b32a10053
Use name to distinguish name instead of the vendor
2017-03-08 18:26:34 +00:00
Gael Guennebaud
970ff78294
bug #1401 : fix compilation of "cond ? x : -x" with x an AutoDiffScalar
2017-03-08 16:16:53 +01:00
Mehdi Goli
5e9a1e7a7a
Adding sycl Benchmarks.
2017-03-08 14:17:48 +00:00
Mehdi Goli
e2e3f78533
Fixing potential race condition on sycl device.
2017-03-07 17:48:15 +00:00
Mehdi Goli
f84963ed95
Adding TensorIndexTuple and TensorTupleReduceOP backend (ArgMax/Min) for sycl; fixing the address space issue for const TensorMap; converting all discard_write to write due to data missmatch.
2017-03-07 14:27:10 +00:00
Benoit Steiner
a71943b9a4
Made the Tensor code compile with clang 3.9
2017-03-02 10:47:29 -08:00