Yangzihao Wang
3122477c86
Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
2017-12-12 11:15:24 -08:00
Benoit Steiner
a6d875bac8
Removed unecesasry #include
2017-10-22 08:12:45 -07:00
Gael Guennebaud
a91918a105
Merged in infinitei/eigen (pull request PR-328)
...
bug #1464 : Fixes construction of EulerAngles from 3D vector expression.
Approved-by: Tal Hadad <tal_hd@hotmail.com>
Approved-by: Abhijit Kundu <abhijit.kundu@gatech.edu>
2017-09-06 08:42:14 +00:00
Benoit Steiner
a4089991eb
Added support for CUDA 9.0.
2017-08-31 02:49:39 +00:00
Abhijit Kundu
6d991a9595
bug #1464 : Fixes construction of EulerAngles from 3D vector expression.
2017-08-30 13:26:30 -04:00
Gael Guennebaud
304ef29571
Handle min/max/inf/etc issue in cuda_fp16.h directly in test/main.h
2017-08-24 11:26:41 +02:00
Gael Guennebaud
21633e585b
bug #1462 : remove all occurences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER
2017-08-24 11:06:47 +02:00
Benoit Steiner
c5a241ab9b
Merged in benoitsteiner/opencl (pull request PR-323)
...
Improved support for OpenCL
2017-07-07 16:27:33 +00:00
Benoit Steiner
9daed67952
Merged in tntnatbry/eigen (pull request PR-319)
...
Tensor Trace op
2017-07-07 04:18:03 +00:00
Benoit Steiner
62b4634ebe
Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14)
...
Applying Benoit's comment for Fixing ImageVolumePatch.
* Applying Benoit's comment for Fixing ImageVolumePatch. Fixing conflict on cmake file.
* Fixing dealocation of the memory in ImagePatch test for SYCL.
* Fixing the automerge issue.
2017-07-06 05:08:13 +00:00
Benoit Steiner
b8e805497e
Merged in benoitsteiner/opencl (pull request PR-318)
...
Improved support for OpenCL
2017-06-13 05:01:10 +00:00
Gael Guennebaud
8640093af1
fix compilation in C++98
2017-06-09 12:45:01 +02:00
Benoit Steiner
9dee55ec33
Merged eigen/eigen into default
2017-05-26 09:01:04 -07:00
a-doumoulakis
7a8ba565f8
Merge changed from upstream
2017-05-24 17:45:29 +01:00
Mmanu Chaturvedi
2971503fed
Specializing numeric_limits For AutoDiffScalar
2017-05-23 17:12:36 -04:00
Mehdi Goli
61d7f3664a
Fixing Cmake Dependency for SYCL
2017-05-22 14:58:28 +01:00
a-doumoulakis
052426b824
Add support for triSYCL
...
Eigen is now able to use triSYCL with EIGEN_SYCL_TRISYCL and TRISYCL_INCLUDE_DIR options
Fix contraction kernel with correct nd_item dimension
2017-05-05 19:26:27 +01:00
RJ Ryan
949a2da38c
Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer.
...
Improves support for std::complex types when compiling for CUDA.
Expands on e2e9cdd169
and 2bda1b0d93
.
2017-04-14 13:23:35 -07:00
Benoit Steiner
068cc09708
Preserve file naming conventions
2017-04-04 10:09:10 -07:00
Mehdi Goli
bd64ee8555
Fixing TensorArgMaxSycl.h; Removing warning related to the hardcoded type of dims to be int in Argmax.
2017-03-28 16:50:34 +01:00
Benoit Steiner
f8a622ef3c
Merged eigen/eigen into default
2017-03-15 20:06:19 -07:00
Luke Iwanski
9597d6f6ab
Temporary: Disables cxx11_tensor_argmax_sycl test since it is causing zombie thread
2017-03-15 19:28:09 +00:00
Rasmus Munk Larsen
344c2694a6
Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases.
...
* Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work stealing loop before yielding. This is wasteful for such ops that process I/O.
* This also changes the number of iterations through the steal loop to be inversely proportional to the number of threads. Since the time of each iteration is proportional to the number of threads, this yields roughly a constant spin time.
* Implement a separate worker loop for the num_threads == 1 case since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues it might reverse the order in which ops are executed, compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads the single thread pools tend to be used for.
* Store num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size().
2017-03-09 15:41:03 -08:00
Mehdi Goli
f84963ed95
Adding TensorIndexTuple and TensorTupleReduceOP backend (ArgMax/Min) for sycl; fixing the address space issue for const TensorMap; converting all discard_write to write due to data missmatch.
2017-03-07 14:27:10 +00:00
Mehdi Goli
8296b87d7b
Adding sycl backend for TensorCustomOp; fixing the partial lhs modification issue on sycl when the rhs is TensorContraction, reduction or convolution; Fixing the partial modification for memset when sycl backend is used.
2017-02-28 17:16:14 +00:00
Mehdi Goli
2fa2b617a9
Adding TensorVolumePatchOP.h for sycl
2017-02-24 19:16:24 +00:00
Mehdi Goli
89dfd51fae
Adding Sycl Backend for TensorGenerator.h.
2017-02-22 16:36:24 +00:00
Mehdi Goli
4f07ac16b0
Reducing the number of warnings.
2017-02-21 10:09:47 +00:00
Mehdi Goli
79ebc8f761
Adding Sycl backend for TensorImagePatchOP.h; adding Sycl backend for TensorInflation.h.
2017-02-20 12:11:05 +00:00
Mehdi Goli
91982b91c0
Adding TensorLayoutSwapOp for sycl.
2017-02-15 16:28:12 +00:00
Mehdi Goli
b1e312edd6
Adding TensorPatch.h for sycl backend.
2017-02-15 10:13:01 +00:00
Mehdi Goli
0d153ded29
Adding TensorChippingOP for sycl backend; fixing the index value in the verification operation for cxx11_tensorChipping.cpp test
2017-02-13 17:25:12 +00:00
Mehdi Goli
0ee97b60c2
Adding mean to TensorReductionSycl.h
2017-02-07 15:43:17 +00:00
Mehdi Goli
42bd5c4e7b
Fixing TensorReductionSycl for min and max.
2017-02-06 18:05:23 +00:00
Mehdi Goli
ff53050034
Converting ptrdiff_t type to int64_t type in cxx11_tensor_contract_sycl.cpp in order to be the same as other tests.
2017-02-01 15:36:03 +00:00
Mehdi Goli
bab29936a1
Reducing warnings in Sycl backend.
2017-02-01 15:29:53 +00:00
Benoit Steiner
fbc39fd02c
Merge latest changes from upstream
2017-01-30 15:25:57 -08:00
Rasmus Munk Larsen
5e144bbaa4
Make NaN propagatation consistent between the pmax/pmin and std::max/std::min. This makes the NaN propagation consistent between the scalar and vectorized code paths of Eigen's scalar_max_op and scalar_min_op.
...
See #1373 for details.
2017-01-24 13:32:50 -08:00
Mehdi Goli
6bdd15f572
Adding non-deferrenciable pointer track for ComputeCpp backend; Adding TensorConvolutionOp for ComputeCpp; fixing typos. modifying TensorDeviceSycl to use the LegacyPointer class.
2017-01-19 11:30:59 +00:00
Mehdi Goli
e46e722381
Adding Tensor ReverseOp; TensorStriding; TensorConversionOp; Modifying Tensor Contractsycl to be located in any place in the expression tree.
2017-01-16 13:58:49 +00:00
Benoit Steiner
0657228569
Simplified the way we link libxsmm
2016-12-21 14:40:08 -08:00
Benoit Steiner
c19fe5e9ed
Added support for libxsmm in the eigen makefiles
2016-12-21 10:43:40 -08:00
Benoit Steiner
0f577d4744
Merged eigen/eigen into default
2016-12-20 17:02:06 -08:00
Gael Guennebaud
e8d6862f14
Properly adjust precision when saving to Market format.
2016-12-20 22:10:33 +01:00
Gael Guennebaud
e2f4ee1c2b
Speed up parsing of sparse Market file.
2016-12-20 21:56:21 +01:00
Benoit Steiner
548ed30a1c
Added an OpenCL regression test
2016-12-19 18:56:26 -08:00
Benoit Steiner
27ceb43bf6
Fixed race condition in the tensor_shuffling_sycl test
2016-12-19 15:34:42 -08:00
Mehdi Goli
35bae513a0
Converting all parallel for lambda to functor in order to prevent kernel duplication name error; adding tensorConcatinationOp backend for sycl.
2016-12-16 19:46:45 +00:00
Benoit Steiner
9ff5d0f821
Merged eigen/eigen into default
2016-12-14 17:32:16 -08:00
Mehdi Goli
730eb9fe1c
Adding asynchronous execution as it improves the performance.
2016-12-14 17:38:53 +00:00