Commit Graph

1056 Commits

Author SHA1 Message Date
Benoit Steiner
e23c8c294e Use actual types instead of the auto keyword to make the code more portable 2018-08-16 10:41:01 -07:00
Sameer Agarwal
f197c3f55b Removed an unused variable (PacketSize) from TensorExecutor 2018-08-15 11:24:57 -07:00
Benoit Steiner
4181556907 Fixed the tensor contraction code. 2018-08-15 09:34:47 -07:00
Benoit Steiner
fbb834144d Fixed more compilation errors 2018-08-15 08:52:58 -07:00
Benoit Steiner
ab3f481141 Cleaned up the code and made it compile with more compilers 2018-08-14 14:05:46 -07:00
Benoit Steiner
59bba77ead Fixed compilation errors with gcc 4.7 and 4.8 2018-08-14 10:54:48 -07:00
Benoit Steiner
501be70b27 Code cleanup 2018-08-13 15:16:40 -07:00
Gael Guennebaud
3ec60215df Merged in rmlarsen/eigen2 (pull request PR-466)
Move sigmoid functor to core and rename it to 'logistic'.
2018-08-13 21:28:20 +00:00
Rasmus Munk Larsen
0f1b2e08a5 Call logistic functor from Tensor::sigmoid. 2018-08-13 11:52:58 -07:00
Benoit Steiner
26239ee580 Use NULL instead of nullptr to avoid adding a cxx11 requirement. 2018-08-13 11:05:51 -07:00
Benoit Steiner
3810ec228f Don't use the auto keyword since it's not always supported properly. 2018-08-13 10:46:09 -07:00
Benoit Steiner
e6d5be811d Fixed syntax of nested template chevrons to make it compatible with c++98 mode. 2018-08-13 10:29:21 -07:00
Benoit Steiner
c8ea398675 Avoided language features that are only available in cxx11 mode. 2018-08-10 13:02:41 -07:00
Benoit Steiner
4be4286224 Made the code compile with gcc 5.4. 2018-08-10 11:32:58 -07:00
Benoit Steiner
131ed1191f Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462)
Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.
2018-08-08 18:14:15 +00:00
Mehdi Goli
532a0be05c Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation. 2018-08-08 12:12:26 +01:00
Rasmus Munk Larsen
693fb1d41e Fix init order. 2018-08-07 17:18:51 -07:00
Benoit Steiner
10d286f55b Silenced a couple of compilation warnings. 2018-08-06 16:00:29 -07:00
Benoit Steiner
d011d05fd6 Fixed compilation errors. 2018-08-06 13:40:51 -07:00
Rasmus Munk Larsen
36e7e7dd8f Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation. 2018-08-06 13:16:32 -07:00
Rasmus Munk Larsen
fa68342ef8 Move sigmoid functor to core. 2018-08-03 17:31:23 -07:00
Rasmus Munk Larsen
bcb29f890c Fix initialization order. 2018-08-03 10:18:53 -07:00
Mehdi Goli
3074b1ff9e Fixing the compilation error. 2018-08-03 17:13:44 +01:00
Mehdi Goli
01358300d5 Creating separate SYCL required PR for uncontroversial files. 2018-08-03 16:59:15 +01:00
Benoit Steiner
93b9e36e10 Merged in paultucker/eigen (pull request PR-431)
Optional ThreadPoolDevice allocator

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-08-01 19:14:34 +00:00
Benoit Steiner
17221115c9 Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447)
Adding variadic version of assert which can take a parameter pack as its input.
2018-08-01 16:41:54 +00:00
Benoit Steiner
0360c36170 Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446)
Distinguishing internal memory allocation/deallocation from explicit user memory allocation/deallocation.
2018-08-01 16:13:15 +00:00
Mehdi Goli
c6a5c70712 Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h 2018-08-01 16:56:26 +01:00
Benoit Steiner
45f75f1ace Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449)
Enabling per device specialisation of packetSize.
2018-08-01 15:43:03 +00:00
Mehdi Goli
af96018b49 Using the suggested modification. 2018-08-01 16:04:44 +01:00
Mehdi Goli
b512a9536f Enabling per device specialisation of packetsize. 2018-08-01 13:39:13 +01:00
Mehdi Goli
3a197a60e6 variadic version of assert which can take a parameter pack as its input. 2018-08-01 12:19:14 +01:00
Mehdi Goli
d7a8414848 Distinguishing internal memory allocation/deallocation from explicit user memory allocation/deallocation. 2018-08-01 11:56:30 +01:00
Mehdi Goli
9e219bb3d3 Converting ad-hoc inline keyword to the EIGEN_STRONG_INLINE macro. 2018-08-01 10:47:49 +01:00
Benoit Steiner
edf46bd7a2 Merged in yuefengz/eigen (pull request PR-370)
Use device's allocate function instead of internal::aligned_malloc.
2018-07-31 22:38:28 +00:00
Paul Tucker
385f7b8d0c Change getAllocator() to allocator() in ThreadPoolDevice. 2018-07-31 13:52:18 -07:00
Gael Guennebaud
678a0dcb12 Merged in ezhulenev/eigen/tiling_3 (pull request PR-438)
Tiled tensor executor
2018-07-31 08:13:00 +00:00
Gael Guennebaud
679eece876 Speed up trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437. 2018-07-31 10:10:14 +02:00
Eugene Zhulenev
966c2a7bb6 Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible 2018-07-27 12:45:17 -07:00
Eugene Zhulenev
6913221c43 Add tiled evaluation support to TensorExecutor 2018-07-25 13:51:10 -07:00
Rasmus Munk Larsen
e478532625 Reduce the number of template specializations of classes related to tensor contraction to reduce binary size. 2018-07-27 12:36:34 -07:00
Eugene Zhulenev
d55efa6f0f TensorBlockIO 2018-07-23 15:50:55 -07:00
Eugene Zhulenev
34a75c3c5c Initial support of TensorBlock 2018-07-20 17:37:20 -07:00
Paul Tucker
d4afccde5a Add test coverage for ThreadPoolDevice optional allocator. 2018-07-19 17:43:44 -07:00
Eugene Zhulenev
c58b874727 PR430: Convert count to the reducer type in MeanReducer
Without explicit conversion, TensorFlow fails to compile because pset1 template deduction fails:

cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this)
  ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_'
(type 'const DenseIndex {aka const long int}')
to type 'const type& {aka const Eigen::half&}'
     return pdiv(vaccum, pset1<Packet>(packetCount_));
Honestly, I’m not sure why it works in the Eigen tests, since the Eigen::half constructor is explicit, or why it stopped working in TF; I didn’t find any relevant changes since the previous Eigen upgrade.

Using static_cast<T>(packetCount_) instead breaks the cxx11_tensor_reductions test for Eigen::half, which is also quite surprising.
2018-07-19 17:37:03 -07:00
Paul Tucker
4e9848fa86 Actually add optional Allocator* arg to ThreadPoolDevice(). 2018-07-16 17:53:36 -07:00
Paul Tucker
b3e7c9132d Add optional Allocator argument to ThreadPoolDevice constructor.
When supplied, this allocator will be used in place of
internal::aligned_malloc.  This permits e.g. use of a NUMA-node specific
allocator where the thread-pool is also restricted to a single NUMA-node.
2018-07-16 17:26:05 -07:00
Rasmus Munk Larsen
3a9cf4e290 Get rid of alias for m_broadcast. 2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen
4222550e17 Optimize the case where broadcasting is a no-op. 2018-07-13 16:12:38 -07:00
Gael Guennebaud
06eb24cf4d Introduce gpu_assert for assertions in device code, and disable them with clang-cuda. 2018-07-13 16:04:27 +02:00