Commit Graph

2815 Commits

Author SHA1 Message Date
Christoph Hertzberg
d86544d654 Reduce code duplication and avoid confusing Doxygen 2019-12-19 19:48:39 +01:00
Christoph Hertzberg
dde279f57d Hide recursive meta templates from Doxygen 2019-12-19 19:47:23 +01:00
Christoph Hertzberg
c21771ac04 Use double-braces initialization (as everywhere else in the test-suite). 2019-12-19 19:20:48 +01:00
Christoph Hertzberg
a3273aeff8 Fix trivial shadow warning 2019-12-19 19:13:11 +01:00
Eugene Zhulenev
7a65219a2e Fix TensorPadding bug in squeezed reads from inner dimension 2019-12-19 05:43:57 +00:00
Eugene Zhulenev
73e55525e5 Return const data pointer from TensorRef evaluator.data() 2019-12-18 23:19:36 +00:00
Eugene Zhulenev
ae07801dd8 Tensor block evaluation cost model 2019-12-18 20:07:00 +00:00
Jeff Daily
de07c4d1c2 fix compilation due to new HIP scalar accessor 2019-12-17 20:27:30 +00:00
Eugene Zhulenev
788bef6ab5 Reduce block evaluation overhead for small tensor expressions 2019-12-17 19:06:14 +00:00
Eugene Zhulenev
381f8f3139 Initialize non-trivially constructible types when allocating a temp buffer. 2019-12-12 01:31:30 +00:00
Eugene Zhulenev
64272c7f40 Squeeze reads from two inner dimensions in TensorPadding 2019-12-11 16:54:51 -08:00
Eugene Zhulenev
963ba1015b Add back accidentally deleted default constructor to TensorExecutorTilingContext. 2019-12-11 18:47:55 +00:00
Eugene Zhulenev
c9220c035f Remove block memory allocation required by removed block evaluation API 2019-12-10 17:15:55 -08:00
Eugene Zhulenev
1c879eb010 Remove V2 suffix from TensorBlock 2019-12-10 15:40:23 -08:00
Eugene Zhulenev
dbca11e880 Remove TensorBlock.h and old TensorBlock/BlockMapper 2019-12-10 14:31:44 -08:00
Deven Desai
c49f0d851a Fix for HIP breakage detected on 191210
The following commit introduces compile errors when running eigen with hipcc

2918f85ba9

hipcc errors out because it requies the device attribute on the methods within the TensorBlockV2ResourceRequirements struct instroduced by the commit above. The fix is to add the device attribute to those methods
2019-12-10 22:14:05 +00:00
Eugene Zhulenev
2918f85ba9 Do not use std::vector in getResourceRequirements 2019-12-09 16:19:55 -08:00
Artem Belevich
8056a05b54 Undo the block size change.
.z *is* used by the EigenContractionKernelInternal().
2019-12-09 11:10:29 -08:00
Eugene Zhulenev
dbb703d44e Add async evaluation support to TensorSelectOp 2019-12-09 18:36:13 +00:00
Janek Kozicki
11d6465326 fix AlignedVector3 inconsisent interface with other Vector classes, default constructor and operator- were missing. 2019-12-06 21:07:39 +01:00
Eugene Zhulenev
bb7ccac3af Add recursive work splitting to EvalShardedByInnerDimContext 2019-12-05 14:51:49 -08:00
Artem Belevich
25230d1862 Improve performance of contraction kernels
* Force-inline implementations. They pass around pointers to shared memory
  blocks. Without inlining compiler must operate via generic pointers.
  Inlining allows compiler to detect that we're operating on shared memory
  which allows generation of substantially faster code.

* Fixed a long-standing typo which resulted in launching 8x more kernels
  than we needed (.z dimension of the block is unused by the kernel).
2019-12-05 12:48:34 -08:00
Rasmus Munk Larsen
366cf005b0 Add missing initialization in cxx11_tensor_trace.cpp. 2019-12-04 23:56:37 +00:00
Eugene Zhulenev
8f4536e852 Capture TensorMap by value inside tensor expression AST 2019-12-03 16:39:05 -08:00
Rasmus Munk Larsen
4e696901f8 Remove __host__ annotation for device-only function. 2019-12-03 14:33:19 -08:00
Rasmus Munk Larsen
ead81559c8 Use EIGEN_DEVICE_FUNC macro instead of __device__. 2019-12-03 12:08:22 -08:00
Mehdi Goli
00f32752f7 [SYCL] Rebasing the SYCL support branch on top of the Einge upstream master branch.
* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
* Adding get_tensor operation which was missing in TensorContractionMapper.
* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
* Fixing typo in Device SYCL
* Unifying load to private register for tall/skinny no shared
* Unifying load to vector tile for tensor-vector/vector-tensor operation
* Removing all the LHS/RHS class for extracting data from global
* Removing Outputfunction from TensorContractionSkinnyNoshared.
* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
* Making double buffering optional for Tensor contraction when local memory is version is used.
* Modifying benchmark to accept custom Reduction Sizes
* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
* Adding Test for SYCL
* Modifying SYCL CMake
2019-11-28 10:08:54 +00:00
Eugene Zhulenev
5496d0da0b Add async evaluation support to TensorReverse 2019-11-26 15:02:24 -08:00
Eugene Zhulenev
bc66c88255 Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling 2019-11-26 11:41:57 -08:00
Hans Johnson
8c8cab1afd STYLE: Convert CMake-language commands to lower case
Ancient CMake versions required upper-case commands.  Later command names
became case-insensitive.  Now the preferred style is lower-case.
2019-10-31 11:36:37 -05:00
Hans Johnson
6fb3e5f176 STYLE: Remove CMake-language block-end command arguments
Ancient versions of CMake required else(), endif(), and similar block
termination commands to have arguments matching the command starting the block.
This is no longer the preferred style.
2019-10-31 11:36:27 -05:00
Gael Guennebaud
c3f6fcf2c0 bug #1747: one more fix for MSVC regarding the Bessel implementation. 2019-11-15 11:12:35 +01:00
Gael Guennebaud
b9837ca9ae bug #1281: fix AutoDiffScalar's make_coherent for nested expression of constant ADs. 2019-11-14 14:58:08 +01:00
Eugene Zhulenev
13c3327f5c Remove legacy block evaluation support 2019-11-12 10:12:28 -08:00
Rasmus Munk Larsen
0ed0338593 Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs. 2019-11-11 12:26:41 -08:00
Eugene Zhulenev
c952b8dfda Break loop dependence in TensorGenerator block access 2019-11-11 10:32:57 -08:00
Rasmus Munk Larsen
ebf04fb3e8 Fix data race in css11_tensor_notification test. 2019-11-08 17:44:50 -08:00
Rasmus Munk Larsen
cc3d0e6a40 Add EIGEN_HAS_INTRINSIC_INT128 macro
Add a new EIGEN_HAS_INTRINSIC_INT128 macro, and use this instead of __SIZEOF_INT128__. This fixes related issues with TensorIntDiv.h when building with Clang for Windows, where support for 128-bit integer arithmetic is advertised but broken in practice.
2019-11-06 14:24:33 -08:00
Rasmus Munk Larsen
ee404667e2 Rollback or PR-746 and partial rollback of 668ab3fc47
.

std::array is still not supported in CUDA device code on Windows.
2019-11-05 17:17:58 -08:00
Rasmus Larsen
0c9745903a Merged in ezhulenev/eigen-01 (pull request PR-746)
Remove internal::smart_copy and replace with std::copy
2019-11-04 20:18:38 +00:00
Eugene Zhulenev
73ecb2c57d Cleanup includes in Tensor module after switch to C++11 and above 2019-10-29 15:49:54 -07:00
Eugene Zhulenev
e7ed4bd388 Remove internal::smart_copy and replace with std::copy 2019-10-29 11:25:24 -07:00
Eugene Zhulenev
fbc0a9a3ec Fix CXX11Meta compilation with MSVC 2019-10-28 18:30:10 -07:00
Eugene Zhulenev
bd864ab42b Prevent potential ODR in TensorExecutor 2019-10-28 15:45:09 -07:00
Mehdi Goli
6332aff0b2 This PR fixes:
* The specialization of array class in the different namespace for GCC<=6.4
* The implicit call to `std::array` constructor using the initializer list for GCC <=6.1
2019-10-23 15:56:56 +01:00
Rasmus Larsen
8e4e29ae99 Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738)
Fix for the HIP build+test errors.
2019-10-22 22:18:38 +00:00
Rasmus Munk Larsen
97c0c5d485 Add block evaluation V2 to TensorAsyncExecutor.
Add async evaluation to a number of ops.
2019-10-22 12:42:44 -07:00
Deven Desai
102cf2a72d Fix for the HIP build+test errors.
The errors were introduced by this commit :

After the above mentioned commit, some of the tests started failing with the following error


```
Built target cxx11_tensor_reduction
Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible
    DestinationBufferKind m_kind;
    ^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible
  DestinationBuffer m_destination;
  ^
```


For some reason HIPCC does not like device code to contain enum types which do not have the base-type explicitly declared. The fix is trivial, explicitly state "int" as the basetype
2019-10-22 19:21:27 +00:00
Rasmus Munk Larsen
668ab3fc47 Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers. 2019-10-18 16:42:00 -07:00
Eugene Zhulenev
df0e8b8137 Propagate block evaluation preference through rvalue tensor expressions 2019-10-17 11:17:33 -07:00