Commit Graph

2768 Commits

Author SHA1 Message Date
Rasmus Munk Larsen
8e2be7777e Merged eigen/eigen into default 2018-09-20 11:41:15 -07:00
Rasmus Munk Larsen
5d2e759329 Initialize BlockIteratorState in a C++03 compatible way. 2018-09-20 11:40:43 -07:00
Gael Guennebaud
e04faca930 merge 2018-09-20 18:33:54 +02:00
Gael Guennebaud
d37188b9c1 Fix MPrealSupport 2018-09-20 18:30:10 +02:00
Gael Guennebaud
3c6dc93f99 Fix GPU support. 2018-09-20 18:29:21 +02:00
Gael Guennebaud
9419f506d0 Fix regression introduced by the previous fix for AVX512.
It brokes the complex-complex case on SSE.
2018-09-20 17:32:34 +02:00
Christoph Hertzberg
a0166ab651 Workaround for spurious "array subscript is above array bounds" warnings with g++4.x 2018-09-20 17:08:43 +02:00
Christoph Hertzberg
c50250cb24 Avoid warning "suggest braces around initialization of subobject".
This test is not run in C++03 mode, so no compatibility is lost.
2018-09-20 17:03:42 +02:00
Gael Guennebaud
71496b0e25 Fix gebp kernel for real+complex in case only reals are vectorized (e.g., AVX512).
This commit also removes "half-packet" from data-mappers: it was not used and conceptually broken anyways.
2018-09-20 17:01:24 +02:00
Rasmus Munk Larsen
44d8274383 Cast to longer type. 2018-09-19 13:31:42 -07:00
Rasmus Munk Larsen
d638b62dda Silence compiler warning. 2018-09-19 13:27:55 -07:00
Rasmus Munk Larsen
db9c9df59a Silence more compiler warnings. 2018-09-19 11:50:27 -07:00
Rasmus Munk Larsen
febd09dcc0 Silence compiler warnings in ThreadPoolInterface.h. 2018-09-19 11:11:04 -07:00
luz.paz"
f67b19a884 [PATCH 1/2] Misc. typos
From 68d431b4c14ad60a778ee93c1f59ecc4b931950e Mon Sep 17 00:00:00 2001
Found via `codespell -q 3 -I ../eigen-word-whitelist.txt` where the whitelists consists of:
```
als
ans
cas
dum
lastr
lowd
nd
overfl
pres
preverse
substraction
te
uint
whch
```
---
 CMakeLists.txt                                | 26 +++++++++----------
 Eigen/src/Core/GenericPacketMath.h            |  2 +-
 Eigen/src/SparseLU/SparseLU.h                 |  2 +-
 bench/bench_norm.cpp                          |  2 +-
 doc/HiPerformance.dox                         |  2 +-
 doc/QuickStartGuide.dox                       |  2 +-
 .../Eigen/CXX11/src/Tensor/TensorChipping.h   |  6 ++---
 .../Eigen/CXX11/src/Tensor/TensorDeviceGpu.h  |  2 +-
 .../src/Tensor/TensorForwardDeclarations.h    |  4 +--
 .../src/Tensor/TensorGpuHipCudaDefines.h      |  2 +-
 .../Eigen/CXX11/src/Tensor/TensorReduction.h  |  2 +-
 .../CXX11/src/Tensor/TensorReductionGpu.h     |  2 +-
 .../test/cxx11_tensor_concatenation.cpp       |  2 +-
 unsupported/test/cxx11_tensor_executor.cpp    |  2 +-
 14 files changed, 29 insertions(+), 29 deletions(-)
2018-09-18 04:15:01 -04:00
Eugene Zhulenev
c4627039ac Support static dimensions (aka IndexList) in Tensor::resize(...) 2018-09-18 14:25:21 -07:00
Eugene Zhulenev
218a7b9840 Enable DSizes type promotion with c++03 compilers 2018-09-18 10:57:00 -07:00
Ravi Kiran
1f0c941c3d Collapsed revision
* Merged eigen/eigen into default
2018-09-17 18:29:12 -07:00
Rasmus Munk Larsen
03a88c57e1 Merged in ezhulenev/eigen-02 (pull request PR-498)
Add DSizes index type promotion
2018-09-17 21:58:38 +00:00
Rasmus Munk Larsen
5ca0e4a245 Merged in ezhulenev/eigen-01 (pull request PR-497)
Fix warnings in IndexList array_prod
2018-09-17 20:15:06 +00:00
Eugene Zhulenev
a5cd4e9ad1 Replace deprecated Eigen::DenseIndex with Eigen::Index in TensorIndexList 2018-09-17 10:58:07 -07:00
Gael Guennebaud
b311bfb752 bug #1596: fix inclusion of Eigen's header within unsupported modules. 2018-09-17 09:54:29 +02:00
Gael Guennebaud
72f19c827a typo 2018-09-16 22:10:34 +02:00
Eugene Zhulenev
66f056776f Add DSizes index type promotion 2018-09-15 15:17:38 -07:00
Eugene Zhulenev
f313126dab Fix warnings in IndexList array_prod 2018-09-15 13:47:54 -07:00
Christoph Hertzberg
42705ba574 Fix weird error for building with g++-4.7 in C++03 mode. 2018-09-15 12:43:41 +02:00
Rasmus Munk Larsen
c2383f95af Merged in ezhulenev/eigen/fix_dsizes (pull request PR-494)
Fix DSizes IndexList constructor
2018-09-15 02:36:19 +00:00
Rasmus Munk Larsen
30290cdd56 Merged in ezhulenev/eigen/moar_eigen_fixes_3 (pull request PR-493)
Const cast scalar pointer in TensorSlicingOp evaluator

Approved-by: Sameer Agarwal <sameeragarwal@google.com>
2018-09-15 02:35:07 +00:00
Eugene Zhulenev
f7d0053cf0 Fix DSizes IndexList constructor 2018-09-14 19:19:13 -07:00
Rasmus Munk Larsen
601e289d27 Merged in ezhulenev/eigen/moar_eigen_fixes_1 (pull request PR-492)
Explicitly construct tensor block dimensions from evaluator dimensions
2018-09-15 01:36:21 +00:00
Eugene Zhulenev
71070a1e84 Const cast scalar pointer in TensorSlicingOp evaluator 2018-09-14 17:17:50 -07:00
Eugene Zhulenev
4863375723 Explicitly construct tensor block dimensions from evaluator dimensions 2018-09-14 16:55:05 -07:00
Rasmus Munk Larsen
14e35855e1 Merged in chtz/eigen-maxsizevector (pull request PR-490)
Let MaxSizeVector respect alignment of objects

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
2018-09-14 23:29:24 +00:00
Eugene Zhulenev
1b8d70a22b Support reshaping with static shapes and dimensions conversion in tensor broadcasting 2018-09-14 15:25:27 -07:00
Christoph Hertzberg
007f165c69 bug #1598: Let MaxSizeVector respect alignment of objects and add a unit test
Also revert 8b3d9ed081
2018-09-14 20:21:56 +02:00
Rasmus Munk Larsen
9b864cdb37 Merged in rmlarsen/eigen3 (pull request PR-480)
Avoid compilation error in C++11 test when EIGEN_AVOID_STL_ARRAY is set.
2018-09-14 00:05:09 +00:00
Rasmus Munk Larsen
d0eef5fe6c Don't use bracket syntax in ctor. 2018-09-13 17:04:05 -07:00
Rasmus Munk Larsen
6313dde390 Fix merge error. 2018-09-13 16:42:05 -07:00
Rasmus Munk Larsen
0db590d22d Backed out changeset 01197e4452 2018-09-13 16:20:57 -07:00
Rasmus Munk Larsen
b3f4c067d9 Merge 2018-09-13 16:18:52 -07:00
Rasmus Munk Larsen
2b07018140 Enable vectorized version on GPUs. The underlying bug has been fixed. 2018-09-13 16:12:22 -07:00
Rasmus Munk Larsen
53568e3549 Merged in ezhulenev/eigen/tiled_evalution_support (pull request PR-444)
Tiled evaluation for Tensor ops

Approved-by: Rasmus Munk Larsen <rmlarsen@google.com>
Approved-by: Gael Guennebaud <g.gael@free.fr>
2018-09-13 22:05:47 +00:00
Eugene Zhulenev
01197e4452 Fix warnings 2018-09-13 15:03:36 -07:00
Gael Guennebaud
7f3b17e403 MSVC 2015 supports c++11 thread-local-storage 2018-09-13 18:15:07 +02:00
Eugene Zhulenev
d138fe341d Fis static_assert in test to conform c++11 standard 2018-09-11 17:23:18 -07:00
Rasmus Munk Larsen
e289f44c56 Don't vectorize the MeanReducer unless pdiv is available. 2018-09-11 14:09:00 -07:00
Eugene Zhulenev
55bb7e7935 Merge with upstream eigen/default 2018-09-11 13:33:06 -07:00
Eugene Zhulenev
81b38a155a Fix compilation of tiled evaluation code with c++03 2018-09-11 13:32:32 -07:00
Rasmus Munk Larsen
46f88fc454 Use numerically stable tree reduction in TensorReduction. 2018-09-11 10:08:10 -07:00
Rasmus Munk Larsen
3d057e0453 Avoid compilation error in C++11 test when EIGEN_AVOID_STL_ARRAY is set. 2018-09-06 12:59:36 -07:00
Alexey Frunze
edeee16a16 Fix build failures in matrix_power and matrix_exponential tests.
This fixes the static assertion complaining about double being
used in place of long double. This happened on MIPS32, where
double and long double have the same type representation.
This can be simulated on x86 as well if we pass -mlong-double-64
to g++.
2018-08-31 14:11:10 -07:00
Deven Desai
c64fe9ea1f Updates to fix HIP-clang specific compile errors.
Compiling the eigen unittests with hip-clang (HIP with clang as the underlying compiler instead of hcc or nvcc), results in compile errors. The changes in this commit fix those compile errors. The main change is to convert a few instances of "__device__" to "EIGEN_DEVICE_FUNC"
2018-08-30 20:22:16 +00:00
Rasmus Munk Larsen
8b3d9ed081 Use padding instead of alignment attribute, which MaxSizeVector does not respect. This leads to undefined behavior and hard-to-trace bugs. 2018-09-05 11:20:06 -07:00
Christoph Hertzberg
ba2c8efdcf EIGEN_UNUSED is not supported by g++4.7 (and not portable) 2018-09-12 11:49:10 +02:00
Christoph Hertzberg
ff4e835d6b "sparse_product.cpp" must be included before "sparse_basic.cpp", otherwise EIGEN_SPARSE_CREATE_TEMPORARY_PLUGIN has no effect 2018-08-30 20:10:11 +02:00
Christoph Hertzberg
023ed6b9a8 Product of empty array must be 1 and not 0. 2018-08-30 17:14:52 +02:00
Christoph Hertzberg
c2f4e8c08e Fix integer conversion warning 2018-08-30 17:12:53 +02:00
Deven Desai
946c3e2544 adding EIGEN_DEVICE_FUNC attribute to fix some GPU unit tests that are broken in HIP mode 2018-08-27 23:04:08 +00:00
Christoph Hertzberg
20ba2eee6d gcc thinks this may not be initialized 2018-08-28 18:33:24 +02:00
Christoph Hertzberg
73ca600bca Fix numerous shadow-warnings for GCC<=4.8 2018-08-28 18:32:39 +02:00
Christoph Hertzberg
42f3ee4fb8 Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop
Workaround: Don't include "DisableStupidWarnings.h" before including other main-headers
2018-08-28 11:44:15 +02:00
Eugene Zhulenev
c144bb355b Merge with upstream eigen/default 2018-08-27 14:34:07 -07:00
Christoph Hertzberg
b1653d1599 Fix some trivial C++11 vs C++03 compatibility warnings 2018-08-25 12:21:00 +02:00
Christoph Hertzberg
42123ff38b Make unit test C++03 compatible 2018-08-25 11:53:28 +02:00
Christoph Hertzberg
117bc5d505 Fix some shadow warnings 2018-08-25 09:06:08 +02:00
Christoph Hertzberg
f155e97adb Previous fix broke compilation for clang 2018-08-25 00:10:46 +02:00
Christoph Hertzberg
209b4972ec Fix conversion warning 2018-08-25 00:02:46 +02:00
Christoph Hertzberg
495f6c3c3a Fix missing-braces warnings 2018-08-24 23:56:13 +02:00
Christoph Hertzberg
5aaedbeced Fixed more sign-compare and type-limits warnings 2018-08-24 23:54:12 +02:00
Christoph Hertzberg
8295f02b36 Hide "maybe uninitialized" warning on gcc 2018-08-24 23:22:20 +02:00
Christoph Hertzberg
f7675b826b Fix several integer conversion and sign-compare warnings 2018-08-24 22:58:55 +02:00
Rasmus Munk Larsen
744e2fe0de Address comments about EIGEN_THREAD_LOCAL. 2018-08-24 10:24:54 -07:00
Rasmus Munk Larsen
8d9bc5cc02 Fix g++ compilation. 2018-08-23 13:06:39 -07:00
Rasmus Munk Larsen
e9f9d70611 Don't rely on __had_feature for g++.
Don't use __thread.
Only use thread_local for gcc 4.8 or newer.
2018-08-23 12:59:46 -07:00
Rasmus Munk Larsen
668690978f Pad PerThread when we emulate thread_local to prevent false sharing. 2018-08-23 12:54:33 -07:00
Rasmus Munk Larsen
6cedc5a9b3 rename mu. 2018-08-23 12:11:58 -07:00
Rasmus Munk Larsen
6e0464004a Store std::unique_ptr instead of raw pointers in per_thread_map_. 2018-08-23 12:10:08 -07:00
Rasmus Munk Larsen
e51d9e473a Protect #undef max with #ifdef max. 2018-08-23 11:42:05 -07:00
Rasmus Munk Larsen
d35880ed91 merge 2018-08-23 11:36:49 -07:00
Christoph Hertzberg
a709c8efb4 Replace pointers by values or unique_ptr for better leak-safety 2018-08-23 19:41:59 +02:00
Christoph Hertzberg
39335cf51e Make MaxSizeVector leak-safe 2018-08-23 19:37:56 +02:00
Benoit Steiner
ff8e0ecc2f Updated one more line of code to avoid making the test dependent on cxx11 features. 2018-08-17 15:15:52 -07:00
Benoit Steiner
43d9dd9b28 Removed more dependencies on cxx11. 2018-08-17 08:49:32 -07:00
Christoph Hertzberg
4713465eef Silence double-promotion warning 2018-08-17 16:39:43 +02:00
Christoph Hertzberg
c9b25fbefa Silence unused parameter warning 2018-08-17 16:28:28 +02:00
Christoph Hertzberg
dbdeceabdd Silence double-promotion warning (when converting double to complex<long double>) 2018-08-17 16:26:11 +02:00
Benoit Steiner
19df4d5752 Merged in codeplaysoftware/eigen-upstream-pure/Pointer_type_creation (pull request PR-461)
Creating a pointer type in TensorCustomOp.h
2018-08-16 18:28:33 +00:00
Benoit Steiner
f641cf1253 Adding missing at method in Eigen::array 2018-08-16 11:24:37 -07:00
Benoit Steiner
ede580ccda Avoid using the auto keyword to make the tensor block access test more portable 2018-08-16 10:49:47 -07:00
Benoit Steiner
e23c8c294e Use actual types instead of the auto keyword to make the code more portable 2018-08-16 10:41:01 -07:00
Mehdi Goli
80f1a76dec removing the noises. 2018-08-16 13:33:24 +01:00
Mehdi Goli
d0b01ebbf6 Reverting the unitended delete from the code. 2018-08-16 13:21:36 +01:00
Mehdi Goli
161dcbae9b Using PointerType struct and specializing it per device for TensorCustomOp.h 2018-08-16 00:07:02 +01:00
Sameer Agarwal
f197c3f55b Removed an used variable (PacketSize) from TensorExecutor 2018-08-15 11:24:57 -07:00
Benoit Steiner
4181556907 Fixed the tensor contraction code. 2018-08-15 09:34:47 -07:00
Benoit Steiner
b6f96cf7dd Removed dependencies on cxx11 language features from the tensor_block_access test 2018-08-15 08:54:31 -07:00
Benoit Steiner
fbb834144d Fixed more compilation errors 2018-08-15 08:52:58 -07:00
Benoit Steiner
6bb3f1b43e Made the tensor_block_access test compile again 2018-08-14 14:26:59 -07:00
Benoit Steiner
43ec0082a6 Made the kronecker_product test compile again 2018-08-14 14:08:36 -07:00
Benoit Steiner
ab3f481141 Cleaned up the code and make it compile with more compilers 2018-08-14 14:05:46 -07:00
Rasmus Munk Larsen
fa0bcbf230 merge 2018-08-14 12:18:31 -07:00
Rasmus Munk Larsen
15d4f515e2 Use plain_assert in destructors to avoid throwing in CXX11 tests where main.h owerwrites eigen_assert with a throwing version. 2018-08-14 12:17:46 -07:00
Rasmus Munk Larsen
aebdb06424 Fix a few compiler warnings in CXX11 tests. 2018-08-14 12:06:39 -07:00
Rasmus Munk Larsen
2a98bd9c8e Merged eigen/eigen into default 2018-08-14 12:02:09 -07:00
Benoit Steiner
59bba77ead Fixed compilation errors with gcc 4.7 and 4.8 2018-08-14 10:54:48 -07:00
Mehdi Goli
8ba799805b Merge with upstream 2018-08-14 09:43:45 +01:00
Rasmus Munk Larsen
6d6e7b7027 merge 2018-08-13 15:34:50 -07:00
Rasmus Munk Larsen
9bb75d8d31 Add Barrier.h. 2018-08-13 15:34:03 -07:00
Rasmus Munk Larsen
2e1adc0324 Merged eigen/eigen into default 2018-08-13 15:32:00 -07:00
Rasmus Munk Larsen
8278ae6313 Add support for thread local support on platforms that do not support it through emulation using a hash map. 2018-08-13 15:31:23 -07:00
Benoit Steiner
501be70b27 Code cleanup 2018-08-13 15:16:40 -07:00
Benoit Steiner
3d3711f22f Fixed compilation errors. 2018-08-13 15:16:06 -07:00
Gael Guennebaud
3ec60215df Merged in rmlarsen/eigen2 (pull request PR-466)
Move sigmoid functor to core and rename it to 'logistic'.
2018-08-13 21:28:20 +00:00
Rasmus Munk Larsen
0f1b2e08a5 Call logistic functor from Tensor::sigmoid. 2018-08-13 11:52:58 -07:00
Benoit Steiner
26239ee580 Use NULL instead of nullptr to avoid adding a cxx11 requirement. 2018-08-13 11:05:51 -07:00
Benoit Steiner
3810ec228f Don't use the auto keyword since it's not always supported properly. 2018-08-13 10:46:09 -07:00
Benoit Steiner
e6d5be811d Fixed syntax of nested templates chevrons to make it compatible with c++97 mode. 2018-08-13 10:29:21 -07:00
Mehdi Goli
1aa86aad14 Merge with upstream. 2018-08-13 15:40:31 +01:00
Eugene Zhulenev
35d90e8960 Fix BlockAccess enum in CwiseUnaryOp evaluator 2018-08-10 17:37:58 -07:00
Eugene Zhulenev
855b68896b Merge with eigen/default 2018-08-10 17:18:42 -07:00
Eugene Zhulenev
f2209d06e4 Add block evaluationto CwiseUnaryOp and add PreferBlockAccess enum to all evaluators 2018-08-10 16:53:36 -07:00
Benoit Steiner
c8ea398675 Avoided language features that are only available in cxx11 mode. 2018-08-10 13:02:41 -07:00
Benoit Steiner
4be4286224 Made the code compile with gcc 5.4. 2018-08-10 11:32:58 -07:00
Eugene Zhulenev
cfaedb38cd Fix bug in a test + compilation errors 2018-08-09 09:44:07 -07:00
Mehdi Goli
ea8fa5e86f Merge with upstream 2018-08-09 14:07:56 +01:00
Mehdi Goli
8c083bfd0e Properly fixing the PointerType for TensorCustomOp.h. As the output type here should be based on CoeffreturnType not the Scalar type. Therefore, Similar to reduction and evalTo function, it should have its own MakePointer class. In this case, for other device the type is defaulted to CoeffReturnType and no changes is required on users' code. However, in SYCL, on the device, we can recunstruct the device Type. 2018-08-09 13:57:43 +01:00
Eugene Zhulenev
1c8b9e10a7 Merged with upstream eigen 2018-08-08 16:57:58 -07:00
Benoit Steiner
131ed1191f Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462)
Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.
2018-08-08 18:14:15 +00:00
Mehdi Goli
532a0be05c Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation. 2018-08-08 12:12:26 +01:00
Mehdi Goli
3055e3a7c2 Creating a pointer type in TensorCustomOp.h 2018-08-08 11:19:02 +01:00
Rasmus Munk Larsen
693fb1d41e Fix init order. 2018-08-07 17:18:51 -07:00
Benoit Steiner
10d286f55b Silenced a couple of compilation warnings. 2018-08-06 16:00:29 -07:00
Benoit Steiner
d011d05fd6 Fixed compilation errors. 2018-08-06 13:40:51 -07:00
Rasmus Munk Larsen
36e7e7dd8f Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation. 2018-08-06 13:16:32 -07:00
Rasmus Munk Larsen
fa68342ef8 Move sigmoid functor to core. 2018-08-03 17:31:23 -07:00
Gael Guennebaud
09c81ac033 bug #1451: fix numeric_limits<AutoDiffScalar<Der>> with a reference as derivative type 2018-08-04 00:17:37 +02:00
Christoph Hertzberg
edfb7962fd Use static const int instead of enum to avoid numerous local-type-template-args warnings in C++03 mode 2018-09-07 14:08:39 +02:00
Eugene Zhulenev
1b0373ae10 Replace all using declarations with typedefs in Tensor ops 2018-08-01 15:55:46 -07:00
Rasmus Munk Larsen
bcb29f890c Fix initialization order. 2018-08-03 10:18:53 -07:00
Mehdi Goli
3074b1ff9e Fixing the compilation error. 2018-08-03 17:13:44 +01:00
Mehdi Goli
225fa112aa Merge with upstream. 2018-08-03 17:04:08 +01:00
Mehdi Goli
01358300d5 Creating separate SYCL required PR for uncontroversial files. 2018-08-03 16:59:15 +01:00
Benoit Steiner
dd5875e30d Merged in codeplaysoftware/eigen-upstream-pure/constructor_error_clang (pull request PR-451)
Fixing ambigous constructor error for Clang compiler.
2018-08-02 20:46:03 +00:00
Mehdi Goli
516d2621b9 fixing compilation error for cxx11_tensor_trace.cpp error on Microsoft Visual Studio. 2018-08-02 14:30:48 +01:00
Mehdi Goli
40d6d020a0 Fixing ambigous constructor error for Clang compiler. 2018-08-02 13:34:53 +01:00
Eugene Zhulenev
64abdf1d7e Fix typo + get rid of redundant member variables for block sizes 2018-08-01 12:35:19 -07:00
Benoit Steiner
93b9e36e10 Merged in paultucker/eigen (pull request PR-431)
Optional ThreadPoolDevice allocator

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-08-01 19:14:34 +00:00
Eugene Zhulenev
385b3ff12f Merged latest changes from upstream/eigen 2018-08-01 11:59:04 -07:00
Benoit Steiner
17221115c9 Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447)
Adding variadic version of assert which can take a parameter pack as its input.
2018-08-01 16:41:54 +00:00
Benoit Steiner
0360c36170 Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446)
Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.
2018-08-01 16:13:15 +00:00
Mehdi Goli
c6a5c70712 Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h 2018-08-01 16:56:26 +01:00
Benoit Steiner
45f75f1ace Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449)
Enabling per device specialisation of packetSize.
2018-08-01 15:43:03 +00:00
Mehdi Goli
af96018b49 Using the suggested modification. 2018-08-01 16:04:44 +01:00
Mehdi Goli
b512a9536f Enabling per device specialisation of packetsize. 2018-08-01 13:39:13 +01:00
Mehdi Goli
3a197a60e6 variadic version of assert which can take a parameter pack as its input. 2018-08-01 12:19:14 +01:00
Mehdi Goli
d7a8414848 Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation. 2018-08-01 11:56:30 +01:00
Mehdi Goli
9e219bb3d3 Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. 2018-08-01 10:47:49 +01:00
Eugene Zhulenev
83c0a16baf Add block evaluation support to TensorOps 2018-07-31 15:56:31 -07:00
Benoit Steiner
edf46bd7a2 Merged in yuefengz/eigen (pull request PR-370)
Use device's allocate function instead of internal::aligned_malloc.
2018-07-31 22:38:28 +00:00
Paul Tucker
385f7b8d0c Change getAllocator() to allocator() in ThreadPoolDevice. 2018-07-31 13:52:18 -07:00
Mark D Ryan
6f5b126e6d Fix tensor contraction for AVX512 machines
This patch modifies the TensorContraction class to ensure that the kc_ field is
always a multiple of the packet_size, if the packet_size is > 8.  Without this
change spatial convolutions in Tensorflow do not work properly as the code that
re-arranges the input matrices can assert if kc_ is not a multiple of the
packet_size.  This leads to a unit test failure,
//tensorflow/python/kernel_tests:conv_ops_test, on AVX512 builds of tensorflow.
2018-07-31 09:33:37 +01:00
Gael Guennebaud
678a0dcb12 Merged in ezhulenev/eigen/tiling_3 (pull request PR-438)
Tiled tensor executor
2018-07-31 08:13:00 +00:00
Gael Guennebaud
679eece876 Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437. 2018-07-31 10:10:14 +02:00
Eugene Zhulenev
966c2a7bb6 Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible 2018-07-27 12:45:17 -07:00
Eugene Zhulenev
6913221c43 Add tiled evaluation support to TensorExecutor 2018-07-25 13:51:10 -07:00
Rasmus Munk Larsen
e478532625 Reduce the number of template specializations of classes related to tensor contraction to reduce binary size. 2018-07-27 12:36:34 -07:00
Christoph Hertzberg
5e79402b4a fix warnings for doc-eigen-prerequisites 2018-07-24 21:59:15 +02:00
Christoph Hertzberg
5f79b7f9a9 Removed several shadowing types and use global Index typedef everywhere 2018-07-25 21:47:45 +02:00
Christoph Hertzberg
44ee201337 Rename variable which shadows class name 2018-07-25 20:26:15 +02:00
Gustavo Lima Chaves
705f66a9ca Account for missing change on commit "Remove SimpleThreadPool and..."
"... always use {NonBlocking}ThreadPool". It seems the non-blocking
implementation was me the default/only one, but a reference to the old
name was left unmodified. Fix that.
2018-07-23 16:29:09 -07:00
Eugene Zhulenev
d55efa6f0f TensorBlockIO 2018-07-23 15:50:55 -07:00
Eugene Zhulenev
34a75c3c5c Initial support of TensorBlock 2018-07-20 17:37:20 -07:00
Gustavo Lima Chaves
02eaaacbc5 Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded
block

Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail
right away without this, as this test seems to rely on those language
features. The skip under compilation with MSVC was kept.
2018-07-20 16:08:40 -07:00
Paul Tucker
d4afccde5a Add test coverage for ThreadPoolDevice optional allocator. 2018-07-19 17:43:44 -07:00
Eugene Zhulenev
c58b874727 PR430: Convert count to the reducer type in MeanReducer
Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails.

cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this)
  ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_'
(type 'const DenseIndex {aka const long int}')
to type 'const type& {aka const Eigen::half&}'
     return pdiv(vaccum, pset1<Packet>(packetCount_));
Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade.

static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.
2018-07-19 17:37:03 -07:00
Paul Tucker
4e9848fa86 Actually add optional Allocator* arg to ThreadPoolDevice(). 2018-07-16 17:53:36 -07:00
Paul Tucker
b3e7c9132d Add optional Allocator argument to ThreadPoolDevice constructor.
When supplied, this allocator will be used in place of
internal::aligned_malloc.  This permits e.g. use of a NUMA-node specific
allocator where the thread-pool is also restricted a single NUMA-node.
2018-07-16 17:26:05 -07:00
Gael Guennebaud
add5757488 Simplify handling and non-splitted tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folder. 2018-07-16 18:55:40 +02:00
Gael Guennebaud
901c7d31f0 Fix usage of EIGEN_SPLIT_LARGE_TESTS=ON: some unit tests, such as indexed_view have to be split unconditionally. 2018-07-16 18:35:05 +02:00
Rasmus Munk Larsen
3a9cf4e290 Get rid of alias for m_broadcast. 2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen
4222550e17 Optimize the case where broadcasting is a no-op. 2018-07-13 16:12:38 -07:00
Gael Guennebaud
1920129d71 Remove clang warning 2018-07-13 16:05:35 +02:00
Gael Guennebaud
06eb24cf4d Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda. 2018-07-13 16:04:27 +02:00
David Hyde
d908afe35f bug #1558: fix a corner case in MINRES when both v_new and w_new vanish. 2018-07-08 22:06:38 -07:00
Eugene Zhulenev
6e654f3379 Reduce number of allocations in TensorContractionThreadPool. 2018-07-16 14:26:39 -07:00
Gael Guennebaud
7ccb623746 bug #1569: fix Tensor<half>::mean() on AVX with respective unit test. 2018-07-19 13:15:40 +02:00
Eugene Zhulenev
e3c2d61739 Assert that no output kernel is defined for GPU contraction 2018-07-18 14:34:22 -07:00
Eugene Zhulenev
79d4129cce Specify default output kernel for TensorContractionOp 2018-07-18 14:21:01 -07:00
Gael Guennebaud
44ea5f7623 Add unit test for -Tensor<complex> on GPU 2018-07-12 17:19:38 +02:00
Thales Sabino
9a6a43319f Fix cxx11_tensor_fft not building on Windows.
The type used in Eigen::DSizes needs to be at least 8 bytes long. Internally Tensor tries to convert this to an __int64 on Windows and this fails to build. On Linux, long and long long are both 8 byte integer types.
* * *
Changing from "long long" to "std::int64_t".
2018-07-12 11:20:59 +01:00
Gael Guennebaud
b347eb0b1c Fix doc 2018-07-12 11:56:18 +02:00
Yuefeng Zhou
1eff6cf8a7 Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances. 2018-02-20 16:50:05 -08:00
Gael Guennebaud
6cd6551b26 Add deprecated header files for TensorFlow 2018-07-12 10:50:53 +02:00
Deven Desai
876f392c39 Updates corresponding to the latest round of PR feedback
The major changes are

1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
    The above three changes effectively enable the Eigen "Packet" layer for the HIP platform

4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places

The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
471cfe5ff7 renaming CUDA* to GPU* for some header files 2018-07-11 09:22:04 -04:00
Deven Desai
38807a2575 merging updates from upstream 2018-07-11 09:17:33 -04:00
Gael Guennebaud
6190aa5632 bug #1567: add optimized path for tensor broadcasting and 'Channel First' shape 2018-07-09 11:23:16 +02:00
Deven Desai
1bb6fa99a3 merging the CUDA and HIP implementation for the Tensor directory and the unit tests 2018-06-20 16:44:58 -04:00
Deven Desai
cfdabbcc8f removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories 2018-06-20 12:57:02 -04:00
Deven Desai
7e41c8f1a9 renaming *Cuda files to *Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories 2018-06-20 12:52:30 -04:00
Deven Desai
b6cc0961b1 updates based on PR feedback
There are two major changes (and a few minor ones which are not listed here...see PR discussion for details)

1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`

After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.

2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
Deven Desai
d1d22ef0f4 syncing this fork with upstream 2018-06-13 12:09:52 -04:00
Benoit Steiner
d3a380af4d Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403)
Derivative of the incomplete Gamma function and the sample of a Gamma random variable

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-11 17:57:47 +00:00
Gael Guennebaud
67ec37f7b0 Activate dgmres unit test 2018-07-02 12:54:14 +02:00
Jonathan Liu
b7689bded9 Use std::complex constructor instead of assignment from scalar
Fixes GCC conversion to non-scalar type requested compile error when
using boost::multiprecision::cpp_dec_float_50 as scalar type.
2018-06-28 00:32:37 +10:00
Rasmus Munk Larsen
5418154a45 Fix oversharding bug in parallelFor. 2018-06-20 17:51:48 -07:00
Gael Guennebaud
7933267c67 fix prototype 2018-06-08 09:56:01 +02:00
Michael Figurnov
30fa3d0454 Merge from eigen/eigen 2018-06-07 17:57:56 +01:00
Michael Figurnov
6c71c7d360 Merge from eigen/eigen. 2018-06-07 15:54:18 +01:00
Gael Guennebaud
37348d03ae Fix int versus Index 2018-06-07 15:56:43 +02:00
Michael Figurnov
aa813d417b Fix compilation of special functions without C99 math.
The commit with Bessel functions i0e and i1e placed the ifdef/endif incorrectly,
causing i0e/i1e to be undefined when EIGEN_HAS_C99_MATH=0. These functions do not
actually require C99 math, so now they are always available.
2018-06-07 14:35:07 +01:00
Gael Guennebaud
b3fd93207b Fix typos found using codespell 2018-06-07 14:43:02 +02:00
Michael Figurnov
5172a32849 Updated the stopping criteria in igammac_cf_impl.
Previously, when computing the derivative, it used a relative error threshold. Now it uses an absolute error threshold. The behavior for computing the value is unchanged. This makes more sense since we do not expect the derivative to often be close to zero. This change makes the derivatives about 30% faster across the board. The error for the igamma_der_a is almost unchanged, while for gamma_sample_der_alpha it is a bit worse for float32 and unchanged for float64.
2018-06-07 12:03:58 +01:00
Michael Figurnov
4bd158fa37 Derivative of the incomplete Gamma function and the sample of a Gamma random variable.
In addition to igamma(a, x), this code implements:
* igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter
* gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter

The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.
2018-06-06 18:49:26 +01:00
Deven Desai
8fbd47052b Adding support for using Eigen in HIP kernels.
This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.

Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor)


Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
2018-06-06 10:12:58 -04:00
Benoit Steiner
e206f8d4a4 Merged in mfigurnov/eigen (pull request PR-400)
Exponentially scaled modified Bessel functions of order zero and one.

Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-05 17:05:21 +00:00
Penporn Koanantakool
e2ed0cf8ab Add a ThreadPoolInterface* getter for ThreadPoolDevice. 2018-06-02 12:07:49 -07:00
Michael Figurnov
f216854453 Exponentially scaled modified Bessel functions of order zero and one.
The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x)

The code is ported from Cephes and tested against SciPy.
2018-05-31 15:34:53 +01:00
Katrin Leinweber
ea94543190 Hyperlink DOIs against preferred resolver 2018-05-24 18:55:40 +02:00
Vamsi Sripathi
6293ad3f39 Performance improvements to tensor broadcast operation
1. Added new packet functions using SIMD for NByOne, OneByN cases
  2. Modified existing packet functions to reduce index calculations when input stride is non-SIMD
  3. Added 4 test cases to cover the new packet functions
2018-05-23 14:02:05 -07:00
Benoit Steiner
0371380d5b Merged in rmlarsen/eigen2 (pull request PR-393)
Rename scalar_clip_op to scalar_clamp_op to prevent collision with existing functor in TensorFlow.
2018-05-16 21:45:42 +00:00
Rasmus Munk Larsen
b8d36774fa Rename clip2 to clamp. 2018-05-16 14:04:48 -07:00
Rasmus Munk Larsen
812480baa3 Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow. 2018-05-16 09:49:24 -07:00
Benoit Steiner
1403c2c15b Merged in didierjansen/eigen (pull request PR-360)
Fix bugs and typos in the contraction example of the tensor README
2018-05-16 01:16:36 +00:00
Rasmus Munk Larsen
afec3021f7 Use numext::maxi & numext::mini. 2018-05-14 16:35:39 -07:00
Rasmus Munk Larsen
b8c8e5f436 Add vectorized clip functor for Eigen Tensors. 2018-05-14 16:07:13 -07:00
Benoit Steiner
6118c6ff4f Enable RawAccess to tensor slices whenever possinle.
Avoid 32-bit integer overflow in TensorSlicingOp
2018-04-30 11:28:12 -07:00
Gael Guennebaud
2f3287da7d Fix "used uninitialized" warnings 2018-04-24 17:17:25 +02:00
Gael Guennebaud
3ffd449ef5 Workaround warning 2018-04-24 17:11:51 +02:00
Christoph Hertzberg
84dcd998a9 Recent Adolc versions require C++11 2018-04-13 19:10:23 +02:00
Weiming Zhao
b0eda3cb9f Avoid using memcpy for non-POD elements 2018-04-11 11:37:06 +02:00
Gael Guennebaud
67bac6368c protect calls to isnan 2018-04-03 14:19:04 +02:00
Gael Guennebaud
524119d32a Fix uninitialized output argument. 2018-04-03 10:56:10 +02:00
Viktor Csomor
000840cae0 Added a move constructor and move assignment operator to Tensor and wrote some tests. 2018-02-07 19:10:54 +01:00
Eugene Zhulenev
c95aacab90 Fix TensorContractionOp evaluators for GPU and SYCL 2018-07-17 14:09:37 -07:00
Deven Desai
f124f07965 applying EIGEN_DECLARE_TEST to *gpu* tests
Also, a few minor fixes for GPU tests running in HIP mode.

1. Adding an include for hip/hip_runtime.h in the Macros.h file
   For HIP __host__ and __device__ are macros which are defined in hip headers.
   Their definitions need to be included before their use in the file.

2. Fixing the compile failure in TensorContractionGpu introduced by the commit to
   "Fuse computations into the Tensor contractions using output kernel"

3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit
2018-07-17 14:16:48 -04:00
Gael Guennebaud
82f0ce2726 Get rid of EIGEN_TEST_FUNC, unit tests must now be declared with EIGEN_DECLARE_TEST(mytest) { /* code */ }.
This provide several advantages:
- more flexibility in designing unit tests
- unit tests can be glued to speed up compilation
- unit tests are compiled with same predefined macros, which is a requirement for zapcc
2018-07-17 14:46:15 +02:00
Eugene Zhulenev
43206ac4de Call OutputKernel in evalGemv 2018-07-12 14:52:23 -07:00
Eugene Zhulenev
e204ecdaaf Remove SimpleThreadPool and always use {NonBlocking}ThreadPool 2018-07-16 15:06:57 -07:00
Eugene Zhulenev
01fd4096d3 Fuse computations into the Tensor contractions using output kernel 2018-07-10 13:16:38 -07:00
Gael Guennebaud
5539587b1f Some warning fixes 2018-07-17 10:29:12 +02:00
Benoit Steiner
8f55956a57 Update the padding computation for PADDING_SAME to be consistent with TensorFlow. 2018-01-30 20:22:12 +00:00
Lee.Deokjae
5b3c367926 Fix typos in the contraction example of tensor README 2018-01-06 14:36:19 +09:00
RJ Ryan
59985cfd26 Disable use of recurrence for computing twiddle factors. Fixes FFT precision issues for large FFTs. https://github.com/tensorflow/tensorflow/issues/10749#issuecomment-354557689 2017-12-31 10:44:56 -05:00
Gael Guennebaud
73214c4bd0 Workaround nvcc 9.0 issue. See PR 351.
https://bitbucket.org/eigen/eigen/pull-requests/351
2017-12-15 14:10:59 +01:00
Yangzihao Wang
3122477c86 Update the padding computation for PADDING_SAME to be consistent with TensorFlow. 2017-12-12 11:15:24 -08:00
Rasmus Munk Larsen
e900b010c8 Improve robustness of igamma and igammac to bad inputs.
Check for nan inputs and propagate them immediately. Limit the number of internal iterations to 2000 (same number as used by scipy.special.gammainc). This prevents an infinite loop when the function is called with nan or very large arguments.

Original change by mfirgunov@google.com
2018-03-19 09:04:54 -07:00
Gael Guennebaud
00bc67c374 Move KLU support to official 2017-11-10 14:11:22 +01:00
Gael Guennebaud
b82cd93c01 KLU: truely disable unimplemented code, add proper static assertions in solve 2017-11-10 14:09:01 +01:00
Gael Guennebaud
8cf63ccb99 Merged in kylemacfarlan/eigen (pull request PR-337)
Add support for SuiteSparse's KLU routines
2017-11-10 10:43:17 +00:00
Gael Guennebaud
1495b98a8e Merged in spraetor/eigen (pull request PR-305)
Issue with mpreal and std::numeric_limits::digits
2017-11-10 10:28:54 +00:00