Deven Desai
c64fe9ea1f
Updates to fix HIP-clang specific compile errors.
...
Compiling the Eigen unit tests with hip-clang (HIP with clang as the underlying compiler instead of hcc or nvcc) results in compile errors, which the changes in this commit fix. The main change is to convert a few instances of "__device__" to "EIGEN_DEVICE_FUNC".
2018-08-30 20:22:16 +00:00
Rasmus Munk Larsen
8b3d9ed081
Use padding instead of alignment attribute, which MaxSizeVector does not respect. This leads to undefined behavior and hard-to-trace bugs.
2018-09-05 11:20:06 -07:00
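A minimal sketch of the padding idea named in this commit, not the code from the patch. The payload struct, the names, and the 64-byte cache-line size are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumed cache-line size; 64 bytes is common but not universal.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical per-thread payload (not Eigen's actual struct).
struct PerThreadPayload {
  int thread_id;
  void* pool;
};

// Explicit tail padding rounds the element size up to a whole number of
// cache lines. Unlike an alignment attribute, this does not rely on the
// container honouring over-aligned types.
struct PaddedPerThread {
  PerThreadPayload data;
  char pad[kCacheLineSize - sizeof(PerThreadPayload) % kCacheLineSize];
};

static_assert(sizeof(PaddedPerThread) % kCacheLineSize == 0,
              "element size must be a multiple of the cache line");
```

Padding only fixes the element size; whether the first element starts on a cache-line boundary still depends on the allocator.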
Christoph Hertzberg
ba2c8efdcf
EIGEN_UNUSED is not supported by g++4.7 (and not portable)
2018-09-12 11:49:10 +02:00
Christoph Hertzberg
ff4e835d6b
"sparse_product.cpp" must be included before "sparse_basic.cpp", otherwise EIGEN_SPARSE_CREATE_TEMPORARY_PLUGIN has no effect
2018-08-30 20:10:11 +02:00
Christoph Hertzberg
023ed6b9a8
Product of empty array must be 1 and not 0.
2018-08-30 17:14:52 +02:00
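The rule in this commit follows from folding over the multiplicative identity. A standalone sketch using the standard library rather than Eigen (the helper name is hypothetical):

```cpp
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Product of all entries. The fold starts from 1, the multiplicative
// identity, so an empty range yields 1 rather than 0.
inline int product(const std::vector<int>& values) {
  return std::accumulate(values.begin(), values.end(), 1,
                         std::multiplies<int>());
}
```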
Christoph Hertzberg
c2f4e8c08e
Fix integer conversion warning
2018-08-30 17:12:53 +02:00
Deven Desai
946c3e2544
adding EIGEN_DEVICE_FUNC attribute to fix some GPU unit tests that are broken in HIP mode
2018-08-27 23:04:08 +00:00
Christoph Hertzberg
20ba2eee6d
gcc thinks this may not be initialized
2018-08-28 18:33:24 +02:00
Christoph Hertzberg
73ca600bca
Fix numerous shadow-warnings for GCC<=4.8
2018-08-28 18:32:39 +02:00
Christoph Hertzberg
42f3ee4fb8
Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop
...
Workaround: Don't include "DisableStupidWarnings.h" before including other main-headers
2018-08-28 11:44:15 +02:00
Eugene Zhulenev
c144bb355b
Merge with upstream eigen/default
2018-08-27 14:34:07 -07:00
Christoph Hertzberg
b1653d1599
Fix some trivial C++11 vs C++03 compatibility warnings
2018-08-25 12:21:00 +02:00
Christoph Hertzberg
42123ff38b
Make unit test C++03 compatible
2018-08-25 11:53:28 +02:00
Christoph Hertzberg
117bc5d505
Fix some shadow warnings
2018-08-25 09:06:08 +02:00
Christoph Hertzberg
f155e97adb
Previous fix broke compilation for clang
2018-08-25 00:10:46 +02:00
Christoph Hertzberg
209b4972ec
Fix conversion warning
2018-08-25 00:02:46 +02:00
Christoph Hertzberg
495f6c3c3a
Fix missing-braces warnings
2018-08-24 23:56:13 +02:00
Christoph Hertzberg
5aaedbeced
Fixed more sign-compare and type-limits warnings
2018-08-24 23:54:12 +02:00
Christoph Hertzberg
8295f02b36
Hide "maybe uninitialized" warning on gcc
2018-08-24 23:22:20 +02:00
Christoph Hertzberg
f7675b826b
Fix several integer conversion and sign-compare warnings
2018-08-24 22:58:55 +02:00
Rasmus Munk Larsen
744e2fe0de
Address comments about EIGEN_THREAD_LOCAL.
2018-08-24 10:24:54 -07:00
Rasmus Munk Larsen
8d9bc5cc02
Fix g++ compilation.
2018-08-23 13:06:39 -07:00
Rasmus Munk Larsen
e9f9d70611
Don't rely on __had_feature for g++.
...
Don't use __thread.
Only use thread_local for gcc 4.8 or newer.
2018-08-23 12:59:46 -07:00
Rasmus Munk Larsen
668690978f
Pad PerThread when we emulate thread_local to prevent false sharing.
2018-08-23 12:54:33 -07:00
Rasmus Munk Larsen
6cedc5a9b3
rename mu.
2018-08-23 12:11:58 -07:00
Rasmus Munk Larsen
6e0464004a
Store std::unique_ptr instead of raw pointers in per_thread_map_.
2018-08-23 12:10:08 -07:00
Rasmus Munk Larsen
e51d9e473a
Protect #undef max with #ifdef max.
2018-08-23 11:42:05 -07:00
Rasmus Munk Larsen
d35880ed91
merge
2018-08-23 11:36:49 -07:00
Christoph Hertzberg
a709c8efb4
Replace pointers by values or unique_ptr for better leak-safety
2018-08-23 19:41:59 +02:00
Christoph Hertzberg
39335cf51e
Make MaxSizeVector leak-safe
2018-08-23 19:37:56 +02:00
Benoit Steiner
ff8e0ecc2f
Updated one more line of code to avoid making the test dependent on cxx11 features.
2018-08-17 15:15:52 -07:00
Benoit Steiner
43d9dd9b28
Removed more dependencies on cxx11.
2018-08-17 08:49:32 -07:00
Christoph Hertzberg
4713465eef
Silence double-promotion warning
2018-08-17 16:39:43 +02:00
Christoph Hertzberg
c9b25fbefa
Silence unused parameter warning
2018-08-17 16:28:28 +02:00
Christoph Hertzberg
dbdeceabdd
Silence double-promotion warning (when converting double to complex<long double>)
2018-08-17 16:26:11 +02:00
Benoit Steiner
19df4d5752
Merged in codeplaysoftware/eigen-upstream-pure/Pointer_type_creation (pull request PR-461)
...
Creating a pointer type in TensorCustomOp.h
2018-08-16 18:28:33 +00:00
Benoit Steiner
f641cf1253
Adding missing at method in Eigen::array
2018-08-16 11:24:37 -07:00
Benoit Steiner
ede580ccda
Avoid using the auto keyword to make the tensor block access test more portable
2018-08-16 10:49:47 -07:00
Benoit Steiner
e23c8c294e
Use actual types instead of the auto keyword to make the code more portable
2018-08-16 10:41:01 -07:00
Mehdi Goli
80f1a76dec
removing the noises.
2018-08-16 13:33:24 +01:00
Mehdi Goli
d0b01ebbf6
Reverting the unintended delete from the code.
2018-08-16 13:21:36 +01:00
Mehdi Goli
161dcbae9b
Using PointerType struct and specializing it per device for TensorCustomOp.h
2018-08-16 00:07:02 +01:00
Sameer Agarwal
f197c3f55b
Removed an unused variable (PacketSize) from TensorExecutor
2018-08-15 11:24:57 -07:00
Benoit Steiner
4181556907
Fixed the tensor contraction code.
2018-08-15 09:34:47 -07:00
Benoit Steiner
b6f96cf7dd
Removed dependencies on cxx11 language features from the tensor_block_access test
2018-08-15 08:54:31 -07:00
Benoit Steiner
fbb834144d
Fixed more compilation errors
2018-08-15 08:52:58 -07:00
Benoit Steiner
6bb3f1b43e
Made the tensor_block_access test compile again
2018-08-14 14:26:59 -07:00
Benoit Steiner
43ec0082a6
Made the kronecker_product test compile again
2018-08-14 14:08:36 -07:00
Benoit Steiner
ab3f481141
Cleaned up the code and made it compile with more compilers
2018-08-14 14:05:46 -07:00
Rasmus Munk Larsen
fa0bcbf230
merge
2018-08-14 12:18:31 -07:00
Rasmus Munk Larsen
15d4f515e2
Use plain_assert in destructors to avoid throwing in CXX11 tests where main.h overwrites eigen_assert with a throwing version.
2018-08-14 12:17:46 -07:00
Rasmus Munk Larsen
aebdb06424
Fix a few compiler warnings in CXX11 tests.
2018-08-14 12:06:39 -07:00
Rasmus Munk Larsen
2a98bd9c8e
Merged eigen/eigen into default
2018-08-14 12:02:09 -07:00
Benoit Steiner
59bba77ead
Fixed compilation errors with gcc 4.7 and 4.8
2018-08-14 10:54:48 -07:00
Mehdi Goli
8ba799805b
Merge with upstream
2018-08-14 09:43:45 +01:00
Rasmus Munk Larsen
6d6e7b7027
merge
2018-08-13 15:34:50 -07:00
Rasmus Munk Larsen
9bb75d8d31
Add Barrier.h.
2018-08-13 15:34:03 -07:00
Rasmus Munk Larsen
2e1adc0324
Merged eigen/eigen into default
2018-08-13 15:32:00 -07:00
Rasmus Munk Larsen
8278ae6313
Add thread-local support on platforms that lack it, emulated with a hash map.
2018-08-13 15:31:23 -07:00
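One way such an emulation could look, as a sketch under stated assumptions rather than the committed implementation. The class name is hypothetical; the member name `per_thread_map_` and the use of `std::unique_ptr` are borrowed from a later commit message in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical emulation of thread-local storage: one heap-allocated T per
// thread, keyed by thread id and guarded by a mutex.
template <typename T>
class EmulatedThreadLocal {
 public:
  // Returns the calling thread's instance, value-initializing it on first use.
  T& local() {
    std::lock_guard<std::mutex> lock(mutex_);
    std::unique_ptr<T>& slot = per_thread_map_[std::this_thread::get_id()];
    if (!slot) slot.reset(new T());
    return *slot;  // stays valid across rehashes: the map only owns pointers
  }

  // Number of threads that have touched the storage so far.
  std::size_t size() {
    std::lock_guard<std::mutex> lock(mutex_);
    return per_thread_map_.size();
  }

 private:
  std::mutex mutex_;
  std::unordered_map<std::thread::id, std::unique_ptr<T>> per_thread_map_;
};
```

The lock on every access makes this slower than native `thread_local`, which is presumably why it is only a fallback.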
Benoit Steiner
501be70b27
Code cleanup
2018-08-13 15:16:40 -07:00
Benoit Steiner
3d3711f22f
Fixed compilation errors.
2018-08-13 15:16:06 -07:00
Gael Guennebaud
3ec60215df
Merged in rmlarsen/eigen2 (pull request PR-466)
...
Move sigmoid functor to core and rename it to 'logistic'.
2018-08-13 21:28:20 +00:00
Rasmus Munk Larsen
0f1b2e08a5
Call logistic functor from Tensor::sigmoid.
2018-08-13 11:52:58 -07:00
Benoit Steiner
26239ee580
Use NULL instead of nullptr to avoid adding a cxx11 requirement.
2018-08-13 11:05:51 -07:00
Benoit Steiner
3810ec228f
Don't use the auto keyword since it's not always supported properly.
2018-08-13 10:46:09 -07:00
Benoit Steiner
e6d5be811d
Fixed syntax of nested template chevrons to make it compatible with C++98 mode.
2018-08-13 10:29:21 -07:00
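The C++03 compatibility fixes in this group (nested chevrons, `NULL` instead of `nullptr`, no `auto`) can be illustrated with a small standalone sketch. The typedef and function are hypothetical, not taken from Eigen.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before C++11, ">>" closing two template argument lists is parsed as a
// shift operator, so a space is required between the chevrons.
typedef std::vector<std::vector<int> > IntRows;

// Spells out the return type (no auto) and uses NULL (no nullptr).
inline const int* first_entry_or_null(const IntRows& rows) {
  if (rows.empty() || rows[0].empty()) return NULL;
  return &rows[0][0];
}
```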
Mehdi Goli
1aa86aad14
Merge with upstream.
2018-08-13 15:40:31 +01:00
Eugene Zhulenev
35d90e8960
Fix BlockAccess enum in CwiseUnaryOp evaluator
2018-08-10 17:37:58 -07:00
Eugene Zhulenev
855b68896b
Merge with eigen/default
2018-08-10 17:18:42 -07:00
Eugene Zhulenev
f2209d06e4
Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators
2018-08-10 16:53:36 -07:00
Benoit Steiner
c8ea398675
Avoided language features that are only available in cxx11 mode.
2018-08-10 13:02:41 -07:00
Benoit Steiner
4be4286224
Made the code compile with gcc 5.4.
2018-08-10 11:32:58 -07:00
Eugene Zhulenev
cfaedb38cd
Fix bug in a test + compilation errors
2018-08-09 09:44:07 -07:00
Mehdi Goli
ea8fa5e86f
Merge with upstream
2018-08-09 14:07:56 +01:00
Mehdi Goli
8c083bfd0e
Properly fixing the PointerType for TensorCustomOp.h, as the output type here should be based on CoeffReturnType, not the Scalar type. Therefore, similar to the reduction and evalTo functions, it should have its own MakePointer class. In this case, for other devices the type defaults to CoeffReturnType and no changes are required in users' code. However, in SYCL, on the device, we can reconstruct the device type.
2018-08-09 13:57:43 +01:00
Eugene Zhulenev
1c8b9e10a7
Merged with upstream eigen
2018-08-08 16:57:58 -07:00
Benoit Steiner
131ed1191f
Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462)
...
Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.
2018-08-08 18:14:15 +00:00
Mehdi Goli
532a0be05c
Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.
2018-08-08 12:12:26 +01:00
Mehdi Goli
3055e3a7c2
Creating a pointer type in TensorCustomOp.h
2018-08-08 11:19:02 +01:00
Rasmus Munk Larsen
693fb1d41e
Fix init order.
2018-08-07 17:18:51 -07:00
Benoit Steiner
10d286f55b
Silenced a couple of compilation warnings.
2018-08-06 16:00:29 -07:00
Benoit Steiner
d011d05fd6
Fixed compilation errors.
2018-08-06 13:40:51 -07:00
Rasmus Munk Larsen
36e7e7dd8f
Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation.
2018-08-06 13:16:32 -07:00
Rasmus Munk Larsen
fa68342ef8
Move sigmoid functor to core.
2018-08-03 17:31:23 -07:00
Gael Guennebaud
09c81ac033
bug #1451 : fix numeric_limits<AutoDiffScalar<Der>> with a reference as derivative type
2018-08-04 00:17:37 +02:00
Christoph Hertzberg
edfb7962fd
Use static const int instead of enum to avoid numerous local-type-template-args warnings in C++03 mode
2018-09-07 14:08:39 +02:00
Eugene Zhulenev
1b0373ae10
Replace all using declarations with typedefs in Tensor ops
2018-08-01 15:55:46 -07:00
Rasmus Munk Larsen
bcb29f890c
Fix initialization order.
2018-08-03 10:18:53 -07:00
Mehdi Goli
3074b1ff9e
Fixing the compilation error.
2018-08-03 17:13:44 +01:00
Mehdi Goli
225fa112aa
Merge with upstream.
2018-08-03 17:04:08 +01:00
Mehdi Goli
01358300d5
Creating separate SYCL required PR for uncontroversial files.
2018-08-03 16:59:15 +01:00
Benoit Steiner
dd5875e30d
Merged in codeplaysoftware/eigen-upstream-pure/constructor_error_clang (pull request PR-451)
...
Fixing ambiguous constructor error for Clang compiler.
2018-08-02 20:46:03 +00:00
Mehdi Goli
516d2621b9
fixing compilation error for cxx11_tensor_trace.cpp error on Microsoft Visual Studio.
2018-08-02 14:30:48 +01:00
Mehdi Goli
40d6d020a0
Fixing ambiguous constructor error for Clang compiler.
2018-08-02 13:34:53 +01:00
Eugene Zhulenev
64abdf1d7e
Fix typo + get rid of redundant member variables for block sizes
2018-08-01 12:35:19 -07:00
Benoit Steiner
93b9e36e10
Merged in paultucker/eigen (pull request PR-431)
...
Optional ThreadPoolDevice allocator
Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-08-01 19:14:34 +00:00
Eugene Zhulenev
385b3ff12f
Merged latest changes from upstream/eigen
2018-08-01 11:59:04 -07:00
Benoit Steiner
17221115c9
Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447)
...
Adding variadic version of assert which can take a parameter pack as its input.
2018-08-01 16:41:54 +00:00
Benoit Steiner
0360c36170
Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446)
...
Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.
2018-08-01 16:13:15 +00:00
Mehdi Goli
c6a5c70712
Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h
2018-08-01 16:56:26 +01:00
Benoit Steiner
45f75f1ace
Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449)
...
Enabling per device specialisation of packetSize.
2018-08-01 15:43:03 +00:00
Mehdi Goli
af96018b49
Using the suggested modification.
2018-08-01 16:04:44 +01:00
Mehdi Goli
b512a9536f
Enabling per device specialisation of packetsize.
2018-08-01 13:39:13 +01:00
Mehdi Goli
3a197a60e6
variadic version of assert which can take a parameter pack as its input.
2018-08-01 12:19:14 +01:00
Mehdi Goli
d7a8414848
Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.
2018-08-01 11:56:30 +01:00
Mehdi Goli
9e219bb3d3
Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO.
2018-08-01 10:47:49 +01:00
Eugene Zhulenev
83c0a16baf
Add block evaluation support to TensorOps
2018-07-31 15:56:31 -07:00
Benoit Steiner
edf46bd7a2
Merged in yuefengz/eigen (pull request PR-370)
...
Use device's allocate function instead of internal::aligned_malloc.
2018-07-31 22:38:28 +00:00
Paul Tucker
385f7b8d0c
Change getAllocator() to allocator() in ThreadPoolDevice.
2018-07-31 13:52:18 -07:00
Mark D Ryan
6f5b126e6d
Fix tensor contraction for AVX512 machines
...
This patch modifies the TensorContraction class to ensure that the kc_ field is
always a multiple of the packet_size, if the packet_size is > 8. Without this
change spatial convolutions in Tensorflow do not work properly as the code that
re-arranges the input matrices can assert if kc_ is not a multiple of the
packet_size. This leads to a unit test failure,
//tensorflow/python/kernel_tests:conv_ops_test, on AVX512 builds of tensorflow.
2018-07-31 09:33:37 +01:00
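The constraint described above can be sketched as a small rounding helper. The message does not say whether the patch rounds `kc_` up or down; this sketch rounds up, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Returns kc unchanged for packet sizes up to 8; otherwise rounds kc up to
// the next multiple of packet_size (assumption: the actual patch may round
// differently).
inline std::ptrdiff_t align_kc(std::ptrdiff_t kc, std::ptrdiff_t packet_size) {
  assert(kc > 0 && packet_size > 0);
  if (packet_size <= 8) return kc;
  return ((kc + packet_size - 1) / packet_size) * packet_size;
}
```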
Gael Guennebaud
678a0dcb12
Merged in ezhulenev/eigen/tiling_3 (pull request PR-438)
...
Tiled tensor executor
2018-07-31 08:13:00 +00:00
Gael Guennebaud
679eece876
Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.
2018-07-31 10:10:14 +02:00
Eugene Zhulenev
966c2a7bb6
Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible
2018-07-27 12:45:17 -07:00
Eugene Zhulenev
6913221c43
Add tiled evaluation support to TensorExecutor
2018-07-25 13:51:10 -07:00
Rasmus Munk Larsen
e478532625
Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.
2018-07-27 12:36:34 -07:00
Christoph Hertzberg
5e79402b4a
fix warnings for doc-eigen-prerequisites
2018-07-24 21:59:15 +02:00
Christoph Hertzberg
5f79b7f9a9
Removed several shadowing types and use global Index typedef everywhere
2018-07-25 21:47:45 +02:00
Christoph Hertzberg
44ee201337
Rename variable which shadows class name
2018-07-25 20:26:15 +02:00
Gustavo Lima Chaves
705f66a9ca
Account for missing change on commit "Remove SimpleThreadPool and..."
...
"... always use {NonBlocking}ThreadPool". It seems the non-blocking
implementation was made the default/only one, but a reference to the old
name was left unmodified. Fix that.
2018-07-23 16:29:09 -07:00
Eugene Zhulenev
d55efa6f0f
TensorBlockIO
2018-07-23 15:50:55 -07:00
Eugene Zhulenev
34a75c3c5c
Initial support of TensorBlock
2018-07-20 17:37:20 -07:00
Gustavo Lima Chaves
02eaaacbc5
Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded
...
block
Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail
right away without this, as this test seems to rely on those language
features. The skip under compilation with MSVC was kept.
2018-07-20 16:08:40 -07:00
Paul Tucker
d4afccde5a
Add test coverage for ThreadPoolDevice optional allocator.
2018-07-19 17:43:44 -07:00
Eugene Zhulenev
c58b874727
PR430: Convert count to the reducer type in MeanReducer
...
Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails.
cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this)
->Eigen::internal::MeanReducer<Eigen::half>::packetCount_'
(type 'const DenseIndex {aka const long int}')
to type 'const type& {aka const Eigen::half&}'
return pdiv(vaccum, pset1<Packet>(packetCount_));
Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade.
static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.
2018-07-19 17:37:03 -07:00
Paul Tucker
4e9848fa86
Actually add optional Allocator* arg to ThreadPoolDevice().
2018-07-16 17:53:36 -07:00
Paul Tucker
b3e7c9132d
Add optional Allocator argument to ThreadPoolDevice constructor.
...
When supplied, this allocator will be used in place of
internal::aligned_malloc. This permits e.g. use of a NUMA-node specific
allocator where the thread-pool is also restricted to a single NUMA-node.
2018-07-16 17:26:05 -07:00
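A sketch of how an optional allocator with a fallback might be wired up. All type names here are hypothetical stand-ins, and `std::malloc` stands in for `internal::aligned_malloc`; this is not the ThreadPoolDevice interface itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator interface.
struct AllocatorSketch {
  virtual ~AllocatorSketch() {}
  virtual void* allocate(std::size_t num_bytes) = 0;
  virtual void deallocate(void* buffer) = 0;
};

// Example allocator that counts live buffers, e.g. to track memory usage.
struct CountingAllocator : AllocatorSketch {
  std::size_t live = 0;
  void* allocate(std::size_t num_bytes) override {
    ++live;
    return std::malloc(num_bytes);
  }
  void deallocate(void* buffer) override {
    --live;
    std::free(buffer);
  }
};

// Hypothetical device: uses the supplied allocator when there is one and
// falls back to the default allocation path otherwise.
class DeviceSketch {
 public:
  explicit DeviceSketch(AllocatorSketch* allocator = nullptr)
      : allocator_(allocator) {}
  void* allocate(std::size_t num_bytes) const {
    return allocator_ ? allocator_->allocate(num_bytes)
                      : std::malloc(num_bytes);
  }
  void deallocate(void* buffer) const {
    if (allocator_) allocator_->deallocate(buffer);
    else std::free(buffer);
  }

 private:
  AllocatorSketch* allocator_;  // not owned; may be null
};
```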
Gael Guennebaud
add5757488
Simplify handling of non-split tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folders.
2018-07-16 18:55:40 +02:00
Gael Guennebaud
901c7d31f0
Fix usage of EIGEN_SPLIT_LARGE_TESTS=ON: some unit tests, such as indexed_view have to be split unconditionally.
2018-07-16 18:35:05 +02:00
Rasmus Munk Larsen
3a9cf4e290
Get rid of alias for m_broadcast.
2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen
4222550e17
Optimize the case where broadcasting is a no-op.
2018-07-13 16:12:38 -07:00
Gael Guennebaud
1920129d71
Remove clang warning
2018-07-13 16:05:35 +02:00
Gael Guennebaud
06eb24cf4d
Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.
2018-07-13 16:04:27 +02:00
David Hyde
d908afe35f
bug #1558 : fix a corner case in MINRES when both v_new and w_new vanish.
2018-07-08 22:06:38 -07:00
Eugene Zhulenev
6e654f3379
Reduce number of allocations in TensorContractionThreadPool.
2018-07-16 14:26:39 -07:00
Gael Guennebaud
7ccb623746
bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.
2018-07-19 13:15:40 +02:00
Eugene Zhulenev
e3c2d61739
Assert that no output kernel is defined for GPU contraction
2018-07-18 14:34:22 -07:00
Eugene Zhulenev
79d4129cce
Specify default output kernel for TensorContractionOp
2018-07-18 14:21:01 -07:00
Gael Guennebaud
44ea5f7623
Add unit test for -Tensor<complex> on GPU
2018-07-12 17:19:38 +02:00
Thales Sabino
9a6a43319f
Fix cxx11_tensor_fft not building on Windows.
...
The type used in Eigen::DSizes needs to be at least 8 bytes long. Internally Tensor tries to convert this to an __int64 on Windows and this fails to build. On Linux, long and long long are both 8 byte integer types.
* * *
Changing from "long long" to "std::int64_t".
2018-07-12 11:20:59 +01:00
Gael Guennebaud
b347eb0b1c
Fix doc
2018-07-12 11:56:18 +02:00
Yuefeng Zhou
1eff6cf8a7
Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.
2018-02-20 16:50:05 -08:00
Gael Guennebaud
6cd6551b26
Add deprecated header files for TensorFlow
2018-07-12 10:50:53 +02:00
Deven Desai
876f392c39
Updates corresponding to the latest round of PR feedback
...
The major changes are
1. Moving CUDA/PacketMath.h to GPU/PacketMath.h
2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h
3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h
The above three changes effectively enable the Eigen "Packet" layer for the HIP platform
4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic")
5. Updating the "EIGEN_DEVICE_FUNC" marking in some places
The change has been tested on the HIP and CUDA platforms.
2018-07-11 10:39:54 -04:00
Deven Desai
471cfe5ff7
renaming CUDA* to GPU* for some header files
2018-07-11 09:22:04 -04:00
Deven Desai
38807a2575
merging updates from upstream
2018-07-11 09:17:33 -04:00
Gael Guennebaud
6190aa5632
bug #1567 : add optimized path for tensor broadcasting and 'Channel First' shape
2018-07-09 11:23:16 +02:00
Deven Desai
1bb6fa99a3
merging the CUDA and HIP implementation for the Tensor directory and the unit tests
2018-06-20 16:44:58 -04:00
Deven Desai
cfdabbcc8f
removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories
2018-06-20 12:57:02 -04:00
Deven Desai
7e41c8f1a9
renaming *Cuda files to *Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories
2018-06-20 12:52:30 -04:00
Deven Desai
b6cc0961b1
updates based on PR feedback
...
There are two major changes (and a few minor ones which are not listed here...see PR discussion for details)
1. Eigen::half implementations for HIP and CUDA have been merged.
This means that
- `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h`
- `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h`
- `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h`
After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install.
2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate.
- `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
- `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
- `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`
2018-06-14 10:21:54 -04:00
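The combined macro described above could be derived roughly as follows. This is a sketch only: it assumes `EIGEN_CUDACC` and `EIGEN_HIPCC` have already been defined (or not) by earlier compiler detection, and the helper function is hypothetical.

```cpp
#include <cassert>

// Define the platform-neutral macro when either GPU compiler is in use.
#if defined(EIGEN_CUDACC) || defined(EIGEN_HIPCC)
  #ifndef EIGEN_GPUCC
    #define EIGEN_GPUCC
  #endif
#endif

// Code that used to test for one compiler can now test the shared macro.
inline bool built_with_gpu_compiler() {
#ifdef EIGEN_GPUCC
  return true;
#else
  return false;
#endif
}
```

With a plain host compiler neither input macro is defined, so the function returns false.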
Deven Desai
d1d22ef0f4
syncing this fork with upstream
2018-06-13 12:09:52 -04:00
Benoit Steiner
d3a380af4d
Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403)
...
Derivative of the incomplete Gamma function and the sample of a Gamma random variable
Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-11 17:57:47 +00:00
Gael Guennebaud
67ec37f7b0
Activate dgmres unit test
2018-07-02 12:54:14 +02:00
Jonathan Liu
b7689bded9
Use std::complex constructor instead of assignment from scalar
...
Fixes GCC conversion to non-scalar type requested compile error when
using boost::multiprecision::cpp_dec_float_50 as scalar type.
2018-06-28 00:32:37 +10:00
Rasmus Munk Larsen
5418154a45
Fix oversharding bug in parallelFor.
2018-06-20 17:51:48 -07:00
Gael Guennebaud
7933267c67
fix prototype
2018-06-08 09:56:01 +02:00
Michael Figurnov
30fa3d0454
Merge from eigen/eigen
2018-06-07 17:57:56 +01:00
Michael Figurnov
6c71c7d360
Merge from eigen/eigen.
2018-06-07 15:54:18 +01:00
Gael Guennebaud
37348d03ae
Fix int versus Index
2018-06-07 15:56:43 +02:00
Michael Figurnov
aa813d417b
Fix compilation of special functions without C99 math.
...
The commit with Bessel functions i0e and i1e placed the ifdef/endif incorrectly,
causing i0e/i1e to be undefined when EIGEN_HAS_C99_MATH=0. These functions do not
actually require C99 math, so now they are always available.
2018-06-07 14:35:07 +01:00
Gael Guennebaud
b3fd93207b
Fix typos found using codespell
2018-06-07 14:43:02 +02:00
Michael Figurnov
5172a32849
Updated the stopping criteria in igammac_cf_impl.
...
Previously, when computing the derivative, it used a relative error threshold. Now it uses an absolute error threshold. The behavior for computing the value is unchanged. This makes more sense since we do not expect the derivative to often be close to zero. This change makes the derivatives about 30% faster across the board. The error for the igamma_der_a is almost unchanged, while for gamma_sample_der_alpha it is a bit worse for float32 and unchanged for float64.
2018-06-07 12:03:58 +01:00
Michael Figurnov
4bd158fa37
Derivative of the incomplete Gamma function and the sample of a Gamma random variable.
...
In addition to igamma(a, x), this code implements:
* igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter
* gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter
The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.
2018-06-06 18:49:26 +01:00
Deven Desai
8fbd47052b
Adding support for using Eigen in HIP kernels.
...
This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor)
Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
2018-06-06 10:12:58 -04:00
Benoit Steiner
e206f8d4a4
Merged in mfigurnov/eigen (pull request PR-400)
...
Exponentially scaled modified Bessel functions of order zero and one.
Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
2018-06-05 17:05:21 +00:00
Penporn Koanantakool
e2ed0cf8ab
Add a ThreadPoolInterface* getter for ThreadPoolDevice.
2018-06-02 12:07:49 -07:00
Michael Figurnov
f216854453
Exponentially scaled modified Bessel functions of order zero and one.
...
The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x)
The code is ported from Cephes and tested against SciPy.
2018-05-31 15:34:53 +01:00
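The relation i0(x) = exp(|x|) i0e(x) can be checked numerically with a short sketch. It uses the textbook power series for I0 rather than the Cephes approximation mentioned above, is only meant for moderate |x|, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Modified Bessel function of the first kind, order zero, via the series
// I0(x) = sum_{k>=0} (x^2/4)^k / (k!)^2. Overflows for large |x|.
inline double i0_series(double x) {
  const double q = 0.25 * x * x;
  double term = 1.0;
  double sum = 1.0;
  for (int k = 1; k < 200; ++k) {
    term *= q / (static_cast<double>(k) * k);
    sum += term;
    if (term < 1e-17 * sum) break;
  }
  return sum;
}

// Exponentially scaled variant: i0e(x) = exp(-|x|) * i0(x).
inline double i0e_series(double x) {
  return std::exp(-std::fabs(x)) * i0_series(x);
}
```

Computing the scaled value from the unscaled one, as done here, gives up the stability benefit; a production version evaluates i0e directly.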
Katrin Leinweber
ea94543190
Hyperlink DOIs against preferred resolver
2018-05-24 18:55:40 +02:00
Vamsi Sripathi
6293ad3f39
Performance improvements to tensor broadcast operation
...
1. Added new packet functions using SIMD for NByOne, OneByN cases
2. Modified existing packet functions to reduce index calculations when input stride is non-SIMD
3. Added 4 test cases to cover the new packet functions
2018-05-23 14:02:05 -07:00
Benoit Steiner
0371380d5b
Merged in rmlarsen/eigen2 (pull request PR-393)
...
Rename scalar_clip_op to scalar_clamp_op to prevent collision with existing functor in TensorFlow.
2018-05-16 21:45:42 +00:00
Rasmus Munk Larsen
b8d36774fa
Rename clip2 to clamp.
2018-05-16 14:04:48 -07:00
Rasmus Munk Larsen
812480baa3
Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow.
2018-05-16 09:49:24 -07:00
Benoit Steiner
1403c2c15b
Merged in didierjansen/eigen (pull request PR-360)
...
Fix bugs and typos in the contraction example of the tensor README
2018-05-16 01:16:36 +00:00
Rasmus Munk Larsen
afec3021f7
Use numext::maxi & numext::mini.
2018-05-14 16:35:39 -07:00
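A scalar clip/clamp functor built from a max and a min could look like the sketch below. It uses `std::max` and `std::min` in place of `numext::maxi` and `numext::mini`, and the struct is a hypothetical stand-in, not Eigen's functor.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical scalar functor: limits x to the closed interval [lo, hi].
template <typename Scalar>
struct ClampSketch {
  ClampSketch(Scalar lo, Scalar hi) : m_lo(lo), m_hi(hi) {}
  Scalar operator()(Scalar x) const {
    return std::min(std::max(x, m_lo), m_hi);
  }
  Scalar m_lo;
  Scalar m_hi;
};
```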
Rasmus Munk Larsen
b8c8e5f436
Add vectorized clip functor for Eigen Tensors.
2018-05-14 16:07:13 -07:00
Benoit Steiner
6118c6ff4f
Enable RawAccess to tensor slices whenever possible.
...
Avoid 32-bit integer overflow in TensorSlicingOp
2018-04-30 11:28:12 -07:00
Gael Guennebaud
2f3287da7d
Fix "used uninitialized" warnings
2018-04-24 17:17:25 +02:00
Gael Guennebaud
3ffd449ef5
Workaround warning
2018-04-24 17:11:51 +02:00
Christoph Hertzberg
84dcd998a9
Recent Adolc versions require C++11
2018-04-13 19:10:23 +02:00
Weiming Zhao
b0eda3cb9f
Avoid using memcpy for non-POD elements
2018-04-11 11:37:06 +02:00
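The usual way to express this distinction is to dispatch on a type trait. The sketch below is C++11 and standalone; the function name is hypothetical and the trait choice (`std::is_trivially_copyable`) is an assumption, not necessarily what the patch tests.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Trivially copyable elements: a raw byte copy is allowed.
template <typename T>
typename std::enable_if<std::is_trivially_copyable<T>::value>::type
copy_elements(const T* src, T* dst, std::size_t n) {
  if (n != 0) std::memcpy(dst, src, n * sizeof(T));
}

// Everything else: copy element by element so copy assignment runs.
template <typename T>
typename std::enable_if<!std::is_trivially_copyable<T>::value>::type
copy_elements(const T* src, T* dst, std::size_t n) {
  std::copy(src, src + n, dst);
}
```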
Gael Guennebaud
67bac6368c
protect calls to isnan
2018-04-03 14:19:04 +02:00
Gael Guennebaud
524119d32a
Fix uninitialized output argument.
2018-04-03 10:56:10 +02:00
Viktor Csomor
000840cae0
Added a move constructor and move assignment operator to Tensor and wrote some tests.
2018-02-07 19:10:54 +01:00
Eugene Zhulenev
c95aacab90
Fix TensorContractionOp evaluators for GPU and SYCL
2018-07-17 14:09:37 -07:00
Deven Desai
f124f07965
applying EIGEN_DECLARE_TEST to *gpu* tests
...
Also, a few minor fixes for GPU tests running in HIP mode.
1. Adding an include for hip/hip_runtime.h in the Macros.h file
For HIP __host__ and __device__ are macros which are defined in hip headers.
Their definitions need to be included before their use in the file.
2. Fixing the compile failure in TensorContractionGpu introduced by the commit to
"Fuse computations into the Tensor contractions using output kernel"
3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit
2018-07-17 14:16:48 -04:00
Gael Guennebaud
82f0ce2726
Get rid of EIGEN_TEST_FUNC, unit tests must now be declared with EIGEN_DECLARE_TEST(mytest) { /* code */ }.
...
This provides several advantages:
- more flexibility in designing unit tests
- unit tests can be glued to speed up compilation
- unit tests are compiled with same predefined macros, which is a requirement for zapcc
2018-07-17 14:46:15 +02:00
Eugene Zhulenev
43206ac4de
Call OutputKernel in evalGemv
2018-07-12 14:52:23 -07:00
Eugene Zhulenev
e204ecdaaf
Remove SimpleThreadPool and always use {NonBlocking}ThreadPool
2018-07-16 15:06:57 -07:00
Eugene Zhulenev
01fd4096d3
Fuse computations into the Tensor contractions using output kernel
2018-07-10 13:16:38 -07:00
Gael Guennebaud
5539587b1f
Some warning fixes
2018-07-17 10:29:12 +02:00
Benoit Steiner
8f55956a57
Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
2018-01-30 20:22:12 +00:00
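For reference, a sketch of the SAME padding convention as TensorFlow documents it, for one spatial dimension. This is an assumption about what "consistent with TensorFlow" means here and is not taken from the patch; the names are hypothetical.

```cpp
#include <cassert>

struct SamePadding {
  long output_size;
  long pad_before;
  long pad_after;
};

// output = ceil(input / stride); total padding is whatever is needed for the
// last window to fit, split so the extra element (if any) goes at the end.
inline SamePadding same_padding(long input_size, long kernel_size, long stride) {
  assert(input_size > 0 && kernel_size > 0 && stride > 0);
  SamePadding p;
  p.output_size = (input_size + stride - 1) / stride;
  long total = (p.output_size - 1) * stride + kernel_size - input_size;
  if (total < 0) total = 0;
  p.pad_before = total / 2;
  p.pad_after = total - p.pad_before;
  return p;
}
```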
Lee.Deokjae
5b3c367926
Fix typos in the contraction example of tensor README
2018-01-06 14:36:19 +09:00
RJ Ryan
59985cfd26
Disable use of recurrence for computing twiddle factors. Fixes FFT precision issues for large FFTs. https://github.com/tensorflow/tensorflow/issues/10749#issuecomment-354557689
2017-12-31 10:44:56 -05:00
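Computing each twiddle factor directly, instead of by repeated multiplication, avoids accumulating rounding error over long recurrences. A standalone sketch, assuming the forward-transform sign convention exp(-2*pi*i*k/n); the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// w[k] = exp(-2*pi*i*k/n), each evaluated from its own angle.
inline std::vector<std::complex<double> > twiddles_direct(std::size_t n) {
  const double pi = std::acos(-1.0);
  std::vector<std::complex<double> > w(n);
  for (std::size_t k = 0; k < n; ++k) {
    const double angle = -2.0 * pi * static_cast<double>(k) /
                         static_cast<double>(n);
    w[k] = std::polar(1.0, angle);
  }
  return w;
}
```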
Gael Guennebaud
73214c4bd0
Workaround nvcc 9.0 issue. See PR 351.
...
https://bitbucket.org/eigen/eigen/pull-requests/351
2017-12-15 14:10:59 +01:00
Yangzihao Wang
3122477c86
Update the padding computation for PADDING_SAME to be consistent with TensorFlow.
2017-12-12 11:15:24 -08:00
Rasmus Munk Larsen
e900b010c8
Improve robustness of igamma and igammac to bad inputs.
...
Check for nan inputs and propagate them immediately. Limit the number of internal iterations to 2000 (same number as used by scipy.special.gammainc). This prevents an infinite loop when the function is called with nan or very large arguments.
Original change by mfirgunov@google.com
2018-03-19 09:04:54 -07:00
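The two safeguards named above (immediate NaN propagation and a cap of 2000 iterations) can be sketched on a plain series evaluation of the regularized lower incomplete gamma function. This is an illustration with a hypothetical name, not Eigen's igamma implementation.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// P(a, x) = x^a e^{-x} / Gamma(a+1) * sum_{n>=0} x^n / ((a+1)...(a+n)).
inline double igamma_series_sketch(double a, double x) {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  if (std::isnan(a) || std::isnan(x)) return nan;  // propagate NaN at once
  if (!(a > 0.0) || x < 0.0) return nan;           // outside the domain
  if (x == 0.0) return 0.0;

  const double eps = std::numeric_limits<double>::epsilon();
  double denom = a;
  double term = 1.0;
  double sum = 1.0;
  // The cap guarantees termination even if the series never converges.
  for (int i = 0; i < 2000; ++i) {
    denom += 1.0;
    term *= x / denom;
    sum += term;
    if (term <= eps * sum) break;
  }
  return sum * std::exp(a * std::log(x) - x - std::lgamma(a + 1.0));
}
```

For a = 1 the result reduces to 1 - exp(-x), which gives a convenient check.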
Gael Guennebaud
00bc67c374
Move KLU support to official
2017-11-10 14:11:22 +01:00
Gael Guennebaud
b82cd93c01
KLU: truly disable unimplemented code, add proper static assertions in solve
2017-11-10 14:09:01 +01:00
Gael Guennebaud
8cf63ccb99
Merged in kylemacfarlan/eigen (pull request PR-337)
...
Add support for SuiteSparse's KLU routines
2017-11-10 10:43:17 +00:00
Gael Guennebaud
1495b98a8e
Merged in spraetor/eigen (pull request PR-305)
...
Issue with mpreal and std::numeric_limits::digits
2017-11-10 10:28:54 +00:00
Gael Guennebaud
fc45324380
Merged in jkflying/eigen-fix-scaling (pull request PR-302)
...
Make scaling work with non-square matrices
2017-11-10 10:11:36 +00:00
Gael Guennebaud
1b2dcf9a47
Check that Schur decomposition succeeds.
2017-11-10 10:26:09 +01:00
Gael Guennebaud
0a1cc73942
bug #1484 : restore deleted line for 128 bits long doubles, and improve dispatching logic.
2017-11-10 10:25:41 +01:00
Benoit Steiner
3949615176
Merged in JonasMu/eigen (pull request PR-329)
...
Added an example for a contraction to a scalar value to README.md
Approved-by: Jonas Harsch <jonas.harsch@gmail.com>
2017-10-27 07:27:46 +00:00
Benoit Steiner
a6d875bac8
Removed unnecessary #include
2017-10-22 08:12:45 -07:00
Benoit Steiner
8eb4b9d254
Merged in benoitsteiner/opencl (pull request PR-341)
2017-10-17 16:39:28 +00:00
Rasmus Munk Larsen
f349507e02
Specialize ThreadPoolDevice::enqueueNotification for the case with no args. As an example, this reduces the binary size of a TensorFlow demo app for Android by about 2.5%.
2017-10-13 15:58:12 -07:00
Kyle Vedder
c0e1d510fd
Add support for SuiteSparse's KLU routines
2017-10-04 21:01:23 -05:00
Mehdi Goli
2062ac9958
Changes required for new ComputeCpp CE version.
2017-09-18 18:17:39 +01:00
Rasmus Munk Larsen
1b7294f6fc
Fix cut-and-paste error.
2017-09-08 16:35:58 -07:00
Rasmus Munk Larsen
94e2213b38
Avoid undefined behavior in Eigen::TensorCostModel::numThreads.
...
If the cost is large enough then the thread count can be larger than the maximum
representable int, so just casting it to an int is undefined behavior.
Contributed by phurst@google.com.
2017-09-08 15:49:55 -07:00
Gael Guennebaud
a91918a105
Merged in infinitei/eigen (pull request PR-328)
...
bug #1464 : Fixes construction of EulerAngles from 3D vector expression.
Approved-by: Tal Hadad <tal_hd@hotmail.com>
Approved-by: Abhijit Kundu <abhijit.kundu@gatech.edu>
2017-09-06 08:42:14 +00:00
Jonas Harsch
a991c80365
Added an example for a contraction to a scalar value, e.g. a double contraction of two second-order tensors, and how you can get the value of the result. I lost one day to get this done so I think it will help some guys. I also added Eigen:: to the IndexPair and array in the same example.
2017-09-01 11:30:26 +00:00
Benoit Steiner
a4089991eb
Added support for CUDA 9.0.
2017-08-31 02:49:39 +00:00
Abhijit Kundu
6d991a9595
bug #1464 : Fixes construction of EulerAngles from 3D vector expression.
2017-08-30 13:26:30 -04:00
Gael Guennebaud
304ef29571
Handle min/max/inf/etc issue in cuda_fp16.h directly in test/main.h
2017-08-24 11:26:41 +02:00
Gael Guennebaud
21633e585b
bug #1462 : remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER
2017-08-24 11:06:47 +02:00
Benoit Steiner
84d7be103a
Fixing Argmax that was breaking upstream TensorFlow.
2017-07-22 03:19:34 +00:00
Benoit Steiner
f0b154a4b0
Code cleanup
2017-07-10 09:54:09 -07:00
Benoit Steiner
575cda76b3
Fixed syntax errors generated by xcode
2017-07-09 11:39:01 -07:00
Benoit Steiner
5ac27d5b51
Avoid relying on cxx11 features when possible.
2017-07-08 21:58:44 -07:00
Benoit Steiner
c5a241ab9b
Merged in benoitsteiner/opencl (pull request PR-323)
...
Improved support for OpenCL
2017-07-07 16:27:33 +00:00
Benoit Steiner
b7ae4dd9ef
Merged in hughperkins/eigen/add-endif-labels-TensorReductionCuda.h (pull request PR-315)
...
Add labels to #ifdef, in TensorReductionCuda.h
2017-07-07 04:23:52 +00:00
Benoit Steiner
9daed67952
Merged in tntnatbry/eigen (pull request PR-319)
...
Tensor Trace op
2017-07-07 04:18:03 +00:00
Benoit Steiner
6795512e59
Improved the randomness of the tensor random generator
2017-07-06 21:12:45 -07:00
Benoit Steiner
dc524ac716
Fixed compilation warning
2017-07-06 21:11:15 -07:00
Benoit Steiner
62b4634ebe
Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14)
...
Applying Benoit's comment for Fixing ImageVolumePatch.
* Applying Benoit's comment for Fixing ImageVolumePatch. Fixing conflict on cmake file.
* Fixing deallocation of the memory in ImagePatch test for SYCL.
* Fixing the automerge issue.
2017-07-06 05:08:13 +00:00
Benoit Steiner
53725c10b8
Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10)
...
DataDependancy
* Wrapping data type to the pointer class for sycl in non-terminal nodes; not having that breaks TensorFlow Conv2d code.
* Applying Ronnan's Comments.
* Applying benoit's comments
2017-06-28 17:55:23 +00:00
Benoit Steiner
b8e805497e
Merged in benoitsteiner/opencl (pull request PR-318)
...
Improved support for OpenCL
2017-06-13 05:01:10 +00:00
Gael Guennebaud
8640093af1
fix compilation in C++98
2017-06-09 12:45:01 +02:00
Hugh Perkins
9341f258d4
Add labels to #ifdef, in TensorReductionCuda.h
2017-06-06 15:51:06 +01:00
Benoit Steiner
1e736b9ead
Merged in mehdi_goli/opencl/SYCLAlignAllocator (pull request PR-7)
...
Fixing SYCL alignment issue required by TensorFlow.
2017-05-26 17:23:00 +00:00
Benoit Steiner
9dee55ec33
Merged eigen/eigen into default
2017-05-26 09:01:04 -07:00
Mehdi Goli
0370d3576e
Applying Ronnan's comments.
2017-05-26 16:01:48 +01:00
Mehdi Goli
e3f964ed55
Applying Benoit's comment; removing dead code.
2017-05-25 11:17:26 +01:00
a-doumoulakis
fb853a857a
Restore misplaced comment
2017-05-24 17:50:15 +01:00
a-doumoulakis
7a8ba565f8
Merge changes from upstream
2017-05-24 17:45:29 +01:00
Mmanu Chaturvedi
2971503fed
Specializing numeric_limits For AutoDiffScalar
2017-05-23 17:12:36 -04:00
Gael Guennebaud
26e8f9171e
Fix compilation of matrix log with Map as input
2017-06-07 10:51:23 +02:00
Benoit Steiner
615733381e
Merged in mehdi_goli/opencl/FixingCmakeDependency (pull request PR-2)
...
Fixing Cmake Dependency for SYCL
2017-05-22 17:43:06 +00:00
Mehdi Goli
76c0fc1f95
Fixing SYCL alignment issue required by TensorFlow.
2017-05-22 16:49:32 +01:00
Mehdi Goli
2d17128d6f
Fixing supported device list.
2017-05-22 16:40:33 +01:00
Mehdi Goli
61d7f3664a
Fixing Cmake Dependency for SYCL
2017-05-22 14:58:28 +01:00
a-doumoulakis
052426b824
Add support for triSYCL
...
Eigen is now able to use triSYCL with EIGEN_SYCL_TRISYCL and TRISYCL_INCLUDE_DIR options
Fix contraction kernel with correct nd_item dimension
2017-05-05 19:26:27 +01:00
RJ Ryan
949a2da38c
Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer.
...
Improves support for std::complex types when compiling for CUDA.
Expands on e2e9cdd169 and 2bda1b0d93.
2017-04-14 13:23:35 -07:00
Benoit Steiner
0d08165a7f
Merged in benoitsteiner/opencl (pull request PR-309)
...
OpenCL improvements
2017-04-05 14:28:08 +00:00
Benoit Steiner
068cc09708
Preserve file naming conventions
2017-04-04 10:09:10 -07:00
Benoit Steiner
c302ea7bc4
Deleted empty line of code
2017-04-04 10:05:16 -07:00
Benoit Steiner
a5a0c8fac1
Guard sycl specific code under an EIGEN_USE_SYCL ifdef
2017-04-04 10:03:21 -07:00
Benoit Steiner
a1304b95b7
Code cleanup
2017-04-04 10:00:46 -07:00