eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Deven Desai	c64fe9ea1f	Updates to fix HIP-clang specific compile errors. Compiling the eigen unittests with hip-clang (HIP with clang as the underlying compiler instead of hcc or nvcc), results in compile errors. The changes in this commit fix those compile errors. The main change is to convert a few instances of "__device__" to "EIGEN_DEVICE_FUNC"	2018-08-30 20:22:16 +00:00
Rasmus Munk Larsen	8b3d9ed081	Use padding instead of alignment attribute, which MaxSizeVector does not respect. This leads to undefined behavior and hard-to-trace bugs.	2018-09-05 11:20:06 -07:00
Christoph Hertzberg	ba2c8efdcf	EIGEN_UNUSED is not supported by g++4.7 (and not portable)	2018-09-12 11:49:10 +02:00
Christoph Hertzberg	ff4e835d6b	"sparse_product.cpp" must be included before "sparse_basic.cpp", otherwise EIGEN_SPARSE_CREATE_TEMPORARY_PLUGIN has no effect	2018-08-30 20:10:11 +02:00
Christoph Hertzberg	023ed6b9a8	Product of empty array must be 1 and not 0.	2018-08-30 17:14:52 +02:00
Christoph Hertzberg	c2f4e8c08e	Fix integer conversion warning	2018-08-30 17:12:53 +02:00
Deven Desai	946c3e2544	adding EIGEN_DEVICE_FUNC attribute to fix some GPU unit tests that are broken in HIP mode	2018-08-27 23:04:08 +00:00
Christoph Hertzberg	20ba2eee6d	gcc thinks this may not be initialized	2018-08-28 18:33:24 +02:00
Christoph Hertzberg	73ca600bca	Fix numerous shadow-warnings for GCC<=4.8	2018-08-28 18:32:39 +02:00
Christoph Hertzberg	42f3ee4fb8	Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop Workaround: Don't include "DisableStupidWarnings.h" before including other main-headers	2018-08-28 11:44:15 +02:00
Eugene Zhulenev	c144bb355b	Merge with upstream eigen/default	2018-08-27 14:34:07 -07:00
Christoph Hertzberg	b1653d1599	Fix some trivial C++11 vs C++03 compatibility warnings	2018-08-25 12:21:00 +02:00
Christoph Hertzberg	42123ff38b	Make unit test C++03 compatible	2018-08-25 11:53:28 +02:00
Christoph Hertzberg	117bc5d505	Fix some shadow warnings	2018-08-25 09:06:08 +02:00
Christoph Hertzberg	f155e97adb	Previous fix broke compilation for clang	2018-08-25 00:10:46 +02:00
Christoph Hertzberg	209b4972ec	Fix conversion warning	2018-08-25 00:02:46 +02:00
Christoph Hertzberg	495f6c3c3a	Fix missing-braces warnings	2018-08-24 23:56:13 +02:00
Christoph Hertzberg	5aaedbeced	Fixed more sign-compare and type-limits warnings	2018-08-24 23:54:12 +02:00
Christoph Hertzberg	8295f02b36	Hide "maybe uninitialized" warning on gcc	2018-08-24 23:22:20 +02:00
Christoph Hertzberg	f7675b826b	Fix several integer conversion and sign-compare warnings	2018-08-24 22:58:55 +02:00
Rasmus Munk Larsen	744e2fe0de	Address comments about EIGEN_THREAD_LOCAL.	2018-08-24 10:24:54 -07:00
Rasmus Munk Larsen	8d9bc5cc02	Fix g++ compilation.	2018-08-23 13:06:39 -07:00
Rasmus Munk Larsen	e9f9d70611	Don't rely on __had_feature for g++. Don't use __thread. Only use thread_local for gcc 4.8 or newer.	2018-08-23 12:59:46 -07:00
Rasmus Munk Larsen	668690978f	Pad PerThread when we emulate thread_local to prevent false sharing.	2018-08-23 12:54:33 -07:00
Rasmus Munk Larsen	6cedc5a9b3	rename mu.	2018-08-23 12:11:58 -07:00
Rasmus Munk Larsen	6e0464004a	Store std::unique_ptr instead of raw pointers in per_thread_map_.	2018-08-23 12:10:08 -07:00
Rasmus Munk Larsen	e51d9e473a	Protect #undef max with #ifdef max.	2018-08-23 11:42:05 -07:00
Rasmus Munk Larsen	d35880ed91	merge	2018-08-23 11:36:49 -07:00
Christoph Hertzberg	a709c8efb4	Replace pointers by values or unique_ptr for better leak-safety	2018-08-23 19:41:59 +02:00
Christoph Hertzberg	39335cf51e	Make MaxSizeVector leak-safe	2018-08-23 19:37:56 +02:00
Benoit Steiner	ff8e0ecc2f	Updated one more line of code to avoid making the test dependent on cxx11 features.	2018-08-17 15:15:52 -07:00
Benoit Steiner	43d9dd9b28	Removed more dependencies on cxx11.	2018-08-17 08:49:32 -07:00
Christoph Hertzberg	4713465eef	Silence double-promotion warning	2018-08-17 16:39:43 +02:00
Christoph Hertzberg	c9b25fbefa	Silence unused parameter warning	2018-08-17 16:28:28 +02:00
Christoph Hertzberg	dbdeceabdd	Silence double-promotion warning (when converting double to complex<long double>)	2018-08-17 16:26:11 +02:00
Benoit Steiner	19df4d5752	Merged in codeplaysoftware/eigen-upstream-pure/Pointer_type_creation (pull request PR-461) Creating a pointer type in TensorCustomOp.h	2018-08-16 18:28:33 +00:00
Benoit Steiner	f641cf1253	Adding missing at method in Eigen::array	2018-08-16 11:24:37 -07:00
Benoit Steiner	ede580ccda	Avoid using the auto keyword to make the tensor block access test more portable	2018-08-16 10:49:47 -07:00
Benoit Steiner	e23c8c294e	Use actual types instead of the auto keyword to make the code more portable	2018-08-16 10:41:01 -07:00
Mehdi Goli	80f1a76dec	removing the noises.	2018-08-16 13:33:24 +01:00
Mehdi Goli	d0b01ebbf6	Reverting the unintended delete from the code.	2018-08-16 13:21:36 +01:00
Mehdi Goli	161dcbae9b	Using PointerType struct and specializing it per device for TensorCustomOp.h	2018-08-16 00:07:02 +01:00
Sameer Agarwal	f197c3f55b	Removed an unused variable (PacketSize) from TensorExecutor	2018-08-15 11:24:57 -07:00
Benoit Steiner	4181556907	Fixed the tensor contraction code.	2018-08-15 09:34:47 -07:00
Benoit Steiner	b6f96cf7dd	Removed dependencies on cxx11 language features from the tensor_block_access test	2018-08-15 08:54:31 -07:00
Benoit Steiner	fbb834144d	Fixed more compilation errors	2018-08-15 08:52:58 -07:00
Benoit Steiner	6bb3f1b43e	Made the tensor_block_access test compile again	2018-08-14 14:26:59 -07:00
Benoit Steiner	43ec0082a6	Made the kronecker_product test compile again	2018-08-14 14:08:36 -07:00
Benoit Steiner	ab3f481141	Cleaned up the code and make it compile with more compilers	2018-08-14 14:05:46 -07:00
Rasmus Munk Larsen	fa0bcbf230	merge	2018-08-14 12:18:31 -07:00
Rasmus Munk Larsen	15d4f515e2	Use plain_assert in destructors to avoid throwing in CXX11 tests where main.h overwrites eigen_assert with a throwing version.	2018-08-14 12:17:46 -07:00
Rasmus Munk Larsen	aebdb06424	Fix a few compiler warnings in CXX11 tests.	2018-08-14 12:06:39 -07:00
Rasmus Munk Larsen	2a98bd9c8e	Merged eigen/eigen into default	2018-08-14 12:02:09 -07:00
Benoit Steiner	59bba77ead	Fixed compilation errors with gcc 4.7 and 4.8	2018-08-14 10:54:48 -07:00
Mehdi Goli	8ba799805b	Merge with upstream	2018-08-14 09:43:45 +01:00
Rasmus Munk Larsen	6d6e7b7027	merge	2018-08-13 15:34:50 -07:00
Rasmus Munk Larsen	9bb75d8d31	Add Barrier.h.	2018-08-13 15:34:03 -07:00
Rasmus Munk Larsen	2e1adc0324	Merged eigen/eigen into default	2018-08-13 15:32:00 -07:00
Rasmus Munk Larsen	8278ae6313	Add support for thread local support on platforms that do not support it through emulation using a hash map.	2018-08-13 15:31:23 -07:00
Benoit Steiner	501be70b27	Code cleanup	2018-08-13 15:16:40 -07:00
Benoit Steiner	3d3711f22f	Fixed compilation errors.	2018-08-13 15:16:06 -07:00
Gael Guennebaud	3ec60215df	Merged in rmlarsen/eigen2 (pull request PR-466) Move sigmoid functor to core and rename it to 'logistic'.	2018-08-13 21:28:20 +00:00
Rasmus Munk Larsen	0f1b2e08a5	Call logistic functor from Tensor::sigmoid.	2018-08-13 11:52:58 -07:00
Benoit Steiner	26239ee580	Use NULL instead of nullptr to avoid adding a cxx11 requirement.	2018-08-13 11:05:51 -07:00
Benoit Steiner	3810ec228f	Don't use the auto keyword since it's not always supported properly.	2018-08-13 10:46:09 -07:00
Benoit Steiner	e6d5be811d	Fixed syntax of nested templates chevrons to make it compatible with c++98 mode.	2018-08-13 10:29:21 -07:00
Mehdi Goli	1aa86aad14	Merge with upstream.	2018-08-13 15:40:31 +01:00
Eugene Zhulenev	35d90e8960	Fix BlockAccess enum in CwiseUnaryOp evaluator	2018-08-10 17:37:58 -07:00
Eugene Zhulenev	855b68896b	Merge with eigen/default	2018-08-10 17:18:42 -07:00
Eugene Zhulenev	f2209d06e4	Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators	2018-08-10 16:53:36 -07:00
Benoit Steiner	c8ea398675	Avoided language features that are only available in cxx11 mode.	2018-08-10 13:02:41 -07:00
Benoit Steiner	4be4286224	Made the code compile with gcc 5.4.	2018-08-10 11:32:58 -07:00
Eugene Zhulenev	cfaedb38cd	Fix bug in a test + compilation errors	2018-08-09 09:44:07 -07:00
Mehdi Goli	ea8fa5e86f	Merge with upstream	2018-08-09 14:07:56 +01:00
Mehdi Goli	8c083bfd0e	Properly fixing the PointerType for TensorCustomOp.h, as the output type here should be based on CoeffReturnType, not the Scalar type. Therefore, similar to the reduction and evalTo functions, it should have its own MakePointer class. In this case, for other devices the type is defaulted to CoeffReturnType and no change is required in users' code. However, in SYCL, on the device, we can reconstruct the device type.	2018-08-09 13:57:43 +01:00
Eugene Zhulenev	1c8b9e10a7	Merged with upstream eigen	2018-08-08 16:57:58 -07:00
Benoit Steiner	131ed1191f	Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462) Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.	2018-08-08 18:14:15 +00:00
Mehdi Goli	532a0be05c	Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.	2018-08-08 12:12:26 +01:00
Mehdi Goli	3055e3a7c2	Creating a pointer type in TensorCustomOp.h	2018-08-08 11:19:02 +01:00
Rasmus Munk Larsen	693fb1d41e	Fix init order.	2018-08-07 17:18:51 -07:00
Benoit Steiner	10d286f55b	Silenced a couple of compilation warnings.	2018-08-06 16:00:29 -07:00
Benoit Steiner	d011d05fd6	Fixed compilation errors.	2018-08-06 13:40:51 -07:00
Rasmus Munk Larsen	36e7e7dd8f	Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation.	2018-08-06 13:16:32 -07:00
Rasmus Munk Larsen	fa68342ef8	Move sigmoid functor to core.	2018-08-03 17:31:23 -07:00
Gael Guennebaud	09c81ac033	bug #1451 : fix numeric_limits<AutoDiffScalar<Der>> with a reference as derivative type	2018-08-04 00:17:37 +02:00
Christoph Hertzberg	edfb7962fd	Use `static const int` instead of `enum` to avoid numerous `local-type-template-args` warnings in C++03 mode	2018-09-07 14:08:39 +02:00
Eugene Zhulenev	1b0373ae10	Replace all using declarations with typedefs in Tensor ops	2018-08-01 15:55:46 -07:00
Rasmus Munk Larsen	bcb29f890c	Fix initialization order.	2018-08-03 10:18:53 -07:00
Mehdi Goli	3074b1ff9e	Fixing the compilation error.	2018-08-03 17:13:44 +01:00
Mehdi Goli	225fa112aa	Merge with upstream.	2018-08-03 17:04:08 +01:00
Mehdi Goli	01358300d5	Creating separate SYCL required PR for uncontroversial files.	2018-08-03 16:59:15 +01:00
Benoit Steiner	dd5875e30d	Merged in codeplaysoftware/eigen-upstream-pure/constructor_error_clang (pull request PR-451) Fixing ambiguous constructor error for Clang compiler.	2018-08-02 20:46:03 +00:00
Mehdi Goli	516d2621b9	fixing compilation error for cxx11_tensor_trace.cpp error on Microsoft Visual Studio.	2018-08-02 14:30:48 +01:00
Mehdi Goli	40d6d020a0	Fixing ambiguous constructor error for Clang compiler.	2018-08-02 13:34:53 +01:00
Eugene Zhulenev	64abdf1d7e	Fix typo + get rid of redundant member variables for block sizes	2018-08-01 12:35:19 -07:00
Benoit Steiner	93b9e36e10	Merged in paultucker/eigen (pull request PR-431) Optional ThreadPoolDevice allocator Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-08-01 19:14:34 +00:00
Eugene Zhulenev	385b3ff12f	Merged latest changes from upstream/eigen	2018-08-01 11:59:04 -07:00
Benoit Steiner	17221115c9	Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447) Adding variadic version of assert which can take a parameter pack as its input.	2018-08-01 16:41:54 +00:00
Benoit Steiner	0360c36170	Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446) Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	2018-08-01 16:13:15 +00:00
Mehdi Goli	c6a5c70712	Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h	2018-08-01 16:56:26 +01:00
Benoit Steiner	45f75f1ace	Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449) Enabling per device specialisation of packetSize.	2018-08-01 15:43:03 +00:00
Mehdi Goli	af96018b49	Using the suggested modification.	2018-08-01 16:04:44 +01:00
Mehdi Goli	b512a9536f	Enabling per device specialisation of packetsize.	2018-08-01 13:39:13 +01:00
Mehdi Goli	3a197a60e6	variadic version of assert which can take a parameter pack as its input.	2018-08-01 12:19:14 +01:00
Mehdi Goli	d7a8414848	Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	2018-08-01 11:56:30 +01:00
Mehdi Goli	9e219bb3d3	Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO.	2018-08-01 10:47:49 +01:00
Eugene Zhulenev	83c0a16baf	Add block evaluation support to TensorOps	2018-07-31 15:56:31 -07:00
Benoit Steiner	edf46bd7a2	Merged in yuefengz/eigen (pull request PR-370) Use device's allocate function instead of internal::aligned_malloc.	2018-07-31 22:38:28 +00:00
Paul Tucker	385f7b8d0c	Change getAllocator() to allocator() in ThreadPoolDevice.	2018-07-31 13:52:18 -07:00
Mark D Ryan	6f5b126e6d	Fix tensor contraction for AVX512 machines This patch modifies the TensorContraction class to ensure that the kc_ field is always a multiple of the packet_size, if the packet_size is > 8. Without this change spatial convolutions in Tensorflow do not work properly as the code that re-arranges the input matrices can assert if kc_ is not a multiple of the packet_size. This leads to a unit test failure, //tensorflow/python/kernel_tests:conv_ops_test, on AVX512 builds of tensorflow.	2018-07-31 09:33:37 +01:00
Gael Guennebaud	678a0dcb12	Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) Tiled tensor executor	2018-07-31 08:13:00 +00:00
Gael Guennebaud	679eece876	Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.	2018-07-31 10:10:14 +02:00
Eugene Zhulenev	966c2a7bb6	Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible	2018-07-27 12:45:17 -07:00
Eugene Zhulenev	6913221c43	Add tiled evaluation support to TensorExecutor	2018-07-25 13:51:10 -07:00
Rasmus Munk Larsen	e478532625	Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.	2018-07-27 12:36:34 -07:00
Christoph Hertzberg	5e79402b4a	fix warnings for doc-eigen-prerequisites	2018-07-24 21:59:15 +02:00
Christoph Hertzberg	5f79b7f9a9	Removed several shadowing types and use global Index typedef everywhere	2018-07-25 21:47:45 +02:00
Christoph Hertzberg	44ee201337	Rename variable which shadows class name	2018-07-25 20:26:15 +02:00
Gustavo Lima Chaves	705f66a9ca	Account for missing change on commit "Remove SimpleThreadPool and..." "... always use {NonBlocking}ThreadPool". It seems the non-blocking implementation was made the default/only one, but a reference to the old name was left unmodified. Fix that.	2018-07-23 16:29:09 -07:00
Eugene Zhulenev	d55efa6f0f	TensorBlockIO	2018-07-23 15:50:55 -07:00
Eugene Zhulenev	34a75c3c5c	Initial support of TensorBlock	2018-07-20 17:37:20 -07:00
Gustavo Lima Chaves	02eaaacbc5	Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail right away without this, as this test seems to rely on those language features. The skip under compilation with MSVC was kept.	2018-07-20 16:08:40 -07:00
Paul Tucker	d4afccde5a	Add test coverage for ThreadPoolDevice optional allocator.	2018-07-19 17:43:44 -07:00
Eugene Zhulenev	c58b874727	PR430: Convert count to the reducer type in MeanReducer Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails. cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade. static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.	2018-07-19 17:37:03 -07:00
Paul Tucker	4e9848fa86	Actually add optional Allocator* arg to ThreadPoolDevice().	2018-07-16 17:53:36 -07:00
Paul Tucker	b3e7c9132d	Add optional Allocator argument to ThreadPoolDevice constructor. When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted to a single NUMA-node.	2018-07-16 17:26:05 -07:00
Gael Guennebaud	add5757488	Simplify handling of non-splitted tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folder.	2018-07-16 18:55:40 +02:00
Gael Guennebaud	901c7d31f0	Fix usage of EIGEN_SPLIT_LARGE_TESTS=ON: some unit tests, such as indexed_view have to be split unconditionally.	2018-07-16 18:35:05 +02:00
Rasmus Munk Larsen	3a9cf4e290	Get rid of alias for m_broadcast.	2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen	4222550e17	Optimize the case where broadcasting is a no-op.	2018-07-13 16:12:38 -07:00
Gael Guennebaud	1920129d71	Remove clang warning	2018-07-13 16:05:35 +02:00
Gael Guennebaud	06eb24cf4d	Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.	2018-07-13 16:04:27 +02:00
David Hyde	d908afe35f	bug #1558 : fix a corner case in MINRES when both v_new and w_new vanish.	2018-07-08 22:06:38 -07:00
Eugene Zhulenev	6e654f3379	Reduce number of allocations in TensorContractionThreadPool.	2018-07-16 14:26:39 -07:00
Gael Guennebaud	7ccb623746	bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.	2018-07-19 13:15:40 +02:00
Eugene Zhulenev	e3c2d61739	Assert that no output kernel is defined for GPU contraction	2018-07-18 14:34:22 -07:00
Eugene Zhulenev	79d4129cce	Specify default output kernel for TensorContractionOp	2018-07-18 14:21:01 -07:00
Gael Guennebaud	44ea5f7623	Add unit test for -Tensor<complex> on GPU	2018-07-12 17:19:38 +02:00
Thales Sabino	9a6a43319f	Fix cxx11_tensor_fft not building on Windows. The type used in Eigen::DSizes needs to be at least 8 bytes long. Internally Tensor tries to convert this to an __int64 on Windows and this fails to build. On Linux, long and long long are both 8 byte integer types. * * * Changing from "long long" to "std::int64_t".	2018-07-12 11:20:59 +01:00
Gael Guennebaud	b347eb0b1c	Fix doc	2018-07-12 11:56:18 +02:00
Yuefeng Zhou	1eff6cf8a7	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	2018-02-20 16:50:05 -08:00
Gael Guennebaud	6cd6551b26	Add deprecated header files for TensorFlow	2018-07-12 10:50:53 +02:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	471cfe5ff7	renaming CUDA* to GPU* for some header files	2018-07-11 09:22:04 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00
Gael Guennebaud	6190aa5632	bug #1567 : add optimized path for tensor broadcasting and 'Channel First' shape	2018-07-09 11:23:16 +02:00
Deven Desai	1bb6fa99a3	merging the CUDA and HIP implementation for the Tensor directory and the unit tests	2018-06-20 16:44:58 -04:00
Deven Desai	cfdabbcc8f	removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories	2018-06-20 12:57:02 -04:00
Deven Desai	7e41c8f1a9	renaming Cuda files to Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories	2018-06-20 12:52:30 -04:00
Deven Desai	b6cc0961b1	updates based on PR feedback There are two major changes (and a few minor ones which are not listed here...see PR discussion for details) 1. Eigen::half implementations for HIP and CUDA have been merged. This means that - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h` - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h` - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h` After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate. - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)` - `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)` - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`	2018-06-14 10:21:54 -04:00
Deven Desai	d1d22ef0f4	syncing this fork with upstream	2018-06-13 12:09:52 -04:00
Benoit Steiner	d3a380af4d	Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403) Derivative of the incomplete Gamma function and the sample of a Gamma random variable Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-06-11 17:57:47 +00:00
Gael Guennebaud	67ec37f7b0	Activate dgmres unit test	2018-07-02 12:54:14 +02:00
Jonathan Liu	b7689bded9	Use std::complex constructor instead of assignment from scalar Fixes GCC conversion to non-scalar type requested compile error when using boost::multiprecision::cpp_dec_float_50 as scalar type.	2018-06-28 00:32:37 +10:00
Rasmus Munk Larsen	5418154a45	Fix oversharding bug in parallelFor.	2018-06-20 17:51:48 -07:00
Gael Guennebaud	7933267c67	fix prototype	2018-06-08 09:56:01 +02:00
Michael Figurnov	30fa3d0454	Merge from eigen/eigen	2018-06-07 17:57:56 +01:00
Michael Figurnov	6c71c7d360	Merge from eigen/eigen.	2018-06-07 15:54:18 +01:00
Gael Guennebaud	37348d03ae	Fix int versus Index	2018-06-07 15:56:43 +02:00
Michael Figurnov	aa813d417b	Fix compilation of special functions without C99 math. The commit with Bessel functions i0e and i1e placed the ifdef/endif incorrectly, causing i0e/i1e to be undefined when EIGEN_HAS_C99_MATH=0. These functions do not actually require C99 math, so now they are always available.	2018-06-07 14:35:07 +01:00
Gael Guennebaud	b3fd93207b	Fix typos found using codespell	2018-06-07 14:43:02 +02:00
Michael Figurnov	5172a32849	Updated the stopping criteria in igammac_cf_impl. Previously, when computing the derivative, it used a relative error threshold. Now it uses an absolute error threshold. The behavior for computing the value is unchanged. This makes more sense since we do not expect the derivative to often be close to zero. This change makes the derivatives about 30% faster across the board. The error for the igamma_der_a is almost unchanged, while for gamma_sample_der_alpha it is a bit worse for float32 and unchanged for float64.	2018-06-07 12:03:58 +01:00
Michael Figurnov	4bd158fa37	Derivative of the incomplete Gamma function and the sample of a Gamma random variable. In addition to igamma(a, x), this code implements: * igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter * gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.	2018-06-06 18:49:26 +01:00
Deven Desai	8fbd47052b	Adding support for using Eigen in HIP kernels. This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs. Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor) Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.	2018-06-06 10:12:58 -04:00
Benoit Steiner	e206f8d4a4	Merged in mfigurnov/eigen (pull request PR-400) Exponentially scaled modified Bessel functions of order zero and one. Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-06-05 17:05:21 +00:00
Penporn Koanantakool	e2ed0cf8ab	Add a ThreadPoolInterface* getter for ThreadPoolDevice.	2018-06-02 12:07:49 -07:00
Michael Figurnov	f216854453	Exponentially scaled modified Bessel functions of order zero and one. The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x) The code is ported from Cephes and tested against SciPy.	2018-05-31 15:34:53 +01:00
Katrin Leinweber	ea94543190	Hyperlink DOIs against preferred resolver	2018-05-24 18:55:40 +02:00
Vamsi Sripathi	6293ad3f39	Performance improvements to tensor broadcast operation 1. Added new packet functions using SIMD for NByOne, OneByN cases 2. Modified existing packet functions to reduce index calculations when input stride is non-SIMD 3. Added 4 test cases to cover the new packet functions	2018-05-23 14:02:05 -07:00
Benoit Steiner	0371380d5b	Merged in rmlarsen/eigen2 (pull request PR-393) Rename scalar_clip_op to scalar_clamp_op to prevent collision with existing functor in TensorFlow.	2018-05-16 21:45:42 +00:00
Rasmus Munk Larsen	b8d36774fa	Rename clip2 to clamp.	2018-05-16 14:04:48 -07:00
Rasmus Munk Larsen	812480baa3	Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow.	2018-05-16 09:49:24 -07:00
Benoit Steiner	1403c2c15b	Merged in didierjansen/eigen (pull request PR-360) Fix bugs and typos in the contraction example of the tensor README	2018-05-16 01:16:36 +00:00
Rasmus Munk Larsen	afec3021f7	Use numext::maxi & numext::mini.	2018-05-14 16:35:39 -07:00
Rasmus Munk Larsen	b8c8e5f436	Add vectorized clip functor for Eigen Tensors.	2018-05-14 16:07:13 -07:00
Benoit Steiner	6118c6ff4f	Enable RawAccess to tensor slices whenever possible. Avoid 32-bit integer overflow in TensorSlicingOp	2018-04-30 11:28:12 -07:00
Gael Guennebaud	2f3287da7d	Fix "used uninitialized" warnings	2018-04-24 17:17:25 +02:00
Gael Guennebaud	3ffd449ef5	Workaround warning	2018-04-24 17:11:51 +02:00
Christoph Hertzberg	84dcd998a9	Recent Adolc versions require C++11	2018-04-13 19:10:23 +02:00
Weiming Zhao	b0eda3cb9f	Avoid using memcpy for non-POD elements	2018-04-11 11:37:06 +02:00
Gael Guennebaud	67bac6368c	protect calls to isnan	2018-04-03 14:19:04 +02:00
Gael Guennebaud	524119d32a	Fix uninitialized output argument.	2018-04-03 10:56:10 +02:00
Viktor Csomor	000840cae0	Added a move constructor and move assignment operator to Tensor and wrote some tests.	2018-02-07 19:10:54 +01:00
Eugene Zhulenev	c95aacab90	Fix TensorContractionOp evaluators for GPU and SYCL	2018-07-17 14:09:37 -07:00
Deven Desai	f124f07965	applying EIGEN_DECLARE_TEST to gpu tests Also, a few minor fixes for GPU tests running in HIP mode. 1. Adding an include for hip/hip_runtime.h in the Macros.h file For HIP __host__ and __device__ are macros which are defined in hip headers. Their definitions need to be included before their use in the file. 2. Fixing the compile failure in TensorContractionGpu introduced by the commit to "Fuse computations into the Tensor contractions using output kernel" 3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit	2018-07-17 14:16:48 -04:00
Gael Guennebaud	82f0ce2726	Get rid of EIGEN_TEST_FUNC, unit tests must now be declared with EIGEN_DECLARE_TEST(mytest) { /* code */ }. This provides several advantages: - more flexibility in designing unit tests - unit tests can be glued to speed up compilation - unit tests are compiled with same predefined macros, which is a requirement for zapcc	2018-07-17 14:46:15 +02:00
Eugene Zhulenev	43206ac4de	Call OutputKernel in evalGemv	2018-07-12 14:52:23 -07:00
Eugene Zhulenev	e204ecdaaf	Remove SimpleThreadPool and always use {NonBlocking}ThreadPool	2018-07-16 15:06:57 -07:00
Eugene Zhulenev	01fd4096d3	Fuse computations into the Tensor contractions using output kernel	2018-07-10 13:16:38 -07:00
Gael Guennebaud	5539587b1f	Some warning fixes	2018-07-17 10:29:12 +02:00
Benoit Steiner	8f55956a57	Update the padding computation for PADDING_SAME to be consistent with TensorFlow.	2018-01-30 20:22:12 +00:00
Lee.Deokjae	5b3c367926	Fix typos in the contraction example of tensor README	2018-01-06 14:36:19 +09:00
RJ Ryan	59985cfd26	Disable use of recurrence for computing twiddle factors. Fixes FFT precision issues for large FFTs. https://github.com/tensorflow/tensorflow/issues/10749#issuecomment-354557689	2017-12-31 10:44:56 -05:00
Gael Guennebaud	73214c4bd0	Workaround nvcc 9.0 issue. See PR 351. https://bitbucket.org/eigen/eigen/pull-requests/351	2017-12-15 14:10:59 +01:00
Yangzihao Wang	3122477c86	Update the padding computation for PADDING_SAME to be consistent with TensorFlow.	2017-12-12 11:15:24 -08:00
Rasmus Munk Larsen	e900b010c8	Improve robustness of igamma and igammac to bad inputs. Check for nan inputs and propagate them immediately. Limit the number of internal iterations to 2000 (same number as used by scipy.special.gammainc). This prevents an infinite loop when the function is called with nan or very large arguments. Original change by mfirgunov@google.com	2018-03-19 09:04:54 -07:00
Gael Guennebaud	00bc67c374	Move KLU support to official	2017-11-10 14:11:22 +01:00
Gael Guennebaud	b82cd93c01	KLU: truly disable unimplemented code, add proper static assertions in solve	2017-11-10 14:09:01 +01:00
Gael Guennebaud	8cf63ccb99	Merged in kylemacfarlan/eigen (pull request PR-337) Add support for SuiteSparse's KLU routines	2017-11-10 10:43:17 +00:00
Gael Guennebaud	1495b98a8e	Merged in spraetor/eigen (pull request PR-305) Issue with mpreal and std::numeric_limits::digits	2017-11-10 10:28:54 +00:00
Gael Guennebaud	fc45324380	Merged in jkflying/eigen-fix-scaling (pull request PR-302) Make scaling work with non-square matrices	2017-11-10 10:11:36 +00:00
Gael Guennebaud	1b2dcf9a47	Check that Schur decomposition succeeds.	2017-11-10 10:26:09 +01:00
Gael Guennebaud	0a1cc73942	bug #1484 : restore deleted line for 128 bits long doubles, and improve dispatching logic.	2017-11-10 10:25:41 +01:00
Benoit Steiner	3949615176	Merged in JonasMu/eigen (pull request PR-329) Added an example for a contraction to a scalar value to README.md Approved-by: Jonas Harsch <jonas.harsch@gmail.com>	2017-10-27 07:27:46 +00:00
Benoit Steiner	a6d875bac8	Removed unnecessary #include	2017-10-22 08:12:45 -07:00
Benoit Steiner	8eb4b9d254	Merged in benoitsteiner/opencl (pull request PR-341)	2017-10-17 16:39:28 +00:00
Rasmus Munk Larsen	f349507e02	Specialize ThreadPoolDevice::enqueueNotification for the case with no args. As an example this reduces binary size of a TensorFlow demo app for Android by about 2.5%.	2017-10-13 15:58:12 -07:00
Kyle Vedder	c0e1d510fd	Add support for SuiteSparse's KLU routines	2017-10-04 21:01:23 -05:00
Mehdi Goli	2062ac9958	Changes required for new ComputeCpp CE version.	2017-09-18 18:17:39 +01:00
Rasmus Munk Larsen	1b7294f6fc	Fix cut-and-paste error.	2017-09-08 16:35:58 -07:00
Rasmus Munk Larsen	94e2213b38	Avoid undefined behavior in Eigen::TensorCostModel::numThreads. If the cost is large enough then the thread count can be larger than the maximum representable int, so just casting it to an int is undefined behavior. Contributed by phurst@google.com.	2017-09-08 15:49:55 -07:00
Gael Guennebaud	a91918a105	Merged in infinitei/eigen (pull request PR-328) bug #1464 : Fixes construction of EulerAngles from 3D vector expression. Approved-by: Tal Hadad <tal_hd@hotmail.com> Approved-by: Abhijit Kundu <abhijit.kundu@gatech.edu>	2017-09-06 08:42:14 +00:00
Jonas Harsch	a991c80365	Added an example for a contraction to a scalar value, e.g. a double contraction of two second order tensors and how you can get the value of the result. I lost one day to get this done so I think it will help some guys. I also added Eigen:: to the IndexPair and array in the same example.	2017-09-01 11:30:26 +00:00
Benoit Steiner	a4089991eb	Added support for CUDA 9.0.	2017-08-31 02:49:39 +00:00
Abhijit Kundu	6d991a9595	bug #1464 : Fixes construction of EulerAngles from 3D vector expression.	2017-08-30 13:26:30 -04:00
Gael Guennebaud	304ef29571	Handle min/max/inf/etc issue in cuda_fp16.h directly in test/main.h	2017-08-24 11:26:41 +02:00
Gael Guennebaud	21633e585b	bug #1462 : remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER	2017-08-24 11:06:47 +02:00
Benoit Steiner	84d7be103a	Fixing Argmax that was breaking upstream TensorFlow.	2017-07-22 03:19:34 +00:00
Benoit Steiner	f0b154a4b0	Code cleanup	2017-07-10 09:54:09 -07:00
Benoit Steiner	575cda76b3	Fixed syntax errors generated by xcode	2017-07-09 11:39:01 -07:00
Benoit Steiner	5ac27d5b51	Avoid relying on cxx11 features when possible.	2017-07-08 21:58:44 -07:00
Benoit Steiner	c5a241ab9b	Merged in benoitsteiner/opencl (pull request PR-323) Improved support for OpenCL	2017-07-07 16:27:33 +00:00
Benoit Steiner	b7ae4dd9ef	Merged in hughperkins/eigen/add-endif-labels-TensorReductionCuda.h (pull request PR-315) Add labels to #ifdef, in TensorReductionCuda.h	2017-07-07 04:23:52 +00:00
Benoit Steiner	9daed67952	Merged in tntnatbry/eigen (pull request PR-319) Tensor Trace op	2017-07-07 04:18:03 +00:00
Benoit Steiner	6795512e59	Improved the randomness of the tensor random generator	2017-07-06 21:12:45 -07:00
Benoit Steiner	dc524ac716	Fixed compilation warning	2017-07-06 21:11:15 -07:00
Benoit Steiner	62b4634ebe	Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14) Applying Benoit's comment for Fixing ImageVolumePatch. * Applying Benoit's comment for Fixing ImageVolumePatch. Fixing conflict on cmake file. * Fixing deallocation of the memory in ImagePatch test for SYCL. * Fixing the automerge issue.	2017-07-06 05:08:13 +00:00
Benoit Steiner	53725c10b8	Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10) DataDependancy * Wrapping data type to the pointer class for sycl in non-terminal nodes; not having that breaks Tensorflow Conv2d code. * Applying Ronnan's Comments. * Applying benoit's comments	2017-06-28 17:55:23 +00:00
Benoit Steiner	b8e805497e	Merged in benoitsteiner/opencl (pull request PR-318) Improved support for OpenCL	2017-06-13 05:01:10 +00:00
Gael Guennebaud	8640093af1	fix compilation in C++98	2017-06-09 12:45:01 +02:00
Hugh Perkins	9341f258d4	Add labels to #ifdef, in TensorReductionCuda.h	2017-06-06 15:51:06 +01:00
Benoit Steiner	1e736b9ead	Merged in mehdi_goli/opencl/SYCLAlignAllocator (pull request PR-7) Fixing SYCL alignment issue required by TensorFlow.	2017-05-26 17:23:00 +00:00
Benoit Steiner	9dee55ec33	Merged eigen/eigen into default	2017-05-26 09:01:04 -07:00
Mehdi Goli	0370d3576e	Applying Ronnan's comments.	2017-05-26 16:01:48 +01:00
Mehdi Goli	e3f964ed55	Applying Benoit's comment; removing dead code.	2017-05-25 11:17:26 +01:00
a-doumoulakis	fb853a857a	Restore misplaced comment	2017-05-24 17:50:15 +01:00
a-doumoulakis	7a8ba565f8	Merge changes from upstream	2017-05-24 17:45:29 +01:00
Mmanu Chaturvedi	2971503fed	Specializing numeric_limits For AutoDiffScalar	2017-05-23 17:12:36 -04:00
Gael Guennebaud	26e8f9171e	Fix compilation of matrix log with Map as input	2017-06-07 10:51:23 +02:00
Benoit Steiner	615733381e	Merged in mehdi_goli/opencl/FixingCmakeDependency (pull request PR-2) Fixing Cmake Dependency for SYCL	2017-05-22 17:43:06 +00:00
Mehdi Goli	76c0fc1f95	Fixing SYCL alignment issue required by TensorFlow.	2017-05-22 16:49:32 +01:00
Mehdi Goli	2d17128d6f	Fixing supported device list.	2017-05-22 16:40:33 +01:00
Mehdi Goli	61d7f3664a	Fixing Cmake Dependency for SYCL	2017-05-22 14:58:28 +01:00
a-doumoulakis	052426b824	Add support for triSYCL. Eigen is now able to use triSYCL with the EIGEN_SYCL_TRISYCL and TRISYCL_INCLUDE_DIR options. Fix contraction kernel with correct nd_item dimension.	2017-05-05 19:26:27 +01:00
RJ Ryan	949a2da38c	Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer. Improves support for std::complex types when compiling for CUDA. Expands on `e2e9cdd169` and `2bda1b0d93` .	2017-04-14 13:23:35 -07:00
Benoit Steiner	0d08165a7f	Merged in benoitsteiner/opencl (pull request PR-309) OpenCL improvements	2017-04-05 14:28:08 +00:00
Benoit Steiner	068cc09708	Preserve file naming conventions	2017-04-04 10:09:10 -07:00
Benoit Steiner	c302ea7bc4	Deleted empty line of code	2017-04-04 10:05:16 -07:00
Benoit Steiner	a5a0c8fac1	Guard sycl specific code under a EIGEN_USE_SYCL ifdef	2017-04-04 10:03:21 -07:00
Benoit Steiner	a1304b95b7	Code cleanup	2017-04-04 10:00:46 -07:00

2718 Commits