eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-06 14:14:46 +08:00

Author	SHA1	Message	Date
Benoit Steiner	e23c8c294e	Use actual types instead of the auto keyword to make the code more portable	2018-08-16 10:41:01 -07:00
Mehdi Goli	80f1a76dec	removing the noises.	2018-08-16 13:33:24 +01:00
Mehdi Goli	d0b01ebbf6	Reverting the unintended delete from the code.	2018-08-16 13:21:36 +01:00
Mehdi Goli	161dcbae9b	Using PointerType struct and specializing it per device for TensorCustomOp.h	2018-08-16 00:07:02 +01:00
Sameer Agarwal	f197c3f55b	Removed an unused variable (PacketSize) from TensorExecutor	2018-08-15 11:24:57 -07:00
Benoit Steiner	4181556907	Fixed the tensor contraction code.	2018-08-15 09:34:47 -07:00
Benoit Steiner	fbb834144d	Fixed more compilation errors	2018-08-15 08:52:58 -07:00
Benoit Steiner	ab3f481141	Cleaned up the code and make it compile with more compilers	2018-08-14 14:05:46 -07:00
Rasmus Munk Larsen	fa0bcbf230	merge	2018-08-14 12:18:31 -07:00
Rasmus Munk Larsen	15d4f515e2	Use plain_assert in destructors to avoid throwing in CXX11 tests where main.h overwrites eigen_assert with a throwing version.	2018-08-14 12:17:46 -07:00
Rasmus Munk Larsen	2a98bd9c8e	Merged eigen/eigen into default	2018-08-14 12:02:09 -07:00
Benoit Steiner	59bba77ead	Fixed compilation errors with gcc 4.7 and 4.8	2018-08-14 10:54:48 -07:00
Mehdi Goli	8ba799805b	Merge with upstream	2018-08-14 09:43:45 +01:00
Rasmus Munk Larsen	6d6e7b7027	merge	2018-08-13 15:34:50 -07:00
Rasmus Munk Larsen	9bb75d8d31	Add Barrier.h.	2018-08-13 15:34:03 -07:00
Rasmus Munk Larsen	2e1adc0324	Merged eigen/eigen into default	2018-08-13 15:32:00 -07:00
Rasmus Munk Larsen	8278ae6313	Add support for thread local support on platforms that do not support it through emulation using a hash map.	2018-08-13 15:31:23 -07:00
Benoit Steiner	501be70b27	Code cleanup	2018-08-13 15:16:40 -07:00
Gael Guennebaud	3ec60215df	Merged in rmlarsen/eigen2 (pull request PR-466) Move sigmoid functor to core and rename it to 'logistic'.	2018-08-13 21:28:20 +00:00
Rasmus Munk Larsen	0f1b2e08a5	Call logistic functor from Tensor::sigmoid.	2018-08-13 11:52:58 -07:00
Benoit Steiner	26239ee580	Use NULL instead of nullptr to avoid adding a cxx11 requirement.	2018-08-13 11:05:51 -07:00
Benoit Steiner	3810ec228f	Don't use the auto keyword since it's not always supported properly.	2018-08-13 10:46:09 -07:00
Benoit Steiner	e6d5be811d	Fixed syntax of nested templates chevrons to make it compatible with c++97 mode.	2018-08-13 10:29:21 -07:00
Mehdi Goli	1aa86aad14	Merge with upstream.	2018-08-13 15:40:31 +01:00
Eugene Zhulenev	35d90e8960	Fix BlockAccess enum in CwiseUnaryOp evaluator	2018-08-10 17:37:58 -07:00
Eugene Zhulenev	855b68896b	Merge with eigen/default	2018-08-10 17:18:42 -07:00
Eugene Zhulenev	f2209d06e4	Add block evaluation to CwiseUnaryOp and add PreferBlockAccess enum to all evaluators	2018-08-10 16:53:36 -07:00
Benoit Steiner	c8ea398675	Avoided language features that are only available in cxx11 mode.	2018-08-10 13:02:41 -07:00
Benoit Steiner	4be4286224	Made the code compile with gcc 5.4.	2018-08-10 11:32:58 -07:00
Eugene Zhulenev	cfaedb38cd	Fix bug in a test + compilation errors	2018-08-09 09:44:07 -07:00
Mehdi Goli	ea8fa5e86f	Merge with upstream	2018-08-09 14:07:56 +01:00
Mehdi Goli	8c083bfd0e	Properly fixing the PointerType for TensorCustomOp.h, as the output type here should be based on CoeffReturnType, not the Scalar type. Therefore, similar to the reduction and evalTo functions, it should have its own MakePointer class. In this case, for other devices the type defaults to CoeffReturnType and no change is required in users' code. However, in SYCL, on the device, we can reconstruct the device type.	2018-08-09 13:57:43 +01:00
Eugene Zhulenev	1c8b9e10a7	Merged with upstream eigen	2018-08-08 16:57:58 -07:00
Benoit Steiner	131ed1191f	Merged in codeplaysoftware/eigen-upstream-pure/Fixing_compiler_warning (pull request PR-462) Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.	2018-08-08 18:14:15 +00:00
Mehdi Goli	532a0be05c	Fixing compiler warning in TensorBlock.h as it was creating a lot of noise at compilation.	2018-08-08 12:12:26 +01:00
Mehdi Goli	3055e3a7c2	Creating a pointer type in TensorCustomOp.h	2018-08-08 11:19:02 +01:00
Rasmus Munk Larsen	693fb1d41e	Fix init order.	2018-08-07 17:18:51 -07:00
Benoit Steiner	10d286f55b	Silenced a couple of compilation warnings.	2018-08-06 16:00:29 -07:00
Benoit Steiner	d011d05fd6	Fixed compilation errors.	2018-08-06 13:40:51 -07:00
Rasmus Munk Larsen	36e7e7dd8f	Forward declare NoOpOutputKernel as struct rather than class to be consistent with implementation.	2018-08-06 13:16:32 -07:00
Rasmus Munk Larsen	fa68342ef8	Move sigmoid functor to core.	2018-08-03 17:31:23 -07:00
Christoph Hertzberg	023ed6b9a8	Product of empty array must be 1 and not 0.	2018-08-30 17:14:52 +02:00
Christoph Hertzberg	c2f4e8c08e	Fix integer conversion warning	2018-08-30 17:12:53 +02:00
Deven Desai	946c3e2544	adding EIGEN_DEVICE_FUNC attribute to fix some GPU unit tests that are broken in HIP mode	2018-08-27 23:04:08 +00:00
Christoph Hertzberg	73ca600bca	Fix numerous shadow-warnings for GCC<=4.8	2018-08-28 18:32:39 +02:00
Eugene Zhulenev	1b0373ae10	Replace all using declarations with typedefs in Tensor ops	2018-08-01 15:55:46 -07:00
Rasmus Munk Larsen	bcb29f890c	Fix initialization order.	2018-08-03 10:18:53 -07:00
Mehdi Goli	3074b1ff9e	Fixing the compilation error.	2018-08-03 17:13:44 +01:00
Mehdi Goli	01358300d5	Creating separate SYCL required PR for uncontroversial files.	2018-08-03 16:59:15 +01:00
Eugene Zhulenev	64abdf1d7e	Fix typo + get rid of redundant member variables for block sizes	2018-08-01 12:35:19 -07:00
Benoit Steiner	93b9e36e10	Merged in paultucker/eigen (pull request PR-431) Optional ThreadPoolDevice allocator Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-08-01 19:14:34 +00:00
Eugene Zhulenev	385b3ff12f	Merged latest changes from upstream/eigen	2018-08-01 11:59:04 -07:00
Benoit Steiner	17221115c9	Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447) Adding variadic version of assert which can take a parameter pack as its input.	2018-08-01 16:41:54 +00:00
Benoit Steiner	0360c36170	Merged in codeplaysoftware/eigen-upstream-pure/separating_internal_memory_allocation (pull request PR-446) Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	2018-08-01 16:13:15 +00:00
Mehdi Goli	c6a5c70712	Correcting the position of allocate_temp/deallocate_temp in TensorDeviceGpu.h	2018-08-01 16:56:26 +01:00
Benoit Steiner	45f75f1ace	Merged in codeplaysoftware/eigen-upstream-pure/using_PacketType_class (pull request PR-449) Enabling per device specialisation of packetSize.	2018-08-01 15:43:03 +00:00
Mehdi Goli	af96018b49	Using the suggested modification.	2018-08-01 16:04:44 +01:00
Mehdi Goli	b512a9536f	Enabling per device specialisation of packetsize.	2018-08-01 13:39:13 +01:00
Mehdi Goli	3a197a60e6	variadic version of assert which can take a parameter pack as its input.	2018-08-01 12:19:14 +01:00
Mehdi Goli	d7a8414848	Distinguishing between internal memory allocation/deallocation from explicit user memory allocation/deallocation.	2018-08-01 11:56:30 +01:00
Mehdi Goli	9e219bb3d3	Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO.	2018-08-01 10:47:49 +01:00
Eugene Zhulenev	83c0a16baf	Add block evaluation support to TensorOps	2018-07-31 15:56:31 -07:00
Benoit Steiner	edf46bd7a2	Merged in yuefengz/eigen (pull request PR-370) Use device's allocate function instead of internal::aligned_malloc.	2018-07-31 22:38:28 +00:00
Paul Tucker	385f7b8d0c	Change getAllocator() to allocator() in ThreadPoolDevice.	2018-07-31 13:52:18 -07:00
Gael Guennebaud	678a0dcb12	Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) Tiled tensor executor	2018-07-31 08:13:00 +00:00
Gael Guennebaud	679eece876	Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.	2018-07-31 10:10:14 +02:00
Eugene Zhulenev	966c2a7bb6	Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible	2018-07-27 12:45:17 -07:00
Eugene Zhulenev	6913221c43	Add tiled evaluation support to TensorExecutor	2018-07-25 13:51:10 -07:00
Rasmus Munk Larsen	e478532625	Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.	2018-07-27 12:36:34 -07:00
Eugene Zhulenev	d55efa6f0f	TensorBlockIO	2018-07-23 15:50:55 -07:00
Eugene Zhulenev	34a75c3c5c	Initial support of TensorBlock	2018-07-20 17:37:20 -07:00
Paul Tucker	d4afccde5a	Add test coverage for ThreadPoolDevice optional allocator.	2018-07-19 17:43:44 -07:00
Eugene Zhulenev	c58b874727	PR430: Convert count to the reducer type in MeanReducer Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails. cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade. static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.	2018-07-19 17:37:03 -07:00
Paul Tucker	4e9848fa86	Actually add optional Allocator* arg to ThreadPoolDevice().	2018-07-16 17:53:36 -07:00
Paul Tucker	b3e7c9132d	Add optional Allocator argument to ThreadPoolDevice constructor. When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted a single NUMA-node.	2018-07-16 17:26:05 -07:00
Rasmus Munk Larsen	3a9cf4e290	Get rid of alias for m_broadcast.	2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen	4222550e17	Optimize the case where broadcasting is a no-op.	2018-07-13 16:12:38 -07:00
Gael Guennebaud	06eb24cf4d	Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.	2018-07-13 16:04:27 +02:00
Eugene Zhulenev	6e654f3379	Reduce number of allocations in TensorContractionThreadPool.	2018-07-16 14:26:39 -07:00
Gael Guennebaud	7ccb623746	bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.	2018-07-19 13:15:40 +02:00
Eugene Zhulenev	e3c2d61739	Assert that no output kernel is defined for GPU contraction	2018-07-18 14:34:22 -07:00
Eugene Zhulenev	79d4129cce	Specify default output kernel for TensorContractionOp	2018-07-18 14:21:01 -07:00
Yuefeng Zhou	1eff6cf8a7	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	2018-02-20 16:50:05 -08:00
Gael Guennebaud	6cd6551b26	Add deprecated header files for TensorFlow	2018-07-12 10:50:53 +02:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00
Gael Guennebaud	6190aa5632	bug #1567 : add optimized path for tensor broadcasting and 'Channel First' shape	2018-07-09 11:23:16 +02:00
Deven Desai	1bb6fa99a3	merging the CUDA and HIP implementation for the Tensor directory and the unit tests	2018-06-20 16:44:58 -04:00
Deven Desai	cfdabbcc8f	removing the *Hip files from the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories	2018-06-20 12:57:02 -04:00
Deven Desai	7e41c8f1a9	renaming Cuda files to Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories	2018-06-20 12:52:30 -04:00
Deven Desai	b6cc0961b1	updates based on PR feedback There are two major changes (and a few minor ones which are not listed here...see PR discussion for details) 1. Eigen::half implementations for HIP and CUDA have been merged. This means that - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h` - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h` - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h` After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate. - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)` - `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)` - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`	2018-06-14 10:21:54 -04:00
Deven Desai	d1d22ef0f4	syncing this fork with upstream	2018-06-13 12:09:52 -04:00
Rasmus Munk Larsen	5418154a45	Fix oversharding bug in parallelFor.	2018-06-20 17:51:48 -07:00
Michael Figurnov	30fa3d0454	Merge from eigen/eigen	2018-06-07 17:57:56 +01:00
Gael Guennebaud	b3fd93207b	Fix typos found using codespell	2018-06-07 14:43:02 +02:00
Michael Figurnov	4bd158fa37	Derivative of the incomplete Gamma function and the sample of a Gamma random variable. In addition to igamma(a, x), this code implements: * igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter * gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.	2018-06-06 18:49:26 +01:00
Deven Desai	8fbd47052b	Adding support for using Eigen in HIP kernels. This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs. Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor) Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.	2018-06-06 10:12:58 -04:00
Benoit Steiner	e206f8d4a4	Merged in mfigurnov/eigen (pull request PR-400) Exponentially scaled modified Bessel functions of order zero and one. Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-06-05 17:05:21 +00:00
Penporn Koanantakool	e2ed0cf8ab	Add a ThreadPoolInterface* getter for ThreadPoolDevice.	2018-06-02 12:07:49 -07:00
Michael Figurnov	f216854453	Exponentially scaled modified Bessel functions of order zero and one. The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x) The code is ported from Cephes and tested against SciPy.	2018-05-31 15:34:53 +01:00


1134 Commits