eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	06eb24cf4d	Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.	2018-07-13 16:04:27 +02:00
Gael Guennebaud	5fd03ddbfb	Make EIGEN_TEST_CUDA_CLANG more friendly with OSX	2018-07-13 16:03:14 +02:00
Gael Guennebaud	86d9c0255c	Forward declaring std::array does not work with all std libs, so let's just include <array>	2018-07-13 13:06:44 +02:00
David Hyde	d908afe35f	bug #1558 : fix a corner case in MINRES when both v_new and w_new vanish.	2018-07-08 22:06:38 -07:00
Eugene Zhulenev	6e654f3379	Reduce number of allocations in TensorContractionThreadPool.	2018-07-16 14:26:39 -07:00
Gael Guennebaud	7ccb623746	bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.	2018-07-19 13:15:40 +02:00
Alexey Frunze	1f523e7304	Add MIPS changes missing from previous merge.	2018-07-18 12:27:50 -07:00
Eugene Zhulenev	e3c2d61739	Assert that no output kernel is defined for GPU contraction	2018-07-18 14:34:22 -07:00
Eugene Zhulenev	086ded5c85	Disable type traits for GCC < 5.1.0	2018-07-18 16:32:55 -07:00
Eugene Zhulenev	79d4129cce	Specify default output kernel for TensorContractionOp	2018-07-18 14:21:01 -07:00
Gael Guennebaud	6e5a3b898f	Add regression for bugs #1573 and #1575	2018-07-18 23:34:34 +02:00
Gael Guennebaud	863580fe88	bug #1432 : fix conservativeResize for non-relocatable scalar types. For those we need to bypass realloc routines and fall back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable	2018-07-18 23:33:07 +02:00
Gael Guennebaud	053ed97c72	Generalize ScalarWithExceptions to a full non-copyable and throwing scalar type to be used in other unit tests.	2018-07-18 23:27:37 +02:00
Gael Guennebaud	a503fc8725	bug #1575 : fix regression introduced in bug #1573 patch. Move ctor/assignment should not be defaulted.	2018-07-18 23:26:13 +02:00
Gael Guennebaud	308725c3c9	More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA	2018-07-18 13:51:36 +02:00
Alexey Frunze	3875fb05aa	Add support for MIPS SIMD (MSA)	2018-07-06 16:04:30 -07:00
Gael Guennebaud	44ea5f7623	Add unit test for -Tensor<complex> on GPU	2018-07-12 17:19:38 +02:00
Gael Guennebaud	12e1ebb68b	Remove local Index typedef from unit-tests	2018-07-12 17:16:40 +02:00
Gael Guennebaud	63185be8b2	Disable eigenvalues test for clang-cuda	2018-07-12 17:03:14 +02:00
Gael Guennebaud	bec013b2c9	fix unused warning	2018-07-12 17:02:18 +02:00
Gael Guennebaud	5c73c9223a	Fix shadowing typedefs	2018-07-12 17:01:07 +02:00
Gael Guennebaud	98728312c8	Fix compilation regarding std::array	2018-07-12 17:00:37 +02:00
Gael Guennebaud	eb3d8f68bb	fix unused warning	2018-07-12 16:59:47 +02:00
Gael Guennebaud	006e18e52b	Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h	2018-07-12 16:57:41 +02:00
Thales Sabino	9a6a43319f	Fix cxx11_tensor_fft not building on Windows. The type used in Eigen::DSizes needs to be at least 8 bytes long. Internally Tensor tries to convert this to an __int64 on Windows and this fails to build. On Linux, long and long long are both 8 byte integer types. Changing from "long long" to "std::int64_t".	2018-07-12 11:20:59 +01:00
Gael Guennebaud	b347eb0b1c	Fix doc	2018-07-12 11:56:18 +02:00
Mark D Ryan	e79c5149bf	Fix AVX512 implementations of psqrt. This commit fixes the AVX512 implementations of psqrt in the same way that `3ed67cb0bb` fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in `3ed67cb0bb` shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.	2018-06-25 05:05:02 -07:00
Yuefeng Zhou	1eff6cf8a7	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	2018-02-20 16:50:05 -08:00
Gael Guennebaud	adb134d47e	Fix implicit conversion from 0.0 to scalar	2018-02-16 22:26:01 +04:00
Gael Guennebaud	937ad18221	add unit test for SimplicialCholesky and Boost multiprec.	2018-02-16 22:25:11 +04:00
Julian Kent	6d451cf2b6	Add missing consts for rows and cols functions in SparseLU	2018-02-10 13:44:05 +01:00
Daniele E. Domenichelli	a12b8a8c75	FindEigen3: Set Eigen3_FOUND variable	2018-07-11 16:31:50 +02:00
Gael Guennebaud	8bdb214fd0	remove double ;;	2018-07-12 11:17:53 +02:00
Gael Guennebaud	a9060378d3	bug #1570 : fix warning	2018-07-12 11:07:09 +02:00
Gael Guennebaud	6cd6551b26	Add deprecated header files for TensorFlow	2018-07-12 10:50:53 +02:00
Gael Guennebaud	da0c604078	Merged in deven-amd/eigen (pull request PR-402) Adding support for using Eigen in HIP kernels.	2018-07-12 08:07:16 +00:00
Gael Guennebaud	a4ea611ca7	Remove useless specialization thanks to is_convertible being more robust.	2018-07-12 09:59:44 +02:00
Gael Guennebaud	8a40dda5a6	Add some basic unit-tests	2018-07-12 09:59:00 +02:00
Gael Guennebaud	8ef267ccbd	spellcheck	2018-07-12 09:58:29 +02:00
Gael Guennebaud	21cf4a1a8b	Make is_convertible more robust and conformant to std::is_convertible	2018-07-12 09:57:19 +02:00
Gael Guennebaud	8a5955a052	Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.	2018-07-11 17:16:50 +02:00
Gael Guennebaud	d193cc87f4	Fix regression in `9357838f94`	2018-07-11 17:09:23 +02:00
Gael Guennebaud	fb33687736	Fix double ;;	2018-07-11 17:08:30 +02:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback. The major changes are: 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h; 2. Moving CUDA/MathFunctions.h to GPU/MathFunctions.h; 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h (the above three changes effectively enable the Eigen "Packet" layer for the HIP platform); 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic"); 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places. The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	1fe0b74904	deleting hip specific files that are no longer required	2018-07-11 09:28:44 -04:00
Deven Desai	dec47a6493	renaming CUDA* to GPU* for some header files	2018-07-11 09:26:54 -04:00
Deven Desai	471cfe5ff7	renaming CUDA* to GPU* for some header files	2018-07-11 09:22:04 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00
Gael Guennebaud	f00d08cc0a	Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix.	2018-07-11 14:01:47 +02:00
Gael Guennebaud	1625476091	Add internal::is_identity compile-time helper	2018-07-11 14:00:24 +02:00
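
The fallback described in commit 863580fe88 (bug #1432) can be illustrated with a minimal sketch. This is not Eigen's implementation: `resize_buffer`, `resize_impl` and `destroy_buffer` are hypothetical names introduced here, and the sketch assumes that `std::is_trivially_copyable` alone is used as the conservative test for whether a type may be moved with `realloc`. Types that fail the test take the allocate-as-new, copy, delete path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>
#include <type_traits>

// Hypothetical helpers for illustration only; none of these names exist in Eigen.

// Trivially copyable T: the bytes may be relocated, so realloc is safe.
// Elements past the old size are left uninitialized.
template <typename T>
T* resize_impl(T* old_buf, std::size_t /*old_n*/, std::size_t new_n, std::true_type) {
  void* p = std::realloc(old_buf, new_n * sizeof(T));
  if (!p) throw std::bad_alloc();
  return static_cast<T*>(p);
}

// Anything else: allocate as new, copy, delete. Requires T to be
// copy-constructible and default-constructible. Not exception-safe if a
// constructor throws; a production version would need cleanup on failure.
template <typename T>
T* resize_impl(T* old_buf, std::size_t old_n, std::size_t new_n, std::false_type) {
  T* new_buf = static_cast<T*>(std::malloc(new_n * sizeof(T)));
  if (!new_buf) throw std::bad_alloc();
  const std::size_t keep = old_n < new_n ? old_n : new_n;
  for (std::size_t i = 0; i < keep; ++i)
    ::new (static_cast<void*>(new_buf + i)) T(old_buf[i]);   // copy surviving elements
  for (std::size_t i = keep; i < new_n; ++i)
    ::new (static_cast<void*>(new_buf + i)) T();             // default-construct the tail
  for (std::size_t i = 0; i < old_n; ++i) old_buf[i].~T();   // destroy the originals
  std::free(old_buf);
  return new_buf;
}

// Resize a buffer holding old_n constructed elements to new_n elements.
// Passing old_buf == nullptr with old_n == 0 performs a fresh allocation.
template <typename T>
T* resize_buffer(T* old_buf, std::size_t old_n, std::size_t new_n) {
  assert(new_n > 0 && "this sketch does not handle shrinking to zero");
  return resize_impl(old_buf, old_n, new_n,
                     typename std::is_trivially_copyable<T>::type());
}

// Destroy n elements and release the storage obtained from resize_buffer.
template <typename T>
void destroy_buffer(T* buf, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) buf[i].~T();
  std::free(buf);
}

static_assert(std::is_trivially_copyable<int>::value, "int takes the realloc path");
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string takes the allocate-copy-delete path");
```

The test is deliberately conservative, as the commit message notes: many types that are not trivially copyable could in practice be relocated byte-wise, but there is no standard trait that identifies them.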

9836 Commits