eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Paul Tucker	4e9848fa86	Actually add optional Allocator* arg to ThreadPoolDevice().	2018-07-16 17:53:36 -07:00
Paul Tucker	b3e7c9132d	Add optional Allocator argument to ThreadPoolDevice constructor. When supplied, this allocator will be used in place of internal::aligned_malloc. This permits e.g. use of a NUMA-node specific allocator where the thread-pool is also restricted to a single NUMA-node.	2018-07-16 17:26:05 -07:00
Gael Guennebaud	40797dbea3	bug #1572 : use c++11 atomic instead of volatile if c++11 is available, and disable multi-threaded GEMM on non-x86 without c++11.	2018-07-17 00:11:20 +02:00
Gael Guennebaud	add5757488	Simplify handling of non-split tests and include split_test_helper.h instead of re-generating it. This also allows us to modify it without breaking existing build folders.	2018-07-16 18:55:40 +02:00
Gael Guennebaud	901c7d31f0	Fix usage of EIGEN_SPLIT_LARGE_TESTS=ON: some unit tests, such as indexed_view have to be split unconditionally.	2018-07-16 18:35:05 +02:00
Gael Guennebaud	f2b52f9946	Add the cmake option "EIGEN_DASHBOARD_BUILD_TARGET" to control the build target in dashboard mode (e.g., ctest -D Experimental)	2018-07-16 17:59:30 +02:00
Gael Guennebaud	23d82c1ac5	Merged in rmlarsen/eigen2 (pull request PR-422) Optimize the case where broadcasting is a no-op.	2018-07-14 11:42:58 +00:00
Gael Guennebaud	a87cff20df	Fix GeneralizedEigenSolver when requesting eigenvalues only.	2018-07-14 09:38:49 +02:00
Rasmus Munk Larsen	3a9cf4e290	Get rid of alias for m_broadcast.	2018-07-13 16:24:48 -07:00
Rasmus Munk Larsen	4222550e17	Optimize the case where broadcasting is a no-op.	2018-07-13 16:12:38 -07:00
Rasmus Munk Larsen	4a3952fd55	Relax the condition to not only work on Android.	2018-07-13 11:24:07 -07:00
Rasmus Munk Larsen	02a9443db9	Clang produces incorrect Thumb2 assembler when using alloca. Don't define EIGEN_ALLOCA when generating Thumb with clang.	2018-07-13 11:03:04 -07:00
Gael Guennebaud	20991c3203	bug #1571 : fix is_convertible<from,to> with "from" a reference.	2018-07-13 17:47:28 +02:00
Gael Guennebaud	1920129d71	Remove clang warning	2018-07-13 16:05:35 +02:00
Gael Guennebaud	195c9c054b	Print more debug info in gpu_basic	2018-07-13 16:05:07 +02:00
Gael Guennebaud	06eb24cf4d	Introduce gpu_assert for assertion in device-code, and disable them with clang-cuda.	2018-07-13 16:04:27 +02:00
Gael Guennebaud	5fd03ddbfb	Make EIGEN_TEST_CUDA_CLANG more friendly with OSX	2018-07-13 16:03:14 +02:00
Gael Guennebaud	86d9c0255c	Forward declaring std::array does not work with all std libs, so let's just include <array>	2018-07-13 13:06:44 +02:00
David Hyde	d908afe35f	bug #1558 : fix a corner case in MINRES when both v_new and w_new vanish.	2018-07-08 22:06:38 -07:00
Eugene Zhulenev	6e654f3379	Reduce number of allocations in TensorContractionThreadPool.	2018-07-16 14:26:39 -07:00
Gael Guennebaud	7ccb623746	bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.	2018-07-19 13:15:40 +02:00
Alexey Frunze	1f523e7304	Add MIPS changes missing from previous merge.	2018-07-18 12:27:50 -07:00
Eugene Zhulenev	e3c2d61739	Assert that no output kernel is defined for GPU contraction	2018-07-18 14:34:22 -07:00
Eugene Zhulenev	086ded5c85	Disable type traits for GCC < 5.1.0	2018-07-18 16:32:55 -07:00
Eugene Zhulenev	79d4129cce	Specify default output kernel for TensorContractionOp	2018-07-18 14:21:01 -07:00
Gael Guennebaud	6e5a3b898f	Add regression for bugs #1573 and #1575	2018-07-18 23:34:34 +02:00
Gael Guennebaud	863580fe88	bug #1432 : fix conservativeResize for non-relocatable scalar types. For those we need to bypass realloc routines and fall back to allocate-new, copy, delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable.	2018-07-18 23:33:07 +02:00
Gael Guennebaud	053ed97c72	Generalize ScalarWithExceptions to a full non-copyable and throwing scalar type to be used in other unit tests.	2018-07-18 23:27:37 +02:00
Gael Guennebaud	a503fc8725	bug #1575 : fix regression introduced in bug #1573 patch. Move ctor/assignment should not be defaulted.	2018-07-18 23:26:13 +02:00
Gael Guennebaud	308725c3c9	More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA	2018-07-18 13:51:36 +02:00
Alexey Frunze	3875fb05aa	Add support for MIPS SIMD (MSA)	2018-07-06 16:04:30 -07:00
Gael Guennebaud	44ea5f7623	Add unit test for -Tensor<complex> on GPU	2018-07-12 17:19:38 +02:00
Gael Guennebaud	12e1ebb68b	Remove local Index typedef from unit-tests	2018-07-12 17:16:40 +02:00
Gael Guennebaud	63185be8b2	Disable eigenvalues test for clang-cuda	2018-07-12 17:03:14 +02:00
Gael Guennebaud	bec013b2c9	fix unused warning	2018-07-12 17:02:18 +02:00
Gael Guennebaud	5c73c9223a	Fix shadowing typedefs	2018-07-12 17:01:07 +02:00
Gael Guennebaud	98728312c8	Fix compilation regarding std::array	2018-07-12 17:00:37 +02:00
Gael Guennebaud	eb3d8f68bb	fix unused warning	2018-07-12 16:59:47 +02:00
Gael Guennebaud	006e18e52b	Clean up the mess in Eigen/Core by moving CUDA/HIP stuff to more appropriate places (Macros.h); the alignment/vectorization logic is now in util/ConfigureVectorization.h	2018-07-12 16:57:41 +02:00
Thales Sabino	9a6a43319f	Fix cxx11_tensor_fft not building on Windows. The type used in Eigen::DSizes needs to be at least 8 bytes long. Internally Tensor tries to convert this to an __int64 on Windows and this fails to build. On Linux, long and long long are both 8 byte integer types. * * * Changing from "long long" to "std::int64_t".	2018-07-12 11:20:59 +01:00
Gael Guennebaud	b347eb0b1c	Fix doc	2018-07-12 11:56:18 +02:00
Mark D Ryan	e79c5149bf	Fix AVX512 implementations of psqrt. This commit fixes the AVX512 implementations of psqrt in the same way that `3ed67cb0bb` fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issue requires adding some additional instructions that slow down the algorithms. A similar test to the one used in `3ed67cb0bb` shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.	2018-06-25 05:05:02 -07:00
Yuefeng Zhou	1eff6cf8a7	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	2018-02-20 16:50:05 -08:00
Gael Guennebaud	adb134d47e	Fix implicit conversion from 0.0 to scalar	2018-02-16 22:26:01 +04:00
Gael Guennebaud	937ad18221	add unit test for SimplicialCholesky and Boost multiprec.	2018-02-16 22:25:11 +04:00
Julian Kent	6d451cf2b6	Add missing consts for rows and cols functions in SparseLU	2018-02-10 13:44:05 +01:00
Daniele E. Domenichelli	a12b8a8c75	FindEigen3: Set Eigen3_FOUND variable	2018-07-11 16:31:50 +02:00
Gael Guennebaud	8bdb214fd0	remove double ;;	2018-07-12 11:17:53 +02:00
Gael Guennebaud	a9060378d3	bug #1570 : fix warning	2018-07-12 11:07:09 +02:00
Gael Guennebaud	6cd6551b26	Add deprecated header files for TensorFlow	2018-07-12 10:50:53 +02:00

10080 Commits