eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-03-07 18:27:40 +08:00

Author	SHA1	Message	Date
Mehdi Goli	b512a9536f	Enabling per device specialisation of packetsize.	2018-08-01 13:39:13 +01:00
Benoit Steiner	edf46bd7a2	Merged in yuefengz/eigen (pull request PR-370) Use device's allocate function instead of internal::aligned_malloc.	2018-07-31 22:38:28 +00:00
Gael Guennebaud	678a0dcb12	Merged in ezhulenev/eigen/tiling_3 (pull request PR-438) Tiled tensor executor	2018-07-31 08:13:00 +00:00
Gael Guennebaud	679eece876	Speedup trivial tensor broadcasting on GPU by enforcing unaligned loads. See PR 437.	2018-07-31 10:10:14 +02:00
Gael Guennebaud	723856dec1	bug #1577 : fix msvc compilation of unit test, msvc defines ptrdiff_t as long long	2018-07-30 14:52:15 +02:00
Eugene Zhulenev	966c2a7bb6	Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible	2018-07-27 12:45:17 -07:00
Eugene Zhulenev	6913221c43	Add tiled evaluation support to TensorExecutor	2018-07-25 13:51:10 -07:00
Alexey Frunze	7b91c11207	bug #1578 : Improve prefetching in matrix multiplication on MIPS.	2018-07-24 18:36:44 -07:00
Patrik Huber	f5cace5e9f	Fix two small typos in the documentation	2018-07-26 19:55:19 +00:00
Gael Guennebaud	34539c4af4	Merged in rmlarsen/eigen1 (pull request PR-441) Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.	2018-07-30 11:26:24 +00:00
Mark D Ryan	bc615e4585	Re-enable FMA for fast sqrt functions	2018-07-30 13:21:00 +02:00
Mark D Ryan	96b030a8e4	Re-enable FMA for fast sqrt functions This commit re-enables the use of FMA for the FAST sqrt functions. Doing so improves the performance of both algorithms. The float32 version is now 88% the speed of the original function, while the double version is 90%.	2018-07-30 10:19:51 +01:00
Rasmus Munk Larsen	e478532625	Reduce the number of template specializations of classes related to tensor contraction to reduce binary size.	2018-07-27 12:36:34 -07:00
Rasmus Munk Larsen	2ebcb911b2	Add pcast packet op for NEON.	2018-07-26 14:28:48 -07:00
Christoph Hertzberg	397b0547e1	Disable static assertions only when necessary and disable double-promotion warnings in that case as well	2018-07-26 00:01:24 +02:00
Christoph Hertzberg	5e79402b4a	fix warnings for doc-eigen-prerequisites	2018-07-24 21:59:15 +02:00
Christoph Hertzberg	5f79b7f9a9	Removed several shadowing types and use global Index typedef everywhere	2018-07-25 21:47:45 +02:00
Christoph Hertzberg	44ee201337	Rename variable which shadows class name	2018-07-25 20:26:15 +02:00
Gustavo Lima Chaves	705f66a9ca	Account for missing change on commit "Remove SimpleThreadPool and..." "... always use {NonBlocking}ThreadPool". It seems the non-blocking implementation was made the default/only one, but a reference to the old name was left unmodified. Fix that.	2018-07-23 16:29:09 -07:00
Christoph Hertzberg	fd4fe7cbc5	Fixed issue which made documentation not getting built anymore	2018-07-24 22:56:15 +02:00
Christoph Hertzberg	636126ef40	Allow to filter out build-error messages	2018-07-24 20:12:49 +02:00
Eugene Zhulenev	d55efa6f0f	TensorBlockIO	2018-07-23 15:50:55 -07:00
Eugene Zhulenev	34a75c3c5c	Initial support of TensorBlock	2018-07-20 17:37:20 -07:00
Gael Guennebaud	2c2de9da7d	Merged in glchaves/eigen (pull request PR-433) Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block	2018-07-23 19:38:55 +00:00
Gael Guennebaud	4ca3e48f42	fix typo	2018-07-23 16:51:57 +02:00
Gael Guennebaud	c747cde69a	Add lastN shortcuts to seq/seqN.	2018-07-23 16:20:25 +02:00
Gustavo Lima Chaves	02eaaacbc5	Move cxx11_tensor_uint128 test under an EIGEN_TEST_CXX11 guarded block Builds configured without the -DEIGEN_TEST_CXX11=ON flag would fail right away without this, as this test seems to rely on those language features. The skip under compilation with MSVC was kept.	2018-07-20 16:08:40 -07:00
Eugene Zhulenev	2bf864f1eb	Disable type traits for libstdc++ <= 4.9.3	2018-07-20 10:11:44 -07:00
Gael Guennebaud	de70671937	Oops, EIGEN_COMP_MSVC is not available before including Eigen.	2018-07-20 17:51:17 +02:00
Gael Guennebaud	56a750b6cc	Disable optimization for sparse_product unit test with MSVC 2013, otherwise it takes several hours to build.	2018-07-20 08:36:38 -07:00
Eugene Zhulenev	c58b874727	PR430: Convert count to the reducer type in MeanReducer Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails. cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' return pdiv(vaccum, pset1<Packet>(packetCount_)); Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade. static_cast<T>(packetCount_) - breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising.	2018-07-19 17:37:03 -07:00
Gael Guennebaud	2424e3b7ac	Pass by const ref.	2018-07-19 18:48:19 +02:00
Gael Guennebaud	509a5fa77f	Fix IsRelocatable without C++11	2018-07-19 18:47:38 +02:00
Gael Guennebaud	2ca2592009	Fix determination of EIGEN_HAS_TYPE_TRAITS	2018-07-19 18:47:18 +02:00
Gael Guennebaud	5e5987996f	Fix stupid error in Quaternion move ctor	2018-07-19 18:33:53 +02:00
David Hyde	d908afe35f	bug #1558 : fix a corner case in MINRES when both v_new and w_new vanish.	2018-07-08 22:06:38 -07:00
Eugene Zhulenev	6e654f3379	Reduce number of allocations in TensorContractionThreadPool.	2018-07-16 14:26:39 -07:00
Gael Guennebaud	7ccb623746	bug #1569 : fix Tensor<half>::mean() on AVX with respective unit test.	2018-07-19 13:15:40 +02:00
Alexey Frunze	1f523e7304	Add MIPS changes missing from previous merge.	2018-07-18 12:27:50 -07:00
Eugene Zhulenev	e3c2d61739	Assert that no output kernel is defined for GPU contraction	2018-07-18 14:34:22 -07:00
Eugene Zhulenev	086ded5c85	Disable type traits for GCC < 5.1.0	2018-07-18 16:32:55 -07:00
Eugene Zhulenev	79d4129cce	Specify default output kernel for TensorContractionOp	2018-07-18 14:21:01 -07:00
Gael Guennebaud	6e5a3b898f	Add regression for bugs #1573 and #1575	2018-07-18 23:34:34 +02:00
Gael Guennebaud	863580fe88	bug #1432 : fix conservativeResize for non-relocatable scalar types. For those we need to by-pass realloc routines and fall-back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable	2018-07-18 23:33:07 +02:00
Gael Guennebaud	053ed97c72	Generalize ScalarWithExceptions to a full non-copyable and throwing scalar type to be used in other unit tests.	2018-07-18 23:27:37 +02:00
Gael Guennebaud	a503fc8725	bug #1575 : fix regression introduced in bug #1573 patch. Move ctor/assignment should not be defaulted.	2018-07-18 23:26:13 +02:00
Gael Guennebaud	308725c3c9	More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA	2018-07-18 13:51:36 +02:00
Mark D Ryan	e79c5149bf	Fix AVX512 implementations of psqrt This commit fixes the AVX512 implementations of psqrt in the same way that `3ed67cb0bb` fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in `3ed67cb0bb` shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.	2018-06-25 05:05:02 -07:00
Yuefeng Zhou	1eff6cf8a7	Use device's allocate function instead of internal::aligned_malloc. This would make it easier to track memory usage in device instances.	2018-02-20 16:50:05 -08:00
Gael Guennebaud	adb134d47e	Fix implicit conversion from 0.0 to scalar	2018-02-16 22:26:01 +04:00


9784 Commits