* Support compiling without IO streams
Add the preprocessor definition EIGEN_NO_IO, which, when defined,
disables all use of the IO streams part of the standard library.
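As a sketch, opting out only requires defining the macro before including any Eigen header; the commented line shows what stops compiling:

    // Define EIGEN_NO_IO before any Eigen header to drop the IO-streams
    // dependency; printing matrices via operator<< becomes unavailable.
    #define EIGEN_NO_IO
    #include <Eigen/Core>

    int main() {
      Eigen::Matrix3f m = Eigen::Matrix3f::Identity();
      // std::cout << m;  // would not compile: stream output is disabled
      return m(0, 0) == 1.0f ? 0 : 1;
    }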
INFO: From Compiling tensorflow/core/kernels/maxpooling_op_gpu.cu.cc:
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNHWC< ::Eigen::half> ") is not allowed
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNCHW< ::Eigen::half> ") is not allowed
/b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code
4 errors detected in the compilation of "/tmp/tmpxft_00000011_00000000-6_maxpooling_op_gpu.cu.cpp1.ii".
ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: output 'tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o' was not created
ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: Couldn't build file tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o: not all outputs were created or valid
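For context, a hypothetical CUDA sketch (not taken from the fix itself) of the pattern nvcc rejects here: std::equal_to<float>::operator() is a host-only function, so device code must fall back to a built-in comparison.

    #include <functional>

    __global__ void equal_kernel(const float* a, const float* b, bool* out) {
      // *out = std::equal_to<float>()(*a, *b);  // nvcc error: calling a
      //                                         // __host__ function from a
      //                                         // __global__ function
      *out = (*a == *b);  // device-safe built-in comparison
    }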
The Packet16f, Packet8f and Packet8d types are too large to use with the small,
dynamically sized matrices typically processed by the SliceVectorizedTraversal
specialization of the dense_assignment_loop. Using these types there is likely to
yield little or no vectorization, and a significant slowdown in multiplying such
matrices can be observed when building with AVX and AVX512 enabled.
This patch introduces a new dense_assignment_kernel that is used when
computing small products whose operands have dynamic dimensions. It ensures that the
PacketSize used is no larger than 4, thereby increasing the chance that vectorized
instructions will be used when computing the product.
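For illustration, a sketch of the kind of product this patch targets, using one of the shapes measured below (the packet selection itself happens inside Eigen's assignment machinery and is not visible at this level):

    #include <Eigen/Dense>

    int main() {
      // M=4, K=8, N=6: one of the benchmarked shapes. With AVX512, the old
      // kernel could select Packet16f even though each 4-float column is far
      // shorter than a packet, leaving the product essentially unvectorized.
      Eigen::MatrixXf A(4, 8), B(8, 6);
      A.setRandom();
      B.setRandom();
      Eigen::MatrixXf C = A * B;  // small product with dynamic dimensions
      return C.size() == 24 ? 0 : 1;
    }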
I tested all 969 possible combinations of M, K, and N that are handled by the
dense_assignment_loop on x86 builds. Although a few combinations are slowed down
by this patch, they are far outnumbered by the cases that are sped up, as the
following results demonstrate (shapes are reported as M K N).
Disabling Packet8d on AVX512 builds:
Total Cases: 969
Better: 511
Worse: 85
Same: 373
Max Improvement: 169.00% (4 8 6)
Max Degradation: 36.50% (8 5 3)
Median Improvement: 35.46%
Median Degradation: 17.41%
Total FLOPs Improvement: 19.42%
Disabling Packet16f and Packet8f on AVX512 builds:
Total Cases: 969
Better: 658
Worse: 5
Same: 306
Max Improvement: 214.05% (8 6 5)
Max Degradation: 22.26% (16 2 1)
Median Improvement: 60.05%
Median Degradation: 13.32%
Total FLOPs Improvement: 59.58%
Disabling Packet8f on AVX builds:
Total Cases: 969
Better: 663
Worse: 96
Same: 210
Max Improvement: 155.29% (4 10 5)
Max Degradation: 35.12% (8 3 2)
Median Improvement: 34.28%
Median Degradation: 15.05%
Total FLOPs Improvement: 26.02%