eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	354f14293b	Fix double = bool !	2018-11-23 15:12:06 +01:00
Gael Guennebaud	a7842daef2	Fix several uninitialized members from ctor	2018-11-23 15:10:28 +01:00
Gael Guennebaud	a476054879	bug #1624 : improve matrix-matrix product on ARM 64, 20% speedup	2018-11-23 10:25:19 +01:00
Gael Guennebaud	c685fe9838	Move regression test to right unit test file	2018-11-21 15:59:47 +01:00
Gael Guennebaud	4b2cebade8	Workaround weird MSVC bug	2018-11-21 15:53:37 +01:00
Christoph Hertzberg	0ec8afde57	Fixed most conversion warnings in MatrixFunctions module	2018-11-20 16:23:28 +01:00
Gael Guennebaud	6a510fe69c	Make MaxPacketSize a true upper bound, even for fixed-size inputs	2018-11-16 11:25:32 +01:00
Gael Guennebaud	43c987b1c1	Add explicit regression test for bug #1622	2018-11-16 11:24:51 +01:00
Mark D Ryan	670d56441c	PR 544: Set requestedAlignment correctly for SliceVectorizedTraversals Commit `aa110e681b` optimised the multiplication of small dynamically sized matrices by restricting the packet size to a maximum of 4, increasing the chances that SIMD instructions are used in the computation. However, it introduced a mismatch between the packet size and the requestedAlignment. This mismatch can lead to crashes when the destination is not aligned. This patch fixes the issue by ensuring that the AssignmentTraits are correctly computed when using a restricted packet size. * * * Bind LinearPacketType to MaxPacketSize This commit applies any packet size limit specified when instantiating copy_using_evaluator_traits to the LinearPacketType, provided that the size of the destination is not known at compile time. * * * Add unit test for restricted packet assignment A new unit test is added to check that multiplication of small dynamically sized matrices works correctly when the packet size is restricted to 4 and the destination is unaligned.	2018-11-13 16:15:08 +01:00
Nikolaus Demmel	3dc0845046	Fix typo in comment on EIGEN_MAX_STATIC_ALIGN_BYTES	2018-11-14 18:11:30 +01:00
Gael Guennebaud	7fddc6a51f	typo	2018-11-14 14:43:18 +01:00
Gael Guennebaud	449f948b2a	help doxygen linking to DenseBase::NullaryExpr	2018-11-14 14:42:59 +01:00
Gael Guennebaud	4263f23c28	Improve doc on multi-threading and warn about hyper-threading	2018-11-14 14:42:29 +01:00
Gael Guennebaud	db529ae4ec	doxygen does not like \addtogroup and \ingroup in the same line	2018-11-14 14:42:06 +01:00
Rasmus Munk Larsen	72928a2c8a	Merged in rmlarsen/eigen2 (pull request PR-543) Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth. Approved-by: Eugene Zhulenev <ezhulenev@google.com>	2018-11-13 17:10:30 +00:00
Rasmus Munk Larsen	cda479d626	Remove accidental changes.	2018-11-12 18:34:04 -08:00
Rasmus Munk Larsen	719d9aee65	Add parallel memcpy to TensorThreadPoolDevice in Eigen, but limit the number of threads to 4, beyond which we just seem to be wasting CPU cycles as the threads contend for memory bandwidth.	2018-11-12 17:46:02 -08:00
Rasmus Munk Larsen	77b447c24e	Add optimized version of logistic function for float. As an example, this is about 50% faster than the existing version on Haswell using AVX.	2018-11-12 13:42:24 -08:00
Gael Guennebaud	c81bdbdadc	Add manual doc on STL-compatible iterators	2018-11-12 22:06:33 +01:00
Gael Guennebaud	0105146915	Fix warning in c++03	2018-11-10 09:11:38 +01:00
Rasmus Munk Larsen	93f9988a7e	A few small fixes to a) prevent throwing in ctors and dtors of the threading code, and b) supporting matrix exponential on platforms with 113 bits of mantissa for long doubles.	2018-11-09 14:15:32 -08:00
Gael Guennebaud	784a3f13cf	bug #1619 : fix mixing of const and non-const generic iterators	2018-11-09 21:45:10 +01:00
Gael Guennebaud	db9a9a12ba	bug #1619 : make const and non-const iterators compatible	2018-11-09 16:49:19 +01:00
Gael Guennebaud	fbd6e7b025	add missing ref to a.zeta(b)	2018-11-09 13:53:42 +01:00
Gael Guennebaud	dffd1e11de	Limit the size of the toc	2018-11-09 13:52:34 +01:00
Gael Guennebaud	a88e0a0e95	Update doxy hacks wrt doxygen 1.8.13/14	2018-11-09 13:52:10 +01:00
Gael Guennebaud	bd9a00718f	Let doxygen see lastN	2018-11-09 11:35:48 +01:00
Gael Guennebaud	d7c644213c	Add and update manual pages for slicing, indexing, and reshaping.	2018-11-09 11:35:27 +01:00
Gael Guennebaud	a368848473	Recent xcode versions do support EIGEN_HAS_STATIC_ARRAY_TEMPLATE	2018-11-09 10:33:17 +01:00
Gael Guennebaud	f62a0f69c6	Fix max-size in indexed-view	2018-11-08 18:40:22 +01:00
Gael Guennebaud	bf495859ff	Merged in glchaves/eigen (pull request PR-539) Vectorize row-by-row gebp loop iterations on 16 packets as well	2018-11-07 07:21:15 +00:00
Gael Guennebaud	995730fc6c	Add option to disable plot generation	2018-11-07 00:41:16 +01:00
Gustavo Lima Chaves	4ad359237a	Vectorize row-by-row gebp loop iterations on 16 packets as well Signed-off-by: Gustavo Lima Chaves <gustavo.lima.chaves@intel.com> Signed-off-by: Mark D. Ryan <mark.d.ryan@intel.com>	2018-11-06 10:48:42 -08:00
Gael Guennebaud	9d318b92c6	add unit tests for bug #1619	2018-11-01 15:14:50 +01:00
Matthieu Vigne	8d7a73e48e	bug #1617 : Fix SolveTriangular.solveInPlace crashing for empty matrix. This made FullPivLU.kernel() crash when used on the zero matrix. Add unit test for FullPivLU.kernel() on the zero matrix.	2018-10-31 20:28:18 +01:00
Christoph Hertzberg	66b28e290d	bug #1618 : Use different power-of-2 check to avoid MSVC warning	2018-11-01 13:23:19 +01:00
Rasmus Munk Larsen	07fcdd1438	Merged in ezhulenev/eigen-02 (pull request PR-534) Fix cxx11_tensor_{block_access, reduction} tests	2018-10-25 18:34:35 +00:00
Eugene Zhulenev	8a977c1f46	Fix cxx11_tensor_{block_access, reduction} tests	2018-10-25 11:31:29 -07:00
Halie Murray-Davis	fb62d6d96e	Fix typo in tutorial documentation.	2018-10-25 04:55:34 +00:00
Christoph Hertzberg	b5f077d22c	Document EIGEN_NO_IO preprocessor directive	2018-10-25 16:49:25 +02:00
Christian von Schultz	4a40b3785d	Collapsed revision (based on pull request PR-325) * Support compiling without IO streams Add the preprocessor definition EIGEN_NO_IO which, if defined, disables all use of the IO streams part of the standard library.	2018-10-22 21:14:40 +02:00
Rasmus Munk Larsen	14054e217f	Do not rely on the compiler generating __device__ functions for constexpr in Cuda (via EIGEN_CONSTEXPR_ARE_DEVICE_FUNC). This breaks several targets in the TensorFlow Cuda build, e.g., INFO: From Compiling tensorflow/core/kernels/maxpooling_op_gpu.cu.cc: /b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNHWC< ::Eigen::half> ") is not allowed /b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code" /b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: calling a __host__ function("std::equal_to<float> ::operator () const") from a __global__ function("tensorflow::_NV_ANON_NAMESPACE::MaxPoolGradBackwardNoMaskNCHW< ::Eigen::half> ") is not allowed /b/f/w/run/external/eigen_archive/Eigen/src/Core/arch/GPU/Half.h(197): error: identifier "std::equal_to<float> ::operator () const" is undefined in device code 4 errors detected in the compilation of "/tmp/tmpxft_00000011_00000000-6_maxpooling_op_gpu.cu.cpp1.ii". ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: output 'tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o' was not created ERROR: /tmpfs/tensor_flow/tensorflow/core/kernels/BUILD:3753:1: Couldn't build file tensorflow/core/kernels/_objs/pooling_ops_gpu/maxpooling_op_gpu.cu.pic.o: not all outputs were created or valid	2018-10-22 16:18:24 -07:00
Rasmus Munk Larsen	954b4ca9d0	Suppress compiler warning about unused global variable.	2018-10-22 13:48:56 -07:00
Rasmus Munk Larsen	9caafca550	Merged in rmlarsen/eigen (pull request PR-532) Only set EIGEN_CONSTEXPR_ARE_DEVICE_FUNC for clang++ if cxx_relaxed_constexpr is available.	2018-10-19 21:37:14 +00:00
Christoph Hertzberg	449ff74672	Fix most Doxygen warnings. Also add links to stable documentation from unsupported modules (by using the corresponding Doxytags file). Manually grafted from `d107a371c6`	2018-10-19 21:10:28 +02:00
Rasmus Munk Larsen	39fec15d5c	Merged eigen/eigen into default	2018-10-19 09:48:19 -07:00
Christoph Hertzberg	40fa6f98bf	bug #1606 : Explicitly set the standard before find_package(StandardMathLibrary). Also replace EIGEN_COMPILER_SUPPORT_CXX11 in favor of EIGEN_COMPILER_SUPPORT_CPP11. Grafted manually from `a4afa90d16`	2018-10-19 17:20:51 +02:00
Rasmus Munk Larsen	d8f285852b	Only set EIGEN_CONSTEXPR_ARE_DEVICE_FUNC for clang++ if cxx_relaxed_constexpr is available.	2018-10-18 16:55:02 -07:00
Rasmus Munk Larsen	dda68f56ec	Fix GPU build due to gpu_assert not always being defined.	2018-10-18 16:29:29 -07:00
Gael Guennebaud	1dcf5a6ed8	fix typo in doc	2018-10-17 09:29:36 +02:00

10204 Commits