eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-06 14:14:46 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	97e2c808e9	Fix avx512 plog(NaN) to return NaN instead of +inf	2018-10-11 10:13:13 +02:00
Gael Guennebaud	b3f66d29a5	Enable avx512 plog with clang	2018-10-11 10:12:21 +02:00
Gael Guennebaud	f0aa7e40fc	Fix regression in changeset `5335659c47`	2018-10-10 23:47:30 +02:00
Gael Guennebaud	ce243ee45b	bug #520 : add diagmat +/- diagmat operators.	2018-10-10 23:38:22 +02:00
Gael Guennebaud	5335659c47	Merged in ezhulenev/eigen-02 (pull request PR-525) Fix bug in partial reduction of expressions requiring evaluation	2018-10-10 20:59:00 +00:00
Gael Guennebaud	eec0dfd688	bug #632 : add specializations for res ?= dense +/- sparse and res ?= sparse +/- dense. They are rewritten as two compound assignment to by-pass hybrid dense-sparse iterator.	2018-10-10 22:50:15 +02:00
Eugene Zhulenev	8e6dc2c81d	Fix bug in partial reduction of expressions requiring evaluation	2018-10-10 13:23:52 -07:00
Eugene Zhulenev	2bf1a31d81	Use void type if stl-style iterators are not supported	2018-10-10 10:31:40 -07:00
Rasmus Munk Larsen	e8918743c1	Merged in ezhulenev/eigen-01 (pull request PR-523) Compile time detection for unimplemented stl-style iterators	2018-10-09 23:42:01 +00:00
Eugene Zhulenev	c0ca8a9fa3	Compile time detection for unimplemented stl-style iterators	2018-10-09 15:28:23 -07:00
Gael Guennebaud	1dd1f8e454	bug #65 : add vectorization of partial reductions along the outer-dimension, for instance: colmajor_mat.rowwise().mean()	2018-10-09 23:36:50 +02:00
Gael Guennebaud	bfa2a81a50	Make redux_vec_unroller more flexible regarding packet-type	2018-10-09 23:30:41 +02:00
Christoph Hertzberg	f6359ad795	Small Doxygen fixes	2018-10-09 19:33:35 +02:00
Gael Guennebaud	7a882c05ab	Fix compilation on CUDA	2018-10-09 17:02:16 +02:00
Gael Guennebaud	e00487f7d2	bug #1603 : add parenthesis around ternary operator in function body as well as a harmless attempt to make MSVC happy.	2018-10-08 22:27:04 +02:00
Gael Guennebaud	649d4758a6	merge	2018-10-08 17:35:18 +02:00
Gael Guennebaud	774bb9d6f7	fix a doxygen issue	2018-10-08 09:30:15 +02:00
Gael Guennebaud	bcb7c66b53	Workaround gcc's alloc-size-larger-than= warning	2018-10-07 21:55:59 +02:00
Gael Guennebaud	6512c5e136	Implement a better workaround for GCC's bug #87544	2018-10-07 15:00:05 +02:00
Gael Guennebaud	409132bb81	Workaround gcc bug making it trigger an invalid warning	2018-10-07 09:23:15 +02:00
Gael Guennebaud	c6a1ab4036	Workaround MSVC compilation issue	2018-10-06 13:49:17 +02:00
Gael Guennebaud	e21766c6f5	Clarify doc of rowwise/colwise/vectorwise.	2018-10-05 23:12:09 +02:00
Gael Guennebaud	d92f004ab7	Simplify API by removing allCols/allRows and reusing rowwise/colwise to define iterators over rows/columns	2018-10-05 23:11:21 +02:00
Gael Guennebaud	3e64b1fc86	Move iterators to internal, improve doc, make unit test c++03 friendly	2018-10-03 15:13:15 +02:00
Gael Guennebaud	2b2b4d0580	fix unused warning	2018-10-03 14:16:21 +02:00
Gael Guennebaud	5f26f57598	Change the logic of A.reshaped<Order>() to be a simple alias to A.reshaped<Order>(AutoSize,fix<1>). This means that now AutoOrder is allowed, and it always returns a column-vector.	2018-10-03 11:41:47 +02:00
Gael Guennebaud	0481900e25	Add pointer-based iterator for direct-access expressions	2018-10-02 23:44:36 +02:00
Gael Guennebaud	8c38528168	Factorize RowsProxy/ColsProxy and related iterators using subVector<>(Index)	2018-10-02 14:03:26 +02:00
Gael Guennebaud	12487531ce	Add templated subVector<Vertical/Horizontal>(Index) aliases to col/row(Index) methods (plus subVectors<>() to retrieve the number of rows/columns)	2018-10-02 14:02:34 +02:00
Gael Guennebaud	37e29fc893	Use Index instead of ptrdiff_t or int, fix random-accessors.	2018-10-02 13:29:32 +02:00
Gael Guennebaud	de2efbc43c	bug #1605 : workaround ABI issue with vector types (aka __m128) versus scalar types (aka float)	2018-10-01 23:45:55 +02:00
Gael Guennebaud	b0c66adfb1	bug #231 : initial implementation of STL iterators for dense expressions	2018-10-01 23:21:37 +02:00
Christoph Hertzberg	564ca71e39	Merged in deven-amd/eigen/HIP_fixes (pull request PR-518) PR with HIP specific fixes (for the eigen nightly regression failures in HIP mode)	2018-10-01 16:51:04 +00:00
Deven Desai	94898488a6	This commit contains the following (HIP specific) updates: - unsupported/Eigen/CXX11/src/Tensor/TensorReductionGpu.h Changing "pass-by-reference" argument to be "pass-by-value" instead (in a __global__ function decl). "pass-by-reference" arguments to __global__ functions are unwise, and will be explicitly flagged as errors by the newer versions of HIP. - Eigen/src/Core/util/Memory.h - unsupported/Eigen/CXX11/src/Tensor/TensorContraction.h Changes introduced in recent commits break the HIP compile. Adding EIGEN_DEVICE_FUNC attribute to some functions and calling ::malloc/free instead of the corresponding std:: versions to get the HIP compile working again - unsupported/Eigen/CXX11/src/Tensor/TensorReduction.h A change introduced in a recent commit breaks the HIP compile (link stage errors out due to failure to inline a function). Disabling the recently introduced code (only for HIP compile), to get the eigen nightly testing going again. Will submit another PR once we have the proper fix. - Eigen/src/Core/util/ConfigureVectorization.h Enabling GPU VECTOR support when HIP compiler is in use (for both the host and device compile phases)	2018-10-01 14:28:37 +00:00
Gael Guennebaud	af3ad4b513	oops, I've been too fast in previous copy/paste	2018-09-27 09:28:57 +02:00
Gael Guennebaud	24b163a877	#pragma GCC diagnostic push/pop is not supported prior to gcc 4.6	2018-09-27 09:23:54 +02:00
Gael Guennebaud	41c3a2ffc1	Fix documentation of reshape to vectors.	2018-09-25 16:35:44 +02:00
Christoph Hertzberg	2c083ace3e	Provide EIGEN_OVERRIDE and EIGEN_FINAL macros to mark virtual function overrides	2018-09-24 18:01:17 +02:00
Gael Guennebaud	626942d9dd	fix alignment issue in ploaddup for AVX512	2018-09-28 16:57:32 +02:00
Gael Guennebaud	84a1101b36	Merge with default.	2018-09-23 21:52:58 +02:00
Gael Guennebaud	795e12393b	Fix logic in diagonaldense product in a corner case. The problem was for: diag(1x1) mat(1,n)	2018-09-22 16:44:33 +02:00
Gael Guennebaud	bac36d0996	Demangle Traversal and Unrolling in Redux	2018-09-21 23:03:45 +02:00
Gael Guennebaud	1bf12880ae	Add reshaped<>() shortcuts when returning vectors and remove the reshaping version of operator()(all)	2018-09-21 16:50:04 +02:00
Gael Guennebaud	371068992a	Add more debug output	2018-09-21 14:32:39 +02:00
Gael Guennebaud	b00e48a867	Improve slice-vectorization logic for redux (significant speed-up for reduxion of blocks)	2018-09-21 13:45:56 +02:00
Gael Guennebaud	a488d59787	merge with default Eigen	2018-09-21 11:51:49 +02:00
Gael Guennebaud	47720e7970	Doc fixes	2018-09-21 11:48:22 +02:00
Gael Guennebaud	3ec2985914	Merged indexing cleanup (pull request PR-506)	2018-09-21 09:36:05 +00:00
Gael Guennebaud	651e5d4866	Fix EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF_VECTORIZABLE_FIXED_SIZE for AVX512 or AVX with malloc aligned on 8 bytes only. This change also makes it future proof for AVX1024	2018-09-21 10:33:22 +02:00
Gael Guennebaud	f0ef3467de	Fix doc	2018-09-20 22:57:28 +02:00
Gael Guennebaud	617f75f117	Add indexing namespace	2018-09-20 22:57:10 +02:00
Gael Guennebaud	0c56d22e2e	Fix shadowing	2018-09-20 22:56:21 +02:00
Gael Guennebaud	9419f506d0	Fix regression introduced by the previous fix for AVX512. It broke the complex-complex case on SSE.	2018-09-20 17:32:34 +02:00
Gael Guennebaud	e38d1ab4d1	Workaround increases required alignment warning	2018-09-20 17:07:33 +02:00
Gael Guennebaud	71496b0e25	Fix gebp kernel for real+complex in case only reals are vectorized (e.g., AVX512). This commit also removes "half-packet" from data-mappers: it was not used and conceptually broken anyways.	2018-09-20 17:01:24 +02:00
Gael Guennebaud	5a30eed17e	Fix warnings in AVX512	2018-09-20 16:58:51 +02:00
Gael Guennebaud	c3a19527a2	Fix doc wrt previous change	2018-09-19 11:49:26 +02:00
Gael Guennebaud	dfa8439e4d	Update reshaped API to use RowMajor/ColMajor directly as integral values instead of introducing RowOrder/ColOrder types. The API changed from A.reshaped(rows,cols,RowOrder) to A.template reshaped<RowOrder>(rows,cols).	2018-09-19 11:49:26 +02:00
Gael Guennebaud	297ca62319	ease transition by adding placeholders::all/last/end as deprecated	2018-09-17 16:24:52 +02:00
Gael Guennebaud	2014c7ae28	Move all, last, end from Eigen::placeholders namespace to Eigen::, and rename end to lastp1 to avoid conflicts with std::end.	2018-09-15 14:35:10 +02:00
Gael Guennebaud	82772e8d9d	Rename Symbolic namespace to symbolic to be consistent with numext namespace	2018-09-15 14:16:20 +02:00
Gael Guennebaud	3e8188fc77	bug #1600 : initialize m_info to InvalidInput by default, even though m_info is not accessible until it has been initialized (assert)	2018-09-18 21:24:48 +02:00
Christoph Hertzberg	d7378aae8e	Provide EIGEN_ALIGNOF macro, and give handmade_aligned_malloc the possibility for alignments larger than the standard alignment.	2018-09-14 20:17:47 +02:00
Gael Guennebaud	1141bcf794	Fix conjugate-gradient for very small rhs	2018-09-13 23:53:28 +02:00
Deven Desai	c64fe9ea1f	Updates to fix HIP-clang specific compile errors. Compiling the eigen unittests with hip-clang (HIP with clang as the underlying compiler instead of hcc or nvcc), results in compile errors. The changes in this commit fix those compile errors. The main change is to convert a few instances of "__device__" to "EIGEN_DEVICE_FUNC"	2018-08-30 20:22:16 +00:00
Gael Guennebaud	5927eef612	Enable std::result_of for msvc 2015 and later	2018-09-13 09:44:46 +02:00
Christoph Hertzberg	3adece4827	Fix misleading indentation of errorCode and make it loop-local	2018-09-12 14:41:38 +02:00
Christoph Hertzberg	7e9c9fbb2d	Disable type-limits warnings for g++ < 4.8	2018-09-12 14:40:39 +02:00
Justin Carpentier	4827bec776	LLT: correct doc and add missing reference for the return type of rankUpdate --- Eigen/src/Cholesky/LLT.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)	2018-09-11 09:33:21 +02:00
luz.paz	43fd42a33b	Fix doxy and misc. typos Found via `codespell -q 3 -I ../eigen-word-whitelist.txt` --- Eigen/src/Core/ProductEvaluators.h | 4 ++-- Eigen/src/Core/arch/GPU/Half.h | 2 +- Eigen/src/Core/util/Memory.h | 2 +- Eigen/src/Geometry/Hyperplane.h | 2 +- Eigen/src/Geometry/Transform.h | 2 +- Eigen/src/Geometry/Translation.h | 12 ++++++------ doc/PreprocessorDirectives.dox | 2 +- doc/TutorialGeometry.dox | 2 +- test/boostmultiprec.cpp | 2 +- test/triangular.cpp | 2 +- 10 files changed, 16 insertions(+), 16 deletions(-)	2018-08-01 21:34:47 -04:00
Jiandong Ruan	6dcd2642aa	bug #1526 - CUDA compilation fails on CUDA 9.x SDK when arch is set to compute_60 and/or above	2018-09-08 12:05:33 -07:00
Alexey Frunze	ec38f07b79	bug #1595 : Don't use C++11's std::isnan() in MIPS/MSA packet math. This removes reliance on C++11 and improves generated code.	2018-09-06 15:40:09 -07:00
cgs1019	c6066ac411	Make param name and docs consistent for JacobiRotation::makeGivens Previously the rendered math in the doc string called the optional return value 'r', while the actual parameter and the doc string text referred to the parameter as 'z'. This changeset renames all the z's to r's to match the math.	2018-09-06 11:04:17 -04:00
Christoph Hertzberg	ddbc564386	Fixed a few more shadowing warnings when compiling with g++ (and c++03)	2018-08-30 16:33:03 +02:00
Mehdi Goli	7ec8b40ad9	Collapsed revision * Separating SYCL math function. * Converting function overload to function specialisation. * Applying the suggested design.	2018-08-28 14:20:48 +01:00
Christoph Hertzberg	73ca600bca	Fix numerous shadow-warnings for GCC<=4.8	2018-08-28 18:32:39 +02:00
Christoph Hertzberg	ef4d79fed8	Disable/ReenableStupidWarnings did not work properly, when included recursively	2018-08-28 18:26:22 +02:00
Gael Guennebaud	befaf83f5f	bug #1590 : fix collision with some system headers defining the macro FP32	2018-08-28 13:21:28 +02:00
Christoph Hertzberg	42f3ee4fb8	Old gcc versions have problems with recursive #pragma GCC diagnostic push/pop Workaround: Don't include "DisableStupidWarnings.h" before including other main-headers	2018-08-28 11:44:15 +02:00
Alexey Frunze	050bcf6126	bug #1584 : Improve random (avoid undefined behavior).	2018-08-08 20:19:32 -07:00
Christoph Hertzberg	ad4a08fb68	Use Intel cast intrinsics, since MSVC does not allow direct casting. Reported by David Winkler.	2018-08-24 19:04:33 +02:00
Christoph Hertzberg	41f1cc67b8	Assertion depended on a not yet initialized value	2018-08-17 16:42:53 +02:00
Christoph Hertzberg	595cae9b09	Silence logical-op-parentheses warning	2018-08-17 16:30:32 +02:00
Justin Carpentier	eabc7a4031	PR 465: Fix issue in RowMajor assignment in plain_matrix_type_row_major::type The type should be RowMajor	2018-08-10 14:30:06 +02:00
Rasmus Munk Larsen	c49e93440f	SuiteSparse defines the macro SuiteSparse_long to control what type is used for 64bit integers. The default value of this macro on non-MSVC platforms is long and __int64 on MSVC. CholmodSupport defaults to using long for the long variants of CHOLMOD functions. This creates problems when SuiteSparse_long is different than long. So the correct thing to do here is to use SuiteSparse_long as the type instead of long.	2018-08-13 15:53:31 -07:00
Gael Guennebaud	3ec60215df	Merged in rmlarsen/eigen2 (pull request PR-466) Move sigmoid functor to core and rename it to 'logistic'.	2018-08-13 21:28:20 +00:00
Rasmus Munk Larsen	d6e283ba96	sigmoid -> logistic	2018-08-13 11:14:50 -07:00
Rasmus Munk Larsen	bfc5091dd5	Cast diagonalSize to RealScalar instead of Scalar.	2018-08-09 14:46:17 -07:00
Rasmus Munk Larsen	8603d80029	Cast diagonalSize() to Scalar before multiplication. Without this, automatic differentiation in Ceres breaks because Scalar is a custom type that does not support multiplication by Index.	2018-08-09 11:09:10 -07:00
Mehdi Goli	67711eaa31	Fixing typo.	2018-08-08 11:38:10 +01:00
Mehdi Goli	22031ab59a	Adding EIGEN_UNROLL_LOOP macro.	2018-08-08 11:07:27 +01:00
Rasmus Munk Larsen	fa68342ef8	Move sigmoid functor to core.	2018-08-03 17:31:23 -07:00
Rasmus Munk Larsen	7f8b53fd0e	bug #1580 : Fix cuda clang build. STL is not supported, so std::equal_to and std::not_equal breaks compilation. Update the definition of EIGEN_CONSTEXPR_ARE_DEVICE_FUNC to exclude clang. See also PR 450.	2018-08-01 12:36:24 -07:00
Mehdi Goli	01358300d5	Creating separate SYCL required PR for uncontroversial files.	2018-08-03 16:59:15 +01:00
Gael Guennebaud	62169419ab	Fix two regressions introduced in previous merges: bad usage of EIGEN_HAS_VARIADIC_TEMPLATES and linking issue.	2018-08-01 23:35:34 +02:00
Benoit Steiner	17221115c9	Merged in codeplaysoftware/eigen-upstream-pure/eigen_variadic_assert (pull request PR-447) Adding variadic version of assert which can take a parameter pack as its input.	2018-08-01 16:41:54 +00:00
Mehdi Goli	af96018b49	Using the suggested modification.	2018-08-01 16:04:44 +01:00
Mehdi Goli	c84509d7cc	Adding new arch/SYCL headers, used for SYCL vectorization.	2018-08-01 12:40:54 +01:00
Mehdi Goli	3a197a60e6	variadic version of assert which can take a parameter pack as its input.	2018-08-01 12:19:14 +01:00
Alexey Frunze	7b91c11207	bug #1578 : Improve prefetching in matrix multiplication on MIPS.	2018-07-24 18:36:44 -07:00
Mark D Ryan	bc615e4585	Re-enable FMA for fast sqrt functions	2018-07-30 13:21:00 +02:00
Mark D Ryan	e79c5149bf	Fix AVX512 implementations of psqrt This commit fixes the AVX512 implementations of psqrt in the same way that `3ed67cb0bb` fixed the AVX2 version of this function. The AVX512 versions of psqrt incorrectly return -0.0 for negative values, instead of NaN. Fixing the issues requires adding some additional instructions that slow down the algorithms. A similar test to the one used in `3ed67cb0bb` shows that the corrected Packet16f code runs at 73% of the speed of the existing code, while the corrected Packet8d function runs at 68% of the original.	2018-06-25 05:05:02 -07:00
Rasmus Munk Larsen	2ebcb911b2	Add pcast packet op for NEON.	2018-07-26 14:28:48 -07:00
Christoph Hertzberg	fd4fe7cbc5	Fixed issue which made documentation not getting built anymore	2018-07-24 22:56:15 +02:00
Gael Guennebaud	4ca3e48f42	fix typo	2018-07-23 16:51:57 +02:00
Gael Guennebaud	c747cde69a	Add lastN shortcuts to seq/seqN.	2018-07-23 16:20:25 +02:00
Eugene Zhulenev	2bf864f1eb	Disable type traits for stdlibc++ <= 4.9.3	2018-07-20 10:11:44 -07:00
Gael Guennebaud	509a5fa77f	Fix IsRelocatable without C++11	2018-07-19 18:47:38 +02:00
Gael Guennebaud	2ca2592009	Fix determination of EIGEN_HAS_TYPE_TRAITS	2018-07-19 18:47:18 +02:00
Gael Guennebaud	5e5987996f	Fix stupid error in Quaternion move ctor	2018-07-19 18:33:53 +02:00
Alexey Frunze	1f523e7304	Add MIPS changes missing from previous merge.	2018-07-18 12:27:50 -07:00
Eugene Zhulenev	086ded5c85	Disable type traits for GCC < 5.1.0	2018-07-18 16:32:55 -07:00
Gael Guennebaud	863580fe88	bug #1432 : fix conservativeResize for non-relocatable scalar types. For those we need to by-pass realloc routines and fall-back to allocate as new - copy - delete. The remaining problem is that we don't have any mechanism to accurately determine whether a type is relocatable or not, so currently let's be super conservative using either RequireInitialization or std::is_trivially_copyable	2018-07-18 23:33:07 +02:00
Gael Guennebaud	a503fc8725	bug #1575 : fix regression introduced in bug #1573 patch. Move ctor/assignment should not be defaulted.	2018-07-18 23:26:13 +02:00
Gael Guennebaud	308725c3c9	More clearly disable the inclusion of src/Core/arch/CUDA/Complex.h without CUDA	2018-07-18 13:51:36 +02:00
Deven Desai	f124f07965	applying EIGEN_DECLARE_TEST to gpu tests Also, a few minor fixes for GPU tests running in HIP mode. 1. Adding an include for hip/hip_runtime.h in the Macros.h file For HIP __host__ and __device__ are macros which are defined in hip headers. Their definitions need to be included before their use in the file. 2. Fixing the compile failure in TensorContractionGpu introduced by the commit to "Fuse computations into the Tensor contractions using output kernel" 3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit	2018-07-17 14:16:48 -04:00
Gael Guennebaud	2b2cd85694	bug #1573 : add noexcept move constructor and move assignment operator to Quaternion	2018-07-17 11:11:33 +02:00
Gael Guennebaud	5539587b1f	Some warning fixes	2018-07-17 10:29:12 +02:00
Gael Guennebaud	40797dbea3	bug #1572 : use c++11 atomic instead of volatile if c++11 is available, and disable multi-threaded GEMM on non-x86 without c++11.	2018-07-17 00:11:20 +02:00
Gael Guennebaud	a87cff20df	Fix GeneralizedEigenSolver when requesting for eigenvalues only.	2018-07-14 09:38:49 +02:00
Rasmus Munk Larsen	4a3952fd55	Relax the condition to not only work on Android.	2018-07-13 11:24:07 -07:00
Rasmus Munk Larsen	02a9443db9	Clang produces incorrect Thumb2 assembler when using alloca. Don't define EIGEN_ALLOCA when generating Thumb with clang.	2018-07-13 11:03:04 -07:00
Gael Guennebaud	20991c3203	bug #1571 : fix is_convertible<from,to> with "from" a reference.	2018-07-13 17:47:28 +02:00
Gael Guennebaud	86d9c0255c	Forward declaring std::array does not work with all std libs, so let's just include <array>	2018-07-13 13:06:44 +02:00
Alexey Frunze	3875fb05aa	Add support for MIPS SIMD (MSA)	2018-07-06 16:04:30 -07:00
Gael Guennebaud	5c73c9223a	Fix shadowing typedefs	2018-07-12 17:01:07 +02:00
Gael Guennebaud	98728312c8	Fix compilation regarding std::array	2018-07-12 17:00:37 +02:00
Gael Guennebaud	eb3d8f68bb	fix unused warning	2018-07-12 16:59:47 +02:00
Gael Guennebaud	006e18e52b	Cleanup the mess in Eigen/Core by moving CUDA/HIP stuff at more appropriate places (Macros.h), and alignment/vectorization logic is now in util/ConfigureVectorization.h	2018-07-12 16:57:41 +02:00
Julian Kent	6d451cf2b6	Add missing consts for rows and cols functions in SparseLU	2018-02-10 13:44:05 +01:00
Gael Guennebaud	8bdb214fd0	remove double ;;	2018-07-12 11:17:53 +02:00
Gael Guennebaud	a9060378d3	bug #1570 : fix warning	2018-07-12 11:07:09 +02:00
Gael Guennebaud	da0c604078	Merged in deven-amd/eigen (pull request PR-402) Adding support for using Eigen in HIP kernels.	2018-07-12 08:07:16 +00:00
Gael Guennebaud	a4ea611ca7	Remove useless specialization thanks to is_convertible being more robust.	2018-07-12 09:59:44 +02:00
Gael Guennebaud	8ef267ccbd	spellcheck	2018-07-12 09:58:29 +02:00
Gael Guennebaud	21cf4a1a8b	Make is_convertible more robust and conformant to std::is_convertible	2018-07-12 09:57:19 +02:00
Gael Guennebaud	8a5955a052	Optimize the product of a householder-sequence with the identity, and optimize the evaluation of a HouseholderSequence to a dense matrix using faster blocked product.	2018-07-11 17:16:50 +02:00
Gael Guennebaud	d193cc87f4	Fix regression in `9357838f94`	2018-07-11 17:09:23 +02:00
Gael Guennebaud	fb33687736	Fix double ;;	2018-07-11 17:08:30 +02:00
Deven Desai	876f392c39	Updates corresponding to the latest round of PR feedback The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.	2018-07-11 10:39:54 -04:00
Deven Desai	471cfe5ff7	renaming CUDA* to GPU* for some header files	2018-07-11 09:22:04 -04:00
Deven Desai	38807a2575	merging updates from upstream	2018-07-11 09:17:33 -04:00
Gael Guennebaud	f00d08cc0a	Optimize extraction of Q in SparseQR by exploiting the structure of the identity matrix.	2018-07-11 14:01:47 +02:00
Gael Guennebaud	1625476091	Add internal::is_identity compile-time helper	2018-07-11 14:00:24 +02:00
Gael Guennebaud	fe723d6129	Fix conversion warning	2018-07-10 09:10:32 +02:00
Gael Guennebaud	9357838f94	bug #1543 : improve linear indexing for general block expressions	2018-07-10 09:10:15 +02:00
Gael Guennebaud	de9e31a06d	Introduce the macro ei_declare_local_nested_eval to help allocating on the stack local temporaries via alloca, and let outer-products makes a good use of it. If successful, we should use it everywhere nested_eval is used to declare local dense temporaries.	2018-07-09 15:41:14 +02:00
Gael Guennebaud	ec323b7e66	Skip null numerators in triangular-vector-solve (as in BLAS TRSV).	2018-07-09 11:13:19 +02:00
Gael Guennebaud	359dd77ec3	Fix legitimate "declaration shadows a typedef" warning	2018-07-09 11:03:39 +02:00
Mark D Ryan	90a53ca6fd	Fix the Packet16h version of ptranspose The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was reordering the PacketBlock argument incorrectly. This led to errors in the multiplication of matrices composed of 16 bit floats on AVX512 machines, if at least one of the matrices was using RowMajor order. This error is responsible for one tensorflow unit test failure on AVX512 machines: //tensorflow/python/kernel_tests:batch_matmul_op_test	2018-06-16 15:13:06 -07:00
Gael Guennebaud	1f54164eca	Fix a few issues with Packet16h	2018-07-07 00:15:07 +02:00
Gael Guennebaud	f2dc048df9	complete implementation of Packet16h (AVX512)	2018-07-06 17:43:11 +02:00
Gael Guennebaud	f4d623ffa7	Complete Packet8h implementation and test it in packetmath unit test	2018-07-06 17:13:36 +02:00
Deven Desai	b6cc0961b1	updates based on PR feedback There are two major changes (and a few minor ones which are not listed here...see PR discussion for details) 1. Eigen::half implementations for HIP and CUDA have been merged. This means that - `CUDA/Half.h` and `HIP/hcc/Half.h` got merged to a new file `GPU/Half.h` - `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged to a new file `GPU/PacketMathHalf.h` - `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged to a new file `GPU/TypeCasting.h` After this change the `HIP/hcc` directory only contains one file `math_constants.h`. That will go away too once that file becomes a part of the HIP install. 2. new macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added and the code has been updated to use them where appropriate. - `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)` - `EIGEN_GPU_DEVICE_COMPILE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)` - `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 or EIGEN_HAS_HIP_FP16)`	2018-06-14 10:21:54 -04:00
Deven Desai	ba972fb6b4	moving Half headers from CUDA dir to GPU dir, removing the HIP versions	2018-06-13 12:26:18 -04:00
Deven Desai	d1d22ef0f4	syncing this fork with upstream	2018-06-13 12:09:52 -04:00
Benoit Steiner	d3a380af4d	Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403) Derivative of the incomplete Gamma function and the sample of a Gamma random variable Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>	2018-06-11 17:57:47 +00:00
Andrea Bocci	f7124b3e46	Extend CUDA support to matrix inversion and selfadjointeigensolver	2018-06-11 18:33:24 +02:00
Gael Guennebaud	0537123953	bug #1565 : help MSVC to generate not too bad ASM in reductions.	2018-07-05 09:21:26 +02:00
Gael Guennebaud	6a241bd8ee	Implement custom inplace triangular product to avoid a temporary	2018-07-03 14:02:46 +02:00
Gael Guennebaud	3ae2083e23	Make is_same_dense compatible with different scalar types.	2018-07-03 13:21:43 +02:00
Gael Guennebaud	047677a08d	Fix regression in changeset `f05dea6b23` : computeFromHessenberg can take any expression for matrixQ, not only an HouseholderSequence.	2018-07-02 12:18:25 +02:00
Gael Guennebaud	d625564936	Simplify redux_evaluator using inheritance, and properly rename parameters in reducers.	2018-07-02 11:50:41 +02:00
Gael Guennebaud	d428a199ab	bug #1562 : optimize evaluation of small products of the form sAB by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x...	2018-07-02 11:41:09 +02:00
Gael Guennebaud	0cdacf3fa4	update comment	2018-06-29 11:28:36 +02:00
Gael Guennebaud	9a81de1d35	Fix order of EIGEN_DEVICE_FUNC and returned type	2018-06-28 00:20:59 +02:00
Gael Guennebaud	f9d337780d	First step towards a generic vectorised quaternion product	2018-06-25 14:26:51 +02:00
Gael Guennebaud	ee5864f72e	bug #1560 fix product with a 1x1 diagonal matrix	2018-06-25 10:30:12 +02:00
Rasmus Munk Larsen	bda71ad394	Fix typo in pbend for AltiVec.	2018-06-22 15:04:35 -07:00
Gael Guennebaud	d6813fb1c5	bug #1531 : expose NumDimensions for solve and sparse expressions.	2018-06-08 16:55:10 +02:00
Gael Guennebaud	89d65bb9d6	bug #1531 : expose NumDimensions for compatibility with Tensor	2018-06-08 16:50:17 +02:00
Gael Guennebaud	f05dea6b23	bug #1550 : prevent avoidable memory allocation in RealSchur	2018-06-08 10:14:57 +02:00
Benoit Steiner	522d3ca54d	Don't use std::equal_to inside cuda kernels since it's not supported.	2018-06-07 13:02:07 -07:00
Christoph Hertzberg	7d7bb91537	Missing line during manual rebase of PR-374	2018-06-07 20:30:09 +02:00
Michael Figurnov	30fa3d0454	Merge from eigen/eigen	2018-06-07 17:57:56 +01:00
Gael Guennebaud	af7c83b9a2	Fix warning	2018-06-07 15:45:24 +02:00
Gael Guennebaud	7fe29aceeb	Fix MSVC warning C4290: C++ exception specification ignored except to indicate a function is not __declspec(nothrow)	2018-06-07 15:36:20 +02:00
Christoph Hertzberg	e5f9f4768f	Avoid unnecessary C++11 dependency	2018-06-07 15:03:50 +02:00
Gael Guennebaud	b3fd93207b	Fix typos found using codespell	2018-06-07 14:43:02 +02:00
Michael Figurnov	4bd158fa37	Derivative of the incomplete Gamma function and the sample of a Gamma random variable. In addition to igamma(a, x), this code implements: * igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter * gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.	2018-06-06 18:49:26 +01:00
Deven Desai	8fbd47052b	Adding support for using Eigen in HIP kernels. This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs. Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, for e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to using the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor) Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.	2018-06-06 10:12:58 -04:00
Michael Figurnov	f216854453	Exponentially scaled modified Bessel functions of order zero and one. The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x) The code is ported from Cephes and tested against SciPy.	2018-05-31 15:34:53 +01:00
Gael Guennebaud	647b724a36	Define pcast<> for SSE types even when AVX is enabled. (otherwise float are silently reinterpreted as int instead of being converted)	2018-05-29 20:46:46 +02:00
Gael Guennebaud	49262dfee6	Fix compilation and SSE support with PGI compiler	2018-05-29 15:09:31 +02:00
Gael Guennebaud	f0862b062f	Fix internal::is_integral<size_t/ptrdiff_t> with MSVC 2013 and older.	2018-05-22 19:29:51 +02:00
Gael Guennebaud	36e413a534	Workaround a MSVC 2013 compilation issue with MatrixBase(Index,int)	2018-05-22 18:51:35 +02:00
Gael Guennebaud	725bd92903	fix stupid typo	2018-05-18 17:46:43 +02:00
Gael Guennebaud	a382bc9364	is_convertible<T,Index> does not seem to work well with MSVC 2013, so let's rather use __is_enum(T) for old MSVC versions	2018-05-18 17:02:27 +02:00
Gael Guennebaud	4dd767f455	add some internal checks	2018-05-18 13:59:55 +02:00
Mark D Ryan	405859f18d	Set EIGEN_IDEAL_MAX_ALIGN_BYTES correctly for AVX512 builds bug #1548 The macro EIGEN_IDEAL_MAX_ALIGN_BYTES is being incorrectly set to 32 on AVX512 builds. It should be set to 64. In the current code it is only set to 64 if the macro EIGEN_VECTORIZE_AVX512 is defined. This macro does get defined in AVX512 builds in Core, but only after Macros.h, the file that defines EIGEN_IDEAL_MAX_ALIGN_BYTES, has been included. This commit fixes the issue by setting EIGEN_IDEAL_MAX_ALIGN_BYTES to 64 if __AVX512F__ is defined.	2018-05-17 17:04:00 +01:00
Gael Guennebaud	7134fa7a2e	Fix compilation with MSVC by reverting to char* for _mm_prefetch except for PGI (the latter being the one that has the wrong prototype).	2018-06-07 09:33:10 +02:00
Robert Lukierski	b2053990d0	Adding EIGEN_DEVICE_FUNC to Products, especially Dense2Dense Assignment specializations. Otherwise causes problems with small fixed size matrix multiplication (call to 0x00 in call_assignment_no_alias in debug mode or trap in release with CUDA 9.1).	2018-03-14 16:19:43 +00:00
Jeff Trull	9f0c5c3669	Make sparse QR result sizes consistent with dense QR, with the following rules: 1) Q is always square 2) QRP' is valid and recovers the original matrix This implies that the size of Q is the number of rows in the original matrix, square, and that the size of R is the size of the original matrix.	2018-02-15 15:00:31 -08:00
Christoph Hertzberg	d655900953	bug #1544 : Generate correct Q matrix in complex case. Original patch was by Jeff Trull in PR-386.	2018-05-17 19:17:01 +02:00
Christoph Hertzberg	0272f2451a	Fix "suggest parentheses around comparison" warning	2018-05-15 19:35:53 +02:00
Gael Guennebaud	6e7118265d	Fix compilation with NEON+MSVC	2018-04-26 10:50:41 +02:00
Gael Guennebaud	8810baaed4	Add multi-threading for sparse-row-major * dense-row-major	2018-04-25 10:14:48 +02:00
Gael Guennebaud	e8ca5166a9	bug #1428 : attempt to make NEON vectorization compilable by MSVC. The workaround is to wrap NEON packet types to make them different c++ types.	2018-04-24 11:19:49 +02:00
Benoit Steiner	6f5935421a	fix AVX512 plog	2018-04-23 15:49:26 +00:00
Gael Guennebaud	e9da464e20	Add specializations of is_arithmetic for long long in c++11	2018-04-23 16:26:29 +02:00

5877 Commits