Commit Graph

5906 Commits

Author SHA1 Message Date
Gael Guennebaud
7ef879f6bf GEBP: improves pipelining in the 1pX4 path with FMA.
Prior to this change, a product with a LHS having 8 rows was faster with AVX-only than with AVX+FMA.
With AVX+FMA I measured a speed up of about x1.25 in such cases.
2019-01-30 23:45:12 +01:00
Gael Guennebaud
de77bf5d6c Fix compilation with ARM64. 2019-01-30 16:48:20 +01:00
Gael Guennebaud
eb4c6bb22d Fix conflicts and merge 2019-01-30 15:57:08 +01:00
Gael Guennebaud
df12fae8b8 According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89101, the previous GCC issue is fixed in GCC trunk (will be gcc 9). 2019-01-30 11:52:28 +01:00
Gael Guennebaud
3775926bba ARM64 & GEBP: add specialization for double +30% speed up 2019-01-30 11:49:06 +01:00
Gael Guennebaud
be5b0f664a ARM64 & GEBP: Make use of vfmaq_laneq_f32 and workaround GCC's issue in generating good ASM 2019-01-30 11:48:25 +01:00
Gael Guennebaud
8a06c699d0 bug #1669: fix PartialPivLU/inverse with zero-sized matrices. 2019-01-29 10:27:13 +01:00
Gael Guennebaud
a2a07e62b9 Fix compilation with c++03 (local classes cannot be template arguments), and make SparseMatrix::assignDiagonal truly protected. 2019-01-29 10:10:07 +01:00
Gael Guennebaud
f489f44519 bug #1574: implement "sparse_matrix =,+=,-= diagonal_matrix" with smart insertion strategies of missing diagonal coeffs. 2019-01-28 17:29:50 +01:00
Gael Guennebaud
803fa79767 Move evaluator<SparseCompressedBase>::find(i,j) to a more general and reusable SparseCompressedBase::lower_bound(i,j) function 2019-01-28 17:24:44 +01:00
Christoph Hertzberg
5a52e35f9a Renaming some more I identifiers 2019-01-26 13:18:21 +01:00
Rasmus Munk Larsen
71429883ee Fix compilation error in NEON GEBP specialization of madd. 2019-01-25 17:00:21 -08:00
Gael Guennebaud
ec8a387972 cleanup 2019-01-24 10:24:45 +01:00
David Tellenbach
237b03b372 PR 574: use variadic template instead of initializer_list to implement fixed-size vector ctor from coefficients. 2019-01-23 00:07:19 +01:00
Gael Guennebaud
80f81f9c4b Cleanup SFINAE in Array/Matrix(initializer_list) ctors and minor doc editing. 2019-01-22 17:08:47 +01:00
David Tellenbach
db152b9ee6 PR 572: Add initializer list constructors to Matrix and Array (include unit tests and doc)
- {1,2,3,4,5,...} for fixed-size vectors only
- {{1,2,3},{4,5,6}} for the general cases
- {{1,2,3,4,5,....}} is allowed for both row and column-vector
2019-01-21 16:25:57 +01:00
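The nested-brace form `{{1,2,3},{4,5,6}}` described above can be illustrated with a minimal sketch. `ToyMatrix` is a hypothetical toy type written for this illustration only; it is not Eigen's `Matrix` and does not reflect how Eigen implements these constructors. It assumes row-major storage and that each inner list is one row.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical toy type for illustration; not Eigen's Matrix class.
class ToyMatrix {
public:
  // Accepts {{1,2,3},{4,5,6}}: each inner list is interpreted as one row.
  ToyMatrix(std::initializer_list<std::initializer_list<double>> rows)
      : rows_(rows.size()),
        cols_(rows.size() ? rows.begin()->size() : 0) {
    data_.reserve(rows_ * cols_);
    for (const auto& row : rows) {
      // All rows must have the same number of coefficients.
      assert(row.size() == cols_);
      data_.insert(data_.end(), row.begin(), row.end());
    }
  }

  std::size_t rows() const { return rows_; }
  std::size_t cols() const { return cols_; }

  // Row-major element access (an assumption of this sketch).
  double operator()(std::size_t i, std::size_t j) const {
    return data_[i * cols_ + j];
  }

private:
  std::size_t rows_;
  std::size_t cols_;
  std::vector<double> data_;
};
```

With this sketch, `ToyMatrix m{{1, 2, 3}, {4, 5, 6}};` yields a 2x3 object, and `{{1, 2, 3, 4, 5}}` yields a single row of five coefficients.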
nluehr
92774f0275 Replace host_define.h with cuda_runtime_api.h 2019-01-18 16:10:09 -06:00
Christoph Hertzberg
da0a41b9ce Mask unused-parameter warnings, when building with NDEBUG 2019-01-18 10:41:14 +01:00
Rasmus Munk Larsen
2eccbaf3f7 Add missing logical packet ops for GPU and NEON. 2019-01-17 17:45:08 -08:00
Gael Guennebaud
ee3662abc5 Remove some useless const_cast 2019-01-17 18:27:49 +01:00
Gael Guennebaud
0fe6b7d687 Make nestByValue work again (broken since 3.3) and add unit tests. 2019-01-17 18:27:25 +01:00
Gael Guennebaud
4b7cf7ff82 Extend reshaped unit tests and remove useless const_cast 2019-01-17 17:35:32 +01:00
Gael Guennebaud
b57c9787b1 Cleanup useless const_cast and add missing broadcast assignment tests 2019-01-17 16:55:42 +01:00
Gael Guennebaud
be05d0030d Make FullPivLU use conjugateIf<> 2019-01-17 12:01:00 +01:00
Patrick Peltzer
15e53d5d93 PR 567: makes all dense solvers inherit SolverBase (LU,Cholesky,QR,SVD).
This changeset also includes:
 * add HouseholderSequence::conjugateIf
 * define int as the StorageIndex type for all dense solvers
 * dedicated unit tests, including assertion checking
 * _check_solve_assertion(): this method can be implemented in derived solver classes to implement custom checks
 * CompleteOrthogonalDecompositions: add applyZOnTheLeftInPlace, fix scalar type in applyZAdjointOnTheLeftInPlace(), add missing assertions
 * Cholesky: add missing assertions
 * FullPivHouseholderQR: Corrected Scalar type in _solve_impl()
 * BDCSVD: Unambiguous return type for ternary operator
 * SVDBase: Corrected Scalar type in _solve_impl()
2019-01-17 01:17:39 +01:00
Gael Guennebaud
7f32109c11 Add conjugateIf<bool> members to DenseBase, TriangularView, SelfadjointView, and make PartialPivLU use it. 2019-01-17 11:33:43 +01:00
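The idea of a conjugation selected by a compile-time boolean can be sketched on plain `std::complex` values. The free function `conjugate_if` below is a hypothetical stand-in written for this illustration; it is not the Eigen member function and makes no claim about how Eigen implements it.

```cpp
#include <complex>

// Hypothetical free function for illustration; not Eigen's conjugateIf member.
// Returns the complex conjugate of z when Cond is true, and z unchanged otherwise.
template <bool Cond, typename T>
std::complex<T> conjugate_if(const std::complex<T>& z) {
  // Cond is a compile-time constant, so the compiler can drop the unused branch.
  if (Cond) {
    return std::conj(z);
  }
  return z;
}
```

A caller that must handle both a plain and an adjoint code path can then pass its own boolean template parameter through, instead of duplicating the code for the two cases.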
Gael Guennebaud
562985bac4 bug #1646: fix false aliasing detection for A.row(0) = A.col(0);
This changeset completely disables the detection for vectors, for which our current mechanism cannot detect any positive aliasing anyway.
2019-01-17 00:14:27 +01:00
Rasmus Munk Larsen
7401e2541d Fix compilation error for logical packet ops with older compilers. 2019-01-16 14:43:33 -08:00
Gael Guennebaud
0f028f61cb GEBP: fix swapped kernel mode with AVX512 and complex scalars 2019-01-16 22:26:38 +01:00
Gael Guennebaud
e118ce86fd GEBP: cleanup logic to choose between 4 packets or 1 packet 2019-01-16 21:47:42 +01:00
Gael Guennebaud
70e133333d bug #1661: fix regression in GEBP and AVX512 2019-01-16 21:22:20 +01:00
Gael Guennebaud
502f717980 bug #1646: disable aliasing detection for empty and 1x1 expressions 2019-01-16 14:33:45 +01:00
Gael Guennebaud
0b466b6933 bug #1633: use proper type for madd temporaries, factorize RhsPacketx4. 2019-01-16 13:50:13 +01:00
Renjie Liu
dbfcceabf5 Bug: 1633: refactor gebp kernel and optimize for neon 2019-01-16 12:51:36 +08:00
Gael Guennebaud
2b70b2f570 Make Transform::rotation() an alias to Transform::linear() in the case of an Isometry 2019-01-15 22:50:42 +01:00
Gael Guennebaud
2c2c114995 Silence maybe-uninitialized warnings by gcc 2019-01-15 16:53:15 +01:00
Gael Guennebaud
6ec6bf0b0d Enable visitor on empty matrices (the visitor is left unchanged), and protect min/maxCoeff(Index*,Index*) on empty matrices by an assertion (+ doc & unit tests) 2019-01-15 15:21:14 +01:00
Gael Guennebaud
027e44ed24 bug #1592: makes partial min/max reductions trigger an assertion on inputs with a zero reduction length (+doc and tests) 2019-01-15 15:13:24 +01:00
Gael Guennebaud
f8bc5cb39e Fix detection of vector-at-time: use Rows/Cols instead of MaxRow/MaxCols.
This fixes VectorXd(n).middleCol(0,0).outerSize() which was equal to 1.
2019-01-15 15:09:49 +01:00
Gael Guennebaud
6cf7afa3d9 Typo 2019-01-15 11:04:37 +01:00
Rasmus Larsen
7b3aab0936 Merged in rmlarsen/eigen (pull request PR-570)
Add support for inverse hyperbolic functions. Fix cost of division.
2019-01-14 21:31:33 +00:00
Gael Guennebaud
250dcd1fdb bug #1652: fix position of EIGEN_ALIGN16 attributes in Neon and Altivec 2019-01-14 21:45:56 +01:00
Rasmus Larsen
5a59452aae Merged eigen/eigen into default 2019-01-14 10:23:23 -08:00
Gael Guennebaud
3c9e6d206d AVX512: fix pgather/pscatter for Packet4cd and unaligned pointers 2019-01-14 17:57:28 +01:00
Gael Guennebaud
61b6eb05fe AVX512 (r)sqrt(double) was mistakenly disabled with clang and others 2019-01-14 17:28:47 +01:00
Gael Guennebaud
ccddeaad90 fix warning 2019-01-14 16:51:16 +01:00
Gael Guennebaud
d4881751d3 Doc: add Isometry in the list of supported Mode of Transform<> 2019-01-14 16:38:26 +01:00
Greg Coombe
9d988a1e1a Initialize isometric transforms like affine transforms.
The isometric transform, like the affine transform, has an implicit last
row of [0, 0, 0, 1]. This was not being properly initialized, as verified
by a new test function.
2019-01-11 23:14:35 -08:00
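The implicit last row mentioned above can be made concrete with a small sketch using plain `std::array` storage. The helper `makeIsometry` and the type aliases are hypothetical names introduced for this illustration; they are not part of Eigen's `Transform` API.

```cpp
#include <array>

using Mat3 = std::array<std::array<double, 3>, 3>;
using Mat4 = std::array<std::array<double, 4>, 4>;
using Vec3 = std::array<double, 3>;

// Hypothetical helper for illustration; not Eigen's API.
// Builds the homogeneous 4x4 matrix [R t; 0 0 0 1] from a rotation R
// and a translation t.
Mat4 makeIsometry(const Mat3& rotation, const Vec3& translation) {
  Mat4 m{};  // value-initialized, so every coefficient starts at zero
  for (int i = 0; i < 3; ++i) {
    for (int j = 0; j < 3; ++j) {
      m[i][j] = rotation[i][j];  // top-left 3x3 block: rotation
    }
    m[i][3] = translation[i];    // last column: translation
  }
  // The last row is written explicitly rather than left to chance.
  m[3] = {0.0, 0.0, 0.0, 1.0};
  return m;
}
```

Leaving the last row uninitialized would go unnoticed as long as only the top three rows are read, which is why a dedicated test on the full matrix is useful.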
Gael Guennebaud
4356a55a61 PR 571: Implements an accurate argument reduction algorithm for huge inputs of sin/cos and calls it instead of falling back to std::sin/std::cos.
This makes both the small and huge argument cases faster because:
- for small inputs this removes the last pselect
- for large inputs only the reduction part follows a scalar path,
the rest uses the same SIMD path as the small-argument case.
2019-01-14 13:54:01 +01:00
Gael Guennebaud
f566724023 Fix StorageIndex FIXME in dense LU solvers 2019-01-13 17:54:30 +01:00