Commit Graph

6666 Commits

Author SHA1 Message Date
Erik Schultheis
d60f7fa518 Improved lapacke binding code for HouseholderQR and PartialPivLU 2021-12-02 00:10:58 +00:00
Erik Schultheis
ec2fd0f7ed Require recent GCC and MSCV and removed EIGEN_HAS_CXX14 and some other feature test macros 2021-12-01 00:48:34 +00:00
Rasmus Munk Larsen
085c2fc5d5 Revert "Update SVD Module to allow specifying computation options with a... 2021-11-30 18:45:54 +00:00
Erik Schultheis
4dd126c630 fixed cholesky with 0 sized matrix (cf. #785) 2021-11-30 17:17:41 +00:00
Rohit Santhanam
4d3e50036f Fix for HIP compilation breakage in selfAdjoint and triangular view classes. 2021-11-30 14:00:59 +00:00
Erik Schultheis
63abb35dfd SFINAE'ing away non-const overloads if selfAdjoint/triangular view is not referring to an lvalue 2021-11-29 22:51:26 +00:00
Jakub Gałecki
1b8dce564a bugfix: issue #2375 2021-11-29 22:26:15 +00:00
Francesco Mazzoli
eb85b97339 Select AVX2 even if the data size is not a multiple of 8 2021-11-29 21:13:24 +00:00
Arthur
eef33946b7 Update SVD Module to allow specifying computation options with a template parameter. Resolves #2051 2021-11-29 20:50:46 +00:00
Erik Schultheis
f33a31b823 removed EIGEN_HAS_CXX11_* and redundant EIGEN_COMP_CXXVER checks 2021-11-29 19:18:57 +00:00
Rohit Santhanam
9d3ffb3fbf Fix for HIP compilation failure in DenseBase. 2021-11-28 15:59:30 +00:00
David Tellenbach
08da52eb85 Remove DenseBase::nonZeros() which just calls DenseBase::size()
Fixes #2382.
2021-11-27 14:31:00 +00:00
Ali Can Demiralp
96e537d6fd Add EIGEN_DEVICE_FUNC to DenseBase::hasNaN() and DenseBase::allFinite(). 2021-11-27 11:27:52 +00:00
Erik Schultheis
b8b6566f0f Currently, the binding of LLT to Lapacke is done using a large macro. This factors out a large part of the functionality of the macro and implement them explicitly. 2021-11-25 16:11:25 +00:00
Erik Schultheis
ec4efbd696 remove EIGEN_HAS_CXX11 2021-11-24 20:08:49 +00:00
Rasmus Munk Larsen
5137a5157a Make numeric_limits members constexpr as per the newer C++ standards.
Author: majnemer@google.com
2021-11-19 15:58:36 +00:00
Erik Schultheis
7e586635ba don't use deprecated MappedSparseMatrix 2021-11-19 15:58:04 +00:00
Erik Schultheis
b0fb5417d3 Fixed Sparse-Sparse Product in case of mixed StorageIndex types 2021-11-18 18:33:31 +00:00
Pablo Speciale
d04edff570 Update Umeyama.h: src_var is only used when with_scaling == true. Therefore, the actual computation can be avoided when with_scaling == false. 2021-11-16 17:58:22 +00:00
Rasmus Munk Larsen
2b9297196c Update Transform.h to make transform_construct_from_matrix and transform_take_affine_part callable from device code. Fixes #2377. 2021-11-16 00:58:30 +00:00
Erik Schultheis
ca9c848679 use consistent StorageIndex types in SparseMatrix::Map
and `SparseMatrix::TransposedSparseMatrix`
2021-11-15 22:18:26 +00:00
Erik Schultheis
13954c4440 moved pruning code to SparseVector.h 2021-11-15 22:16:01 +00:00
Nathan Luehr
da79095923 Convert diag pragmas to nv_diag. 2021-11-15 03:42:42 +00:00
Erik Schultheis
532cc73f39 fix a typo 2021-11-13 13:11:06 +02:00
Gengxin Xie
5c642950a5 Bug Fix: correct the bug that won't define EIGEN_HAS_FP16_C
if the compiler isn't clang
2021-11-04 22:13:01 +00:00
Gilad
0d73440fb2 Documentation of Quaternion constructor from MatrixBase 2021-11-04 16:21:26 +00:00
Xinle Liu
478a1bdda6 Fix total deflation issue in BDCSVD, when & only when M is already diagonal. 2021-11-02 16:53:55 +00:00
Chip Kerchner
9cf34ee0ae Invert rows and depth in non-vectorized portion of packing (PowerPC). 2021-10-28 21:59:41 +00:00
Ilya Tokar
e1cb6369b0 Add AVX vector path to float2half/half2float
Makes e. g. matrix multiplication 2x faster:
name         old cpu/op  new cpu/op  delta
BM_convers   181ms ± 1%    62ms ± 9%  -65.82%  (p=0.016 n=4+5)

Tested on all possible input values (not adding tests, since they
take a long time).
2021-10-28 13:59:01 -04:00
Antonio Sanchez
03d4cbb307 Fix min/max nan-propagation for scalar "other".
Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`.

Fixes #2362.
2021-10-28 09:28:29 -07:00
Antonio Sanchez
e559701981 Fix compile issue for gcc 4.8 2021-10-28 08:23:19 -07:00
Rohit Santhanam
48e40b22bf Preliminary HIP bfloat16 GPU support. 2021-10-27 18:36:45 +00:00
Antonio Sanchez
40bbe8a4d0 Fix ZVector build.
Cross-compiled via `s390x-linux-gnu-g++`, run via qemu.  This allows the
packetmath tests to pass.
2021-10-27 16:30:15 +00:00
Alex Druinsky
6bb6a6bf53 Vectorize fp16 tanh and logistic functions on Neon
Activates vectorization of the Eigen::half versions of the tanh and
logistic functions when they run on Neon. Both functions convert their
inputs to float before computing the output, and as a result of this
commit, the conversions and the computation in float are vectorized.
2021-10-27 16:09:16 +00:00
Andreas Krebbel
8faafc3aaa ZVector: Move alignas qualifier to come first
We currently have plenty of type definitions with the alignment
qualifier coming after the type.  The compiler warns about ignoring
them:
int EIGEN_ALIGN16 ai[4];

Turn this into:
EIGEN_ALIGN16 int ai[4];
2021-10-26 15:33:47 +02:00
Alex Druinsky
d0e3791b1a Fix vectorized reductions for Eigen::half
Fixes compiler errors in expressions that look like

  Eigen::Matrix<Eigen::half, 3, 1>::Random().maxCoeff()

The error comes from the code that creates the initial value for
vectorized reductions. The fix is to specify the scalar type of the
reduction's initial value.

The cahnge is necessary for Eigen::half because unlike other types,
Eigen::half scalars cannot be implicitly created from integers.
2021-10-25 14:44:33 -07:00
Yann Billeter
6c3206152a fix(CommaInitializer): pass dims at compile-time 2021-10-25 19:53:38 +00:00
Antonio Sanchez
0578feaabc Remove const from visitor return type.
This seems to interfere with `pload`/`ploadu`, since `pload<const
Packet**>` are not defined.

This should unbreak the arm/ppc builds.
2021-10-25 19:09:50 +00:00
benardp
b63c096fbb Extend EIGEN_QT_SUPPORT to Qt6 2021-10-23 23:43:06 +00:00
Lennart Steffen
163f11e24a Included note on inner stride for compile-time vectors. See https://gitlab.com/libeigen/eigen/-/issues/2355#note_711078126 2021-10-22 09:46:43 +00:00
Rasmus Munk Larsen
2d3fec8ff6 Add nan-propagation options to matrix and array plugins. 2021-10-21 19:40:11 +00:00
Antonio Sanchez
b86e013321 Revert bit_cast to use memcpy for CUDA.
To elide the memcpy, we need to first load the `src` value into
registers by making a local copy. This avoids the need to resort
to potential UB by using `reinterpret_cast`.

This change doesn't seem to affect CPU (at least not with gcc/clang).
With optimizations on, the copy is also elided.
2021-10-21 08:14:11 -07:00
Antonio Sanchez
45e67a6fda Use reinterpret_cast on GPU for bit_cast.
This seems to be the recommended approach for doing type punning in
CUDA. See for example
- https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union
- https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/
(the latter puns a double to an `int2`).
The issue is that for CUDA, the `memcpy` is not elided, and ends up
being an expensive operation.  We already have similar `reintepret_cast`s across
the Eigen codebase for GPU (as does TensorFlow).
2021-10-20 21:34:40 +00:00
Antonio Sanchez
95bb645e92 Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation.
Looks like we need to update the
`EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` for newer versions of MSVC as
well when compiling with NVCC.  Fixes build issues for VS 2017.
2021-10-20 19:38:14 +00:00
Antonio Sanchez
fd5f48e465 Fix tuple compilation for VS2017.
VS2017 doesn't like deducing alias types, leading to a bunch of compile
errors for functions involving the `tuple` alias.  Replacing with
`TupleImpl` seems to solve this, allowing the test to compile/pass.
2021-10-20 19:18:34 +00:00
Antonio Sanchez
d0d34524a1 Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h
The `Complex.h` file applies equally to HIP/CUDA, so placing under the
generic `GPU` folder.

The `TensorReductionCuda.h` has already been deprecated, now removing
for the next Eigen version.
2021-10-20 12:00:19 -07:00
Rasmus Munk Larsen
f2c9c2d2f7 Vectorize Visitor.h. 2021-10-20 16:58:01 +00:00
Antonio Sanchez
f0f1d7938b Disable testing of complex compound assignment operators for MSVC.
MSVC does not support specializing compound assignments for
`std::complex`, since it already specializes them (contrary to the
standard).

Trying to use one of these on device will currently lead to a
duplicate definition error.  This is still probably preferable
to no error though.  If we remove the definitions for MSVC, then
it will compile, but the kernel will fail silently.

The only proper solution would be to define our own custom `Complex`
type.
2021-09-27 15:15:11 -07:00
Antonio Sanchez
21640612be Disable more CUDA warnings.
For cuda 9.2 and 11.4, they changed the numbers again.

Fixes #2331.
2021-09-24 21:31:14 -07:00
Antonio Sanchez
e9e90892fe Disable another device warning 2021-09-23 13:43:18 -07:00