Commit Graph

11638 Commits

Author SHA1 Message Date
David Tellenbach
08da52eb85 Remove DenseBase::nonZeros() which just calls DenseBase::size()
Fixes #2382.
2021-11-27 14:31:00 +00:00
Ali Can Demiralp
96e537d6fd Add EIGEN_DEVICE_FUNC to DenseBase::hasNaN() and DenseBase::allFinite(). 2021-11-27 11:27:52 +00:00
Erik Schultheis
b8b6566f0f Currently, the binding of LLT to Lapacke is done using a large macro. This factors out a large part of the macro's functionality and implements it explicitly. 2021-11-25 16:11:25 +00:00
Erik Schultheis
ec4efbd696 remove EIGEN_HAS_CXX11 2021-11-24 20:08:49 +00:00
Rasmus Munk Larsen
cfdb3ce3f0 Fix warnings about shadowing definitions. 2021-11-23 14:34:47 -08:00
Rasmus Munk Larsen
5e89573e2a Implement Eigen::array<...>::reverse_iterator if std::reverse_iterator exists. 2021-11-20 00:22:46 +00:00
Rasmus Munk Larsen
5137a5157a Make numeric_limits members constexpr as per the newer C++ standards.
Author: majnemer@google.com
2021-11-19 15:58:36 +00:00
Erik Schultheis
7e586635ba don't use deprecated MappedSparseMatrix 2021-11-19 15:58:04 +00:00
Rasmus Munk Larsen
11cb7b8372 Add basic iterator support for Eigen::array to ease transition to std::array in third-party libraries. 2021-11-19 05:14:30 +00:00
Antonio Sanchez
c107bd6102 Fix errors for windows build. 2021-11-19 04:23:25 +00:00
Erik Schultheis
b0fb5417d3 Fixed Sparse-Sparse Product in case of mixed StorageIndex types 2021-11-18 18:33:31 +00:00
Rasmus Munk Larsen
96aeffb013 Make the new TensorIO implementation work with TensorMap with const elements. 2021-11-17 18:16:04 -08:00
Rasmus Munk Larsen
824d06eb36 Include <numeric> to get std::iota. 2021-11-18 00:47:18 +00:00
Pablo Speciale
d04edff570 Update Umeyama.h: src_var is only used when with_scaling == true. Therefore, the actual computation can be avoided when with_scaling == false. 2021-11-16 17:58:22 +00:00
Antonio Sanchez
ffb78e23a1 Fix tensor broadcast off-by-one error.
Caught by JAX unit tests.  Triggered if broadcast is smaller than packet
size.
2021-11-16 17:37:38 +00:00
cpp977
f73c95c032 Reimplemented the Tensor stream output. 2021-11-16 17:36:58 +00:00
Rasmus Munk Larsen
2b9297196c Update Transform.h to make transform_construct_from_matrix and transform_take_affine_part callable from device code. Fixes #2377. 2021-11-16 00:58:30 +00:00
Erik Schultheis
ca9c848679 use consistent StorageIndex types in SparseMatrix::Map
and `SparseMatrix::TransposedSparseMatrix`
2021-11-15 22:18:26 +00:00
Erik Schultheis
13954c4440 moved pruning code to SparseVector.h 2021-11-15 22:16:01 +00:00
Nathan Luehr
da79095923 Convert diag pragmas to nv_diag. 2021-11-15 03:42:42 +00:00
Erik Schultheis
532cc73f39 fix a typo 2021-11-13 13:11:06 +02:00
jenswehner
675b72e44b added clang format 2021-11-09 23:49:01 +01:00
Ben Barsdell
50df8d3d6d Avoid integer overflow in EigenMetaKernel indexing
- The current implementation computes `size + total_threads`, which can
  overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to
  the maximum representable value.
- The num_blocks calculation can also overflow due to the implementation
  of divup().
- This patch prevents these overflows and allows the kernel to work
  correctly for the full representable range of tensor sizes.
- Also adds relevant tests.
2021-11-05 16:39:37 +11:00
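The two overflow sources named in this commit message can be illustrated outside of CUDA. The following is a minimal sketch in plain C++, not the code from the patch: `divup_no_overflow` and `strided_iterations` are hypothetical names, and the functions assume a non-negative `x`, `first` and `size` and a positive `y` and `step`. The idea is to rearrange the arithmetic so that neither `x + y - 1` nor `i + step` is ever formed near the maximum representable value.

```cpp
#include <cstdint>
#include <limits>

// Ceiling division that never forms x + y - 1, so it cannot overflow
// for non-negative x and positive y. (Hypothetical helper, not Eigen's divup.)
template <typename Index>
constexpr Index divup_no_overflow(Index x, Index y) {
  return x == 0 ? Index(0) : Index((x - 1) / y + 1);
}

// Counts the indices first, first + step, first + 2 * step, ... that are
// smaller than size, i.e. the iterations one thread of a strided loop runs.
// The loop compares the remaining distance (size - i) with step instead of
// computing i + step first, so the index never exceeds size.
template <typename Index>
Index strided_iterations(Index first, Index size, Index step) {
  if (first >= size) return 0;
  Index visited = 0;
  Index i = first;
  for (;;) {
    ++visited;                       // index i is in range, process it
    if (size - i <= step) break;     // i + step would reach or pass size
    i += step;
  }
  return visited;
}
```

With a 32-bit index, `divup_no_overflow(INT32_MAX, 2)` stays in range, whereas the textbook form `(x + y - 1) / y` would overflow in the addition.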
Rasmus Munk Larsen
55e3ae02ac Compare summation results against forward error bound. 2021-11-04 18:04:04 -07:00
Gengxin Xie
5c642950a5 Bug fix: EIGEN_HAS_FP16_C was never defined
when the compiler isn't clang
2021-11-04 22:13:01 +00:00
Gilad
0d73440fb2 Documentation of Quaternion constructor from MatrixBase 2021-11-04 16:21:26 +00:00
Minh Quan HO
4284c68fbb nestbyvalue test: fix uninitialized matrix
- The test was doing computation with an uninitialized matrix (possibly zeroed, thanks to Linux), or
worse, NaN on non-Linux systems.
- This commit fixes it by initializing to Random().
2021-11-04 14:32:12 +01:00
Xinle Liu
478a1bdda6 Fix total deflation issue in BDCSVD, when & only when M is already diagonal. 2021-11-02 16:53:55 +00:00
Antonio Sanchez
8f8c2ba2fe Remove bad "take" impl that causes g++-11 crash.
For some reason, having `take<n, numeric_list<T>>` for `n > 0` causes
g++-11 to ICE with
```
sorry, unimplemented: unexpected AST of kind nontype_argument_pack
```
It does work with other versions of gcc, and with clang.
I filed a GCC bug
[here](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102999).

Technically we should never actually run into this case, since you
can't take n > 0 elements from an empty list.  Commenting it out
allows our Eigen tests to pass.
2021-11-01 17:04:41 +00:00
Antonio Sanchez
f6c8cc0e99 Fix TensorReduction warnings and error bound for sum accuracy test.
The sum accuracy test currently uses the default test precision for
the given scalar type.  However, scalars are generated via a normal
distribution, and given a large enough count and strong enough random
generator, the expected sum is zero.  This causes the test to
periodically fail.

Here we estimate an upper-bound for the error as `sqrt(N) * prec` for
summing N values, with each having an approximate epsilon of `prec`.

Also fixed a few warnings generated by MSVC when compiling the
reduction test.
2021-10-30 14:59:00 -07:00
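The bound described in this commit message can be written down directly. This is a minimal sketch, not the test code from the repository: `sum_error_bound` and `sum_within_bound` are hypothetical helper names. It uses an absolute comparison because, as the message notes, the expected sum is close to zero, which makes a relative comparison unreliable.

```cpp
#include <cmath>
#include <cstddef>

// Upper-bound estimate from the commit message: sqrt(N) * prec for a sum of
// N values, each carrying an approximate error of prec.
inline double sum_error_bound(std::size_t n, double prec) {
  return std::sqrt(static_cast<double>(n)) * prec;
}

// Hypothetical check: accept the computed sum if its absolute distance to
// the reference sum does not exceed the estimated bound.
inline bool sum_within_bound(double computed, double reference,
                             std::size_t n, double prec) {
  return std::abs(computed - reference) <= sum_error_bound(n, prec);
}
```

For example, with N = 10000 and prec = 1e-5 the bound is about 1e-3, so a computed sum of 1e-4 against a reference of 0 passes even though the relative error is unbounded.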
Rasmus Munk Larsen
b3bea43a2d Don't use unrolled loops for stateful reducers. The problem is the combination step, e.g.
reducer0.reducePacket(accum1, accum0);
reducer0.reducePacket(accum2, accum0);
reducer0.reducePacket(accum3, accum0);

For the mean reducer this will increment the count as well as adding together the accumulators and result in the wrong count being divided into the sum at the end.
2021-10-28 23:52:54 +00:00
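The miscount described in this commit message can be reproduced with a deliberately simplified model. The sketch below is an assumption-laden illustration, not Eigen's reducer: `MeanReducer`, `mean_single_accumulator` and `mean_unrolled_buggy` are hypothetical names, and scalars stand in for packets. The point is that a reducer whose `reduce` call also updates internal state cannot be reused for the step that merges partial accumulators.

```cpp
#include <cstddef>
#include <vector>

// Simplified stateful reducer: every reduce() call adds to the accumulator
// and increments the element count used by finalize().
struct MeanReducer {
  std::size_t count = 0;
  void reduce(double value, double* accum) {
    *accum += value;
    ++count;
  }
  double finalize(double accum) const { return accum / static_cast<double>(count); }
};

// Reference: one accumulator, one reduce() call per element.
inline double mean_single_accumulator(const std::vector<double>& v) {
  MeanReducer r;
  double acc = 0.0;
  for (double x : v) r.reduce(x, &acc);
  return r.finalize(acc);
}

// Unrolled by two with a buggy combination step: merging acc1 into acc0 via
// reduce() adds the partial sum correctly but also bumps the count by one.
inline double mean_unrolled_buggy(const std::vector<double>& v) {
  MeanReducer r;
  double acc0 = 0.0, acc1 = 0.0;
  std::size_t i = 0;
  for (; i + 1 < v.size(); i += 2) {
    r.reduce(v[i], &acc0);
    r.reduce(v[i + 1], &acc1);
  }
  if (i < v.size()) r.reduce(v[i], &acc0);  // leftover element for odd sizes
  r.reduce(acc1, &acc0);                    // combination step: count is now off by one
  return r.finalize(acc0);
}
```

For the input {1, 2, 3, 4} the single-accumulator version divides 10 by 4, while the unrolled version divides 10 by 5.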
Chip Kerchner
9cf34ee0ae Invert rows and depth in non-vectorized portion of packing (PowerPC). 2021-10-28 21:59:41 +00:00
Ilya Tokar
e1cb6369b0 Add AVX vector path to float2half/half2float
Makes e.g. matrix multiplication 2x faster:
name         old cpu/op  new cpu/op  delta
BM_convers   181ms ± 1%    62ms ± 9%  -65.82%  (p=0.016 n=4+5)

Tested on all possible input values (not adding tests, since they
take a long time).
2021-10-28 13:59:01 -04:00
Antonio Sanchez
03d4cbb307 Fix min/max nan-propagation for scalar "other".
Copied input type from `EIGEN_MAKE_CWISE_BINARY_OP`.

Fixes #2362.
2021-10-28 09:28:29 -07:00
Antonio Sanchez
e559701981 Fix compile issue for gcc 4.8 2021-10-28 08:23:19 -07:00
Fabian Keßler
19cacd3ecb optimize cmake scripts for subproject use 2021-10-28 16:08:02 +02:00
Rohit Santhanam
48e40b22bf Preliminary HIP bfloat16 GPU support. 2021-10-27 18:36:45 +00:00
Antonio Sanchez
40bbe8a4d0 Fix ZVector build.
Cross-compiled via `s390x-linux-gnu-g++`, run via qemu.  This allows the
packetmath tests to pass.
2021-10-27 16:30:15 +00:00
Alex Druinsky
6bb6a6bf53 Vectorize fp16 tanh and logistic functions on Neon
Activates vectorization of the Eigen::half versions of the tanh and
logistic functions when they run on Neon. Both functions convert their
inputs to float before computing the output, and as a result of this
commit, the conversions and the computation in float are vectorized.
2021-10-27 16:09:16 +00:00
Antonio Sánchez
185ad0e610 Revert "Avoid integer overflow in EigenMetaKernel indexing"
This reverts commit 100d7caf92
2021-10-27 14:55:25 +00:00
Rasmus Munk Larsen
68e0d023c0 Remove license column in tables for builtin sparse solvers since all are MPL2 now. 2021-10-26 18:09:22 +00:00
Andreas Krebbel
8faafc3aaa ZVector: Move alignas qualifier to come first
We currently have plenty of type definitions with the alignment
qualifier coming after the type.  The compiler warns about ignoring
them:
int EIGEN_ALIGN16 ai[4];

Turn this into:
EIGEN_ALIGN16 int ai[4];
2021-10-26 15:33:47 +02:00
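The placement rule from this commit message can be checked with standard C++ alone. The sketch below assumes that `EIGEN_ALIGN16` behaves like `alignas(16)`, which is an assumption for illustration, and `aligned_ints` and `is_aligned_to` are hypothetical names. It declares an array with the specifier first, as the commit recommends, and verifies the resulting address at run time.

```cpp
#include <cstdint>

// Specifier first: it applies to the variable being declared.
// (Assumption: EIGEN_ALIGN16 is equivalent to alignas(16) in this context.)
alignas(16) static int aligned_ints[4] = {0, 1, 2, 3};

// Returns true if the address p is a multiple of alignment.
inline bool is_aligned_to(const void* p, std::uintptr_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Calling `is_aligned_to(aligned_ints, 16)` should return true for the declaration above. With the specifier placed after the type, the commit message reports that the compiler warns and ignores it.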
Ben Barsdell
100d7caf92 Avoid integer overflow in EigenMetaKernel indexing
- The current implementation computes `size + total_threads`, which can
  overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to
  the maximum representable value.
- The num_blocks calculation can also overflow due to the implementation
  of divup().
- This patch prevents these overflows and allows the kernel to work
  correctly for the full representable range of tensor sizes.
- Also adds relevant tests.
2021-10-26 00:04:28 +00:00
Alex Druinsky
d0e3791b1a Fix vectorized reductions for Eigen::half
Fixes compiler errors in expressions that look like

  Eigen::Matrix<Eigen::half, 3, 1>::Random().maxCoeff()

The error comes from the code that creates the initial value for
vectorized reductions. The fix is to specify the scalar type of the
reduction's initial value.

The change is necessary for Eigen::half because unlike other types,
Eigen::half scalars cannot be implicitly created from integers.
2021-10-25 14:44:33 -07:00
Maxiwell S. Garcia
99600bd1a6 test: fix boostmultiprec test to compile with older Boost versions
Eigen boostmultiprec test redefines a symbol that is already defined
inside Boost Math [1]. Boost has fixed it recently [2], but this
patch avoids errors if Boost version was less than 1.77.

[1] https://github.com/boostorg/math/blob/boost-1.76.0/include/boost/math/policies/policy.hpp#L18
[2] 6830712302 (diff-c7a8e5911c2e6be4138e1a966d762200f147792ac16ad96fdcc724313d11f839)
2021-10-25 20:32:33 +00:00
Yann Billeter
6c3206152a fix(CommaInitializer): pass dims at compile-time 2021-10-25 19:53:38 +00:00
Antonio Sanchez
a500da1dc0 Fix broadcasting oob error.
For vectorized 1-dimensional inputs that do not take the special
blocking path (e.g. `std::complex<...>`), there was an
index-out-of-bounds error causing the broadcast size to be
computed incorrectly.  Here we fix this, and make other minor
cleanup changes.

Fixes #2351.
2021-10-25 19:31:12 +00:00
Antonio Sanchez
0578feaabc Remove const from visitor return type.
This seems to interfere with `pload`/`ploadu`, since `pload<const
Packet**>` are not defined.

This should unbreak the arm/ppc builds.
2021-10-25 19:09:50 +00:00
benardp
b63c096fbb Extend EIGEN_QT_SUPPORT to Qt6 2021-10-23 23:43:06 +00:00
Lennart Steffen
163f11e24a Included note on inner stride for compile-time vectors. See https://gitlab.com/libeigen/eigen/-/issues/2355#note_711078126 2021-10-22 09:46:43 +00:00