There were some typos that checked `EIGEN_HAS_CXX14` that should have
checked `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`, causing a mismatch
in some of the `Eigen::fix<N>` assumptions.
Also fixed the `symbolic_index` test when
`EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is 0.
Fixes#2308
Removed all configurations that explicitly test or set the c++ standard
flags. The only place the standard is now configured is at the top of
the main `CMakeLists.txt` file, which can easily be updated (e.g. if
we decide to move to c++14+). This can also be set via command-line using
```
> cmake -DCMAKE_CXX_STANDARD 14
```
Kept the `EIGEN_TEST_CXX11` flag for now - that still controls whether to
build/run the `cxx11_*` tests. We will likely end up renaming these
tests and removing the `CXX11` subfolder.
Manually constructing an unaligned object declared as aligned
invokes UB, so we cannot technically check for alignment from
within the constructor. Newer versions of clang optimize away
this check.
Removing the affected tests.
The `memset` function and bitwise manipulation only apply to POD types
that do not require initialization, otherwise resulting in UB. We currently
violate this in `ptrue` and `pzero`, we assume bitmasks for `pselect`, and
bitwise operations are applied byte-by-byte in the generic implementations.
This is causing issues for scalar types that do require initialization
or that contain non-POD info such as pointers (#2201). We either break
them, or force specializations of these functions for custom scalars,
even if they are not vectorized.
Here we modify these functions for scalars only - instead using only
scalar operations:
- `pzero`: `Scalar(0)` for all scalars.
- `ptrue`: `Scalar(1)` for non-trivial scalars, bitset to one bits for trivial scalars.
- `pselect`: ternary select comparing mask to `Scalar(0)` for all scalars
- `pand`, `por`, `pxor`, `pnot`: use operators `&`, `|`, `^`, `~` for all integer or non-trivial scalars, otherwise apply bytewise.
For non-scalar types, the original implementations are used to maintain
compatibility and minimize the number of changes.
Fixes#2201.
Since `std::equal_to::operator()` is not a device function, it
fails on GPU. On my device, I seem to get a silent crash in the
kernel (no reported error, but the kernel does not complete).
Replacing this with a portable version enables comparisons on device.
Addresses #2292 - would need to be cherry-picked. The 3.3 branch
also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get
fully working.
For custom scalars, zero is not necessarily represented by
a zeroed-out memory block (e.g. gnu MPFR). We therefore
cannot rely on `memset` if we want to fill a matrix or tensor
with zeroes. Instead, we should rely on `fill`, which for trivial
types does end up getting converted to a `memset` under-the-hood
(at least with gcc/clang).
Requires adding a `fill(begin, end, v)` to `TensorDevice`.
Replaced all potentially bad instances of memset with fill.
Fixes#2245.
For empty or single-column matrices, the current `PartialPivLU`
currently dereferences a `nullptr` or accesses memory out-of-bounds.
Here we adjust the checks to avoid this.
When calling conservativeResize() on a matrix with DontAlign flag, the
temporary variable used to perform the resize should have the same
Options as the original matrix to ensure that the correct override of
swap is called (i.e. PlainObjectBase::swap(DenseBase<OtherDerived> &
other). Calling the base class swap (i.e in DenseBase) results in
assertions errors or memory corruption.
The cxx11 path for `numext::arg` incorrectly returned the complex type
instead of the real type, leading to compile errors. Fixed this and
added tests.
Related to !477, which uncovered the issue.
Fixes#2229.
For dynamic matrices with fixed-sized storage, only copy/swap
elements that have been set. Otherwise, this leads to inefficient
copying, and potential UB for non-initialized elements.
The namespace declaration for googlehash is a configurable macro that
can be disabled. In particular, it is disabled within google, causing
compile errors since `dense_hash_map`/`sparse_hash_map` are then in
the global namespace instead of in `::google`.
Here we play a bit of gynastics to allow for both `google::*_hash_map`
and `*_hash_map`, while limiting namespace polution. Symbols within
the `::google` namespace are imported into `Eigen::google`.
We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is
fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be
defined.
Clang-tidy complains that full specializations in headers can cause ODR
violations. Marked these as `inline` to fix.
It also complains about renaming arguments in specializations. Set the
argument names to match.
Some CUDA/HIP constants fail on device with `constexpr` since they
internally rely on non-constexpr functions, e.g.
```
\#define CUDART_INF_F __int_as_float(0x7f800000)
```
This fails for cuda-clang (though passes with nvcc). These constants are
currently used by `device::numeric_limits`. For portability, we
need to remove `constexpr` from the affected functions.
For C++11 or higher, we should be able to rely on the `std::numeric_limits`
versions anyways, since the methods themselves are now `constexpr`, so
should be supported on device (clang/hipcc natively, nvcc with
`--expr-relaxed-constexpr`).
`g_called` is not used in subtest 7, so was generating a
`-Wunneeded-internal-declaration` warnings. Here we silence
it by initializing the static variable.
The original fails with nvcc+msvc - there's a static order of initialization
issue leading to registered tests being cleared. The test then fails on
```
VERIFY(EigenTest::all().size()>0);
```
since `EigenTest` no longer contains any tests. The singleton pattern
fixes this.
Replace usage of `std::numeric_limits<...>::min/max_exponent` in
codebase where possible. Also replaced some other `numeric_limits`
usages in affected tests with the `NumTraits` equivalent.
The previous MR !443 failed for c++03 due to lack of `constexpr`.
Because of this, we need to keep around the `std::numeric_limits`
version in enum expressions until the switch to c++11.
Fixes#2148
Replace usage of `std::numeric_limits<...>::min/max_exponent` in
codebase. Also replaced some other `numeric_limits` usages in
affected tests with the `NumTraits` equivalent.
Fixes#2148
NVCC does not understand `__forceinline`, so we need to use `inline`
when compiling for GPU.
ICC specializes `std::complex` operators for `float` and `double`
by default, which cannot be used on device and conflict with Eigen's
workaround in CUDA/Complex.h. This can be prevented by defining
`_OVERRIDE_COMPLEX_SPECIALIZATION_` before including `<complex>`.
Added this define to the tests and to `Eigen/Core`, but this will
not work if the user includes `<complex>` before `<Eigen/Core>`.
ICC also seems to generate a duplicate `Map` symbol in
`PlainObjectBase`:
```
error: "Map" has already been declared in the current scope
static ConstMapType Map(const Scalar *data)
```
I tracked this down to `friend class Eigen::Map`. Putting the `friend`
statements at the bottom of the class seems to resolve this issue.
Fixes#2180
The original swap approach leads to potential undefined behavior (reading
uninitialized memory) and results in unnecessary copying of data for static
storage.
Here we pass down the move assignment to the underlying storage. Static
storage does a one-way copy, dynamic storage does a swap.
Modified the tests to no longer read from the moved-from matrix/tensor,
since that can lead to UB. Added a test to ensure we do not access
uninitialized memory in a move.
Fixes: #2119
The macro `__cplusplus` is not defined correctly in MSVC unless building
with the the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the
specified c++ standard version number.
Here we introduce `EIGEN_CPLUSPLUS` which will contain the c++ version
number both for MSVC and otherwise. This simplifies checks for supported
features.
Also replaced most instances of standard version checking via `__cplusplus`
with the existing `EIGEN_COMP_CXXVER` macro for better clarity.
Fixes: #2170