Commit Graph

2904 Commits

Author SHA1 Message Date
Antonio Sanchez
69adf26aa3 Modify googlehash use to account for namespace issues.
The namespace declaration for googlehash is a configurable macro that
can be disabled.  In particular, it is disabled within google, causing
compile errors since `dense_hash_map`/`sparse_hash_map` are then in
the global namespace instead of in `::google`.

Here we play a bit of gynastics to allow for both `google::*_hash_map`
and `*_hash_map`, while limiting namespace polution.  Symbols within
the `::google` namespace are imported into `Eigen::google`.

We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is
fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be
defined.
2021-04-12 19:00:39 -07:00
Rasmus Munk Larsen
a2c0542010 Fix typo in TensorDimensions.h 2021-04-12 18:59:56 +00:00
Rohit Santhanam
dfd6720d82 Fix for float16 GPU unit test. 2021-04-12 10:19:06 +00:00
Jens Wehner
f6fc66aa75 fixed doxygen for unsupported iterative solver module 2021-04-11 16:26:14 +00:00
Rohit Santhanam
2859db0220 This fixes an issue where the compiler was not choosing the GPU specific specialization of ScanLauncher.
The issue was discovered when the GPU scan unit test was run and resulted in a segmentation fault.

The segmantation fault occurred because the unit test allocated GPU memory and passed a pointer to that memory to the computation that it presumed would execute on the GPU.

But because of the issue, the computation was scheduled to execute on the CPU so a situation was constructed where the CPU attempted to access a GPU memory location.

The fix expands the GPU specific ScanLauncher specialization to handle cases where vectorization is enabled.

Previously, the GPU specialization is chosen only if Vectorization is not used.
2021-04-08 15:14:48 +00:00
Steve Bronder
e7b8643d70 Revert "Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()""
This reverts commit 5f0b4a4010.
2021-03-24 18:14:56 +00:00
Jens Wehner
c0a889890f Fixed output of complex matrices 2021-03-15 21:51:55 +00:00
Antonio Sanchez
543e34ab9d Re-implement move assignments.
The original swap approach leads to potential undefined behavior (reading
uninitialized memory) and results in unnecessary copying of data for static
storage.

Here we pass down the move assignment to the underlying storage.  Static
storage does a one-way copy, dynamic storage does a swap.

Modified the tests to no longer read from the moved-from matrix/tensor,
since that can lead to UB. Added a test to ensure we do not access
uninitialized memory in a move.

Fixes: #2119
2021-03-10 16:55:20 +00:00
Antonio Sanchez
2468253c9a Define EIGEN_CPLUSPLUS and replace most __cplusplus checks.
The macro `__cplusplus` is not defined correctly in MSVC unless building
with the the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the
specified c++ standard version number.

Here we introduce `EIGEN_CPLUSPLUS` which will contain the c++ version
number both for MSVC and otherwise.  This simplifies checks for supported
features.

Also replaced most instances of standard version checking via `__cplusplus`
with the existing `EIGEN_COMP_CXXVER` macro for better clarity.

Fixes: #2170
2021-03-05 18:33:18 +00:00
David Tellenbach
5f0b4a4010 Revert "Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size()"
This reverts commit 6cbb3038ac because it
breaks clang-10 builds on x86 and aarch64 when C++11 is enabled.
2021-03-05 13:16:43 +01:00
Steve Bronder
6cbb3038ac Adds EIGEN_CONSTEXPR and EIGEN_NOEXCEPT to rows(), cols(), innerStride(), outerStride(), and size() 2021-03-04 18:58:08 +00:00
Eugene Zhulenev
a6601070f2 Add log2 operation to TensorBase 2021-03-04 00:13:36 +00:00
Christoph Hertzberg
2660d01fa7 Inherit from no_assignment_operator to avoid implicit copy constructor warnings
(cherry picked from commit 9bbb7ea4b54b1f307863be4ed8d105c38cdefe50)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
a3521d743c Fix some enum-enum conversion warnings
(cherry picked from commit 838f3d8ce22a5549ef10c7386fb03040721749a0)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
81b5fe2f0a ReturnByValue is already non-copyable
(cherry picked from commit abbf95045009619f37bd92b45433eedbfcbe41cf)
2021-02-27 18:44:26 +01:00
Christoph Hertzberg
4fb3459a23 Fix double-promotion warnings
(cherry picked from commit c22c103e932e511e96645186831363585a44b7a3)
2021-02-27 18:44:26 +01:00
Jens Wehner
4bfcee47b9 Idrs iterative linear solver 2021-02-27 12:09:33 +00:00
Rasmus Munk Larsen
f284c8592b Don't crash when attempting to slice an empty tensor. 2021-02-24 18:12:51 -08:00
Guoqiang QI
f44197fabd Some improvements for kissfft from Martin Reinecke(pocketfft author):
1.Only computing about half of the factors and use complex conjugate symmetry for the rest instead of all to save time.
2.All twiddles are calculated in double because that gives the maximum achievable precision when doing float transforms.
3.Reducing all angles to the range 0<angle<pi/4 which gives even more precision.
2021-02-24 21:36:47 +00:00
Antonio Sanchez
119763cf38 Eliminate CMake FindPackageHandleStandardArgs warnings.
CMake complains that the package name does not match when the case
differs, e.g.:
```
CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message):
  The package name passed to `find_package_handle_standard_args` (UMFPACK)
  does not match the name of the calling package (Umfpack).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  cmake/FindUmfpack.cmake:50 (find_package_handle_standard_args)
  bench/spbench/CMakeLists.txt:24 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
```
Here we rename the libraries to match their true cases.
2021-02-24 09:52:05 +00:00
Antonio Sanchez
5f9cfb2529 Add missing adolc isinf/isnan.
Also modified cmake/FindAdolc.cmake to eliminate warnings, and added
search paths to match install layout.

Fixed: #2157
2021-02-19 22:26:56 +00:00
frgossen
33e0af0130 Return nan at poles of polygamma, digamma, and zeta if limit is not defined 2021-02-19 16:35:11 +00:00
David Tellenbach
36200b7855 Remove vim specific comments to recognoize correct file-type.
As discussed in #2143 we remove editor specific comments.
2021-02-09 09:13:09 +01:00
Ralf Hannemann-Tamas
984d010b7b add specialization of check_sparse_solving() for SuperLU solver, in order to test adjoint and transpose solves 2021-02-08 22:00:31 +00:00
Antonio Sanchez
3f4684f87d Include <cstdint> in one place, remove custom typedefs
Originating from
[this SO issue](https://stackoverflow.com/questions/65901014/how-to-solve-this-all-error-2-in-this-case),
some win32 compilers define `__int32` as a `long`, but MinGW defines
`std::int32_t` as an `int`, leading to a type conflict.

To avoid this, we remove the custom `typedef` definitions for win32.  The
Tensor module requires C++11 anyways, so we are guaranteed to have
included `<cstdint>` already in `Eigen/Core`.

Also re-arranged the headers to only include `<cstdint>` in one place to
avoid this type of error again.
2021-01-26 14:23:05 -08:00
Gmc2
a4edb1079c fix test of ExtractVolumePatchesOp 2021-01-25 03:23:46 +00:00
David Tellenbach
660c6b857c Remove std::cerr in iterative solver since we don't have iostream.
This fixes #2123
2021-01-21 11:40:05 +01:00
Maozhou, Ge
21a8a2487c fix paddings of TensorVolumePatchOp 2021-01-15 11:51:49 +08:00
Antonio Sanchez
070d303d56 Add CUDA complex sqrt.
This is to support scalar `sqrt` of complex numbers `std::complex<T>` on
device, requested by Tensorflow folks.

Technically `std::complex` is not supported by NVCC on device
(though it is by clang), so the default `sqrt(std::complex<T>)` function only
works on the host. Here we create an overload to add back the
functionality.

Also modified the CMake file to add `--relaxed-constexpr` (or
equivalent) flag for NVCC to allow calling constexpr functions from
device functions, and added support for specifying compute architecture for
NVCC (was already available for clang).
2020-12-22 23:25:23 -08:00
Turing Eret
19e6496ce0 Replace call to FixedDimensions() with a singleton instance of
FixedDimensions.
2020-12-16 07:34:44 -07:00
Turing Eret
bc7d1599fb TensorStorage with FixedDimensions now has zero instance memory overhead.
Removed m_dimension as instance member of TensorStorage with
FixedDimensions and instead use the template parameter. This
means that the sizeof a pure fixed-size storage is exactly
equal to the data it is storing.
2020-12-14 07:19:34 -07:00
Alexander Grund
cf0b5b0344 Remove code checking for CMake < 3.5
As the CMake version is at least 3.5 the code checking for earlier versions can be removed.
2020-12-14 09:57:44 +00:00
Antonio Sanchez
2dbac2f99f Fix bad NEON fp16 check 2020-12-04 13:42:18 -08:00
Antonio Sanchez
e2f21465fe Special function implementations for half/bfloat16 packets.
Current implementations fail to consider half-float packets, only
half-float scalars.  Added specializations for packets on AVX, AVX512 and
NEON.  Added tests to `special_packetmath`.

The current `special_functions` tests would fail for half and bfloat16 due to
lack of precision. The NEON tests also fail with precision issues and
due to different handling of `sqrt(inf)`, so special functions bessel, ndtri
have been disabled.

Tested with AVX, AVX512.
2020-12-04 10:16:29 -08:00
Rasmus Munk Larsen
71c85df4c1 Clean up the Tensor header and get rid of the EIGEN_SLEEP macro. 2020-12-02 11:04:04 -08:00
Bowie Owens
9842366bba Make inclusion of doc sub-directory optional by adjusting options.
Allows exclusion of doc and related targets to help when using eigen via add_subdirectory().

Requested by:

https://gitlab.com/libeigen/eigen/-/issues/1842

Also required making EIGEN_TEST_BUILD_DOCUMENTATION a dependent option on EIGEN_BUILD_DOC. This ensures documentation targets are properly defined when EIGEN_TEST_BUILD_DOCUMENTATION is ON.
2020-11-27 08:11:49 +11:00
Antonio Sanchez
22f67b5958 Fix boolean float conversion and product warnings.
This fixes some gcc warnings such as:
```
Eigen/src/Core/GenericPacketMath.h:655:63: warning: implicit conversion turns floating-point number into bool: 'typename __gnu_cxx::__enable_if<__is_integer<bool>::__value, double>::__type' (aka 'double') to 'bool' [-Wimplicit-conversion-floating-point-to-bool]
    Packet psqrt(const Packet& a) { EIGEN_USING_STD(sqrt); return sqrt(a); }
```

Details:

- Added `scalar_sqrt_op<bool>` (`-Wimplicit-conversion-floating-point-to-bool`).

- Added `scalar_square_op<bool>` and `scalar_cube_op<bool>`
specializations (`-Wint-in-bool-context`)

- Deprecated above specialized ops for bool.

- Modified `cxx11_tensor_block_eval` to specialize generator for
booleans (`-Wint-in-bool-context`) and to use `abs` instead of `square` to
avoid deprecated bool ops.
2020-11-24 20:20:36 +00:00
Antonio Sanchez
a8fdcae55d Fix sparse_extra_3, disable counting temporaries for testing DynamicSparseMatrix.
Multiplication of column-major `DynamicSparseMatrix`es involves three
temporaries:
- two for transposing twice to sort the coefficients
(`ConservativeSparseSparseProduct.h`, L160-161)
- one for a final copy assignment (`SparseAssign.h`, L108)
The latter is avoided in an optimization for `SparseMatrix`.

Since `DynamicSparseMatrix` is deprecated in favor of `SparseMatrix`, it's not
worth the effort to optimize further, so I simply disabled counting
temporaries via a macro.

Note that due to the inclusion of `sparse_product.cpp`, the `sparse_extra`
tests actually re-run all the original `sparse_product` tests as well.

We may want to simply drop the `DynamicSparseMatrix` tests altogether, which
would eliminate the test duplication.

Related to #2048
2020-11-18 23:15:33 +00:00
Antonio Sanchez
17268b155d Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom
The existing `TensorRandom.h` implementation makes the assumption that
`half` (`bfloat16`) has a `uint16_t` member `x` (`value`), which is not
always true. This currently fails on arm64, where `x` has type `__fp16`.
Added `bit_cast` specializations to allow casting to/from `uint16_t`
for both `half` and `bfloat16`.  Also added tests in
`half_float`, `bfloat16_float`, and `cxx11_tensor_random` to catch
these errors in the future.
2020-11-18 20:32:35 +00:00
Antonio Sanchez
3669498f5a Fix rule-of-3 for the Tensor module.
Adds copy constructors to Tensor ops, inherits assignment operators from
`TensorBase`.

Addresses #1863
2020-11-18 18:14:53 +00:00
Antonio Sanchez
852513e7a6 Disable testing of OpenGL by default.
The `OpenGLSupport` module contains mostly deprecated features, and the
test is highly GL context-dependent, relies on deprecated GLUT, and
requires a display.  Until the module is updated to support modern
OpenGL and the test to use newer windowing frameworks (e.g. GLFW)
it's probably best to disable the test by default.

The test can be enabled with `cmake -DEIGEN_TEST_OPENGL=ON`.

See #2053 for more details.
2020-11-12 16:15:40 -08:00
Antonio Sanchez
6961468915 Address issues with openglsupport test.
The existing test fails on several systems due to GL runtime version mismatches,
the use of deprecated features, and memory errors due to improper use of GLUT.
The test was modified to:

- Run within a display function, allowing proper GLUT cleanup.
- Generate dynamic shaders with a supported GLSL version string and output variables.
- Report shader compilation errors.
- Check GL context version before launching version-specific tests.

Note that most of the existing `OpenGLSupport` module and tests rely on deprecated
features (e.g. fixed-function pipeline). The test was modified to allow it to
pass on various systems. We might want to consider removing the module or re-writing
it entirely to support modern OpenGL.  This is beyond the scope of this patch.

Testing of legacy GL (for platforms that support it) can be enabled by defining
`EIGEN_LEGACY_OPENGL`.  Otherwise, the test will try to create a modern context.

Tested on
- MacBook Air (2019), macOS Catalina 10.15.7 (OpenGL 2.1, 4.1)
- Debian 10.6, NVidia Quadro K1200 (OpenGL 3.1, 3.3)
2020-11-11 15:54:43 -08:00
Deven Desai
9d11e2c03e CMakefile update for ROCm 4.0
Starting with ROCm 4.0, the `hipconfig --platform` command will return `amd` (prior return value was `hcc`). Updating the CMakeLists.txt files in the test dirs to account for this change.
2020-10-29 18:06:31 +00:00
mehdi-goli
a725a3233c [SYCL clean up the code] : removing exrta #pragma unroll in SYCL which was causing issues in embeded systems 2020-10-28 08:34:49 +00:00
Rasmus Munk Larsen
274ef12b61 Remove leftover debug print statement in cxx11_tensor_expr.cpp 2020-10-14 22:59:51 +00:00
Rasmus Munk Larsen
61fc78bbda Get rid of nested template specialization in TensorReductionGpu.h, which was broken by c6953f799b. 2020-10-13 23:53:11 +00:00
Rasmus Munk Larsen
c6953f799b Add packet generic ops predux_fmin, predux_fmin_nan, predux_fmax, and predux_fmax_nan that implement reductions with PropagateNaN, and PropagateNumbers semantics. Add (slow) generic implementations for most reductions. 2020-10-13 21:48:31 +00:00
David Tellenbach
8f8d77b516 Add EIGEN prefix for HAS_LGAMMA_R 2020-10-08 18:32:19 +02:00
Eugene Zhulenev
2279f2c62f Use lgamma_r if it is available (update check for glibc 2.19+) 2020-10-08 00:26:45 +00:00
Rasmus Munk Larsen
b431024404 Don't make assumptions about NaN-propagation for pmin/pmax - it various across platforms.
Change test to only test for NaN-propagation for pfmin/pfmax.
2020-10-07 19:05:18 +00:00