eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Jakub Lichman	8877f8d9b2	ptranspose test for non-square kernels added	2021-05-19 08:26:45 +00:00
Guoqiang QI	3e006bfd31	Ensure all generated matrices for inverse_4x4 tests are invertible. This fixes #2248.	2021-05-13 15:03:30 +00:00
Nathan Luehr	972cf0c28a	Fix calls to device functions from host code	2021-05-11 22:47:49 +00:00
Nathan Luehr	7e6a1c129c	Device implementation of log for std::complex types.	2021-05-11 22:02:21 +00:00
Nathan Luehr	6753f0f197	Fix ambiguity due to argument dependent lookup.	2021-05-11 15:41:11 -05:00
guoqiangqi	3d9051ea84	Changing the storage of the SSE complex packets to that of the wrapper. This should fix #2242.	2021-05-10 23:53:16 +00:00
Rohit Santhanam	39ec31c0ad	Fix for issue where numext::imag and numext::real are used before they are defined.	2021-05-10 19:48:32 +00:00
Antonio Sanchez	c0eb5f89a4	Restore ABI compatibility for conj with 3.3, fix conflict with boost. The boost library unfortunately specializes `conj` for various types and assumes the original two-template-parameter version. This change restores the second parameter. This also restores ABI compatibility. The specialization for `std::complex` is because `std::conj` is not a device function. For custom complex scalar types, users should provide their own `conj` implementation. We may consider removing the unnecessary second parameter in the future - but this will require modifying boost as well. Fixes #2112.	2021-05-07 18:14:00 +00:00
Antonio Sanchez	0eba8a1fe3	Clean up gpu device properties. Made a class and singleton to encapsulate initialization and retrieval of device properties. Related to !481, which already changed the API to address a static linkage issue.	2021-05-07 17:51:29 +00:00
Antonio Sanchez	90e9a33e1c	Fix numext::arg return type. The cxx11 path for `numext::arg` incorrectly returned the complex type instead of the real type, leading to compile errors. Fixed this and added tests. Related to !477, which uncovered the issue.	2021-05-07 16:26:57 +00:00
Christoph Hertzberg	722ca0b665	Revert addition of unused `paddsub<Packet2cf>`. This fixes #2242	2021-05-06 18:36:47 +02:00
Antonio Sanchez	e3b7f59659	Simplify TensorRandom and remove time-dependence. Time-dependence prevents tests from being repeatable. This has long been an issue with debugging the tensor tests. Removing this will allow future tests to be repeatable in the usual way. Also, the recently added macros in !476 are causing headaches across different platforms. For example, checking `_XOPEN_SOURCE` is leading to multiple ambiguous macro errors across Google, and `_DEFAULT_SOURCE`/`_SVID_SOURCE`/`_BSD_SOURCE` are sometimes defined with values, sometimes defined as empty, and sometimes not defined at all when they probably should be. This is leading to multiple build breakages. The simplest approach is to generate a seed via `Eigen::internal::random<uint64_t>()` if on CPU. For GPU, we use a hash based on the current thread ID (since `rand()` isn't supported on GPU). Fixes #1602.	2021-05-04 13:34:49 -07:00
Antonio Sanchez	1c013be2cc	Better CUDA complex division. The original produced NaNs when dividing 0/b for subnormal b. The `complex_divide_stable` was changed to use the more common Smith's algorithm.	2021-04-29 17:39:58 +00:00
Antonio Sanchez	172db7bfc3	Add missing pcmp_lt_or_nan for NEON Packet4bf.	2021-04-27 14:12:11 -07:00
Theo Fletcher	2ced0cc233	Added complex matrix unit tests for SelfAdjointEigenSolver	2021-04-26 19:00:51 +00:00
Jakub Lichman	d87648a6be	Tests added and AVX512 bug fixed for pcmp_lt_or_nan	2021-04-25 20:58:56 +00:00
Jakub Lichman	1115f5462e	Tests for pcmp_lt and pcmp_le added	2021-04-23 19:51:43 +00:00
Turing Eret	3804ca0d90	Fix for issue with static global variables in TensorDeviceGpu.h m_deviceProperties and m_devicePropInitialized are defined as global statics which will define multiple copies which can cause issues if initializeDeviceProp() is called in one translation unit and then m_deviceProperties is used in a different translation unit. Added inline functions getDeviceProperties() and getDevicePropInitialized() which defines those variables as static locals. As per the C++ standard 7.1.2/4, a static local declared in an inline function always refers to the same object, so this should be safer. Credit to Sun Chenggen for this fix. This fixes issue #1475.	2021-04-23 07:43:35 -06:00
Antonio Sanchez	045c0609b5	Check existence of BSD random before use. `TensorRandom` currently relies on BSD `random()`, which is not always available. The [linux manpage](https://man7.org/linux/man-pages/man3/srandom.3.html) gives the glibc condition: ``` _XOPEN_SOURCE >= 500 || /* Glibc since 2.19: */ _DEFAULT_SOURCE || /* Glibc <= 2.19: */ _SVID_SOURCE || _BSD_SOURCE ``` In particular, this was failing to compile for MinGW via msys2. If not available, we fall back to using `rand()`.	2021-04-22 20:42:12 +00:00
Antonio Sanchez	d213a0bcea	DenseStorage safely copy/swap. Fixes #2229. For dynamic matrices with fixed-sized storage, only copy/swap elements that have been set. Otherwise, this leads to inefficient copying, and potential UB for non-initialized elements.	2021-04-22 18:45:19 +00:00
Rasmus Munk Larsen	85a76a16ea	Make vectorized compute_inverse_size4 compile with AVX.	2021-04-22 15:21:01 +00:00
Jakub Lichman	d72c794ccd	Compilation of basicbenchmark fixed	2021-04-21 06:53:32 +00:00
Chip-Kerchner	06c2760bd1	Fix taking address of rvalue compiler issue with TensorFlow (plus other warnings).	2021-04-21 00:47:13 +00:00
Jakub Lichman	2b1dfd1ba0	HasExp added for AVX512 Packet8d	2021-04-20 19:07:58 +00:00
Antonio Sanchez	1d79c68ba0	Fix ldexp for AVX512 (#2215) Wrong shuffle was used. Need to interleave low/high halves with a `permute` instruction. Fixes #2215.	2021-04-20 16:25:22 +00:00
David Tellenbach	3e819d83bf	Before 3.4 branch	2021-04-18 23:36:14 +02:00
Antonio Sanchez	69adf26aa3	Modify googlehash use to account for namespace issues. The namespace declaration for googlehash is a configurable macro that can be disabled. In particular, it is disabled within google, causing compile errors since `dense_hash_map`/`sparse_hash_map` are then in the global namespace instead of in `::google`. Here we play a bit of gymnastics to allow for both `google::_hash_map` and `_hash_map`, while limiting namespace pollution. Symbols within the `::google` namespace are imported into `Eigen::google`. We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be defined.	2021-04-12 19:00:39 -07:00
Christoph Hertzberg	9357feedc7	Avoid using uninitialized inputs and if available, use slightly more efficient `movsd` instruction for `pset1<Packet2cf>`.	2021-04-13 01:36:59 +02:00
Rasmus Munk Larsen	a2c0542010	Fix typo in TensorDimensions.h	2021-04-12 18:59:56 +00:00
Rohit Santhanam	dfd6720d82	Fix for float16 GPU unit test.	2021-04-12 10:19:06 +00:00
Christoph Hertzberg	1e1c8a735c	Use EIGEN_HAS_CXX11 and EIGEN_COMP_CXXVER macros to detect C++ version for `std::result_of` and `std::invoke_result`. Fixes #2209	2021-04-12 01:26:15 +00:00
Jens Wehner	f6fc66aa75	fixed doxygen for unsupported iterative solver module	2021-04-11 16:26:14 +00:00
Christoph Hertzberg	d58678069c	Make iterators default constructible and assignable, by making...	2021-04-09 17:03:28 +00:00
Rohit Santhanam	2859db0220	This fixes an issue where the compiler was not choosing the GPU specific specialization of ScanLauncher. The issue was discovered when the GPU scan unit test was run and resulted in a segmentation fault. The segmentation fault occurred because the unit test allocated GPU memory and passed a pointer to that memory to the computation that it presumed would execute on the GPU. But because of the issue, the computation was scheduled to execute on the CPU so a situation was constructed where the CPU attempted to access a GPU memory location. The fix expands the GPU specific ScanLauncher specialization to handle cases where vectorization is enabled. Previously, the GPU specialization was chosen only if vectorization was not used.	2021-04-08 15:14:48 +00:00
Antonio Sanchez	fcb5106c6e	Scaled epsilon the wrong way. Should have been 0.5 to widen the bounds, since this is inverse precision. Setting to 0.5, however, leads to many more failing tests at Google, so reverting to 1 for now.	2021-04-07 15:08:39 -07:00
Christoph Hertzberg	6197ce1a35	Replace `-2147483648` by `-0.0f` or `-0.0` constants (this should fix #2189). Also, remove unnecessary `pgather` operations.	2021-04-07 11:25:27 +00:00
Rasmus Munk Larsen	22edb46823	Align local arrays to Packet boundary.	2021-04-06 16:22:36 +00:00
Antonio Sanchez	ace7f132ed	Fix clang tidy warnings in AnnoyingScalar. Clang-tidy complains that full specializations in headers can cause ODR violations. Marked these as `inline` to fix. It also complains about renaming arguments in specializations. Set the argument names to match.	2021-04-05 12:49:38 -07:00
Antonio Sanchez	90187a33e1	Fix SelfAdjointEigenSolver (#2191) Adjust the relaxation step to use the condition ``` abs(subdiag[i]) <= epsilon * sqrt(abs(diag[i]) + abs(diag[i+1])) ``` for setting the subdiagonal entry to zero. Also adjust Wilkinson shift for small `e = subdiag[end-1]` - I couldn't find a reference for the original, and it was not consistent with the Wilkinson definition. Fixes #2191.	2021-04-05 11:19:09 -07:00
Rasmus Munk Larsen	3ddc0974ce	Fix two bugs in commit	2021-04-02 22:06:27 +00:00
Chip Kerchner	c24bee6120	Fix address of temporary object errors in clang11. This fixes the problem with taking the address of temporary objects which clang11 treats as errors.	2021-04-02 16:27:08 +00:00
David Tellenbach	e4233b6e3d	Add CI infrastructure for pre-merge smoke tests. This patch adds pre-merge smoke tests for x86 Linux using gcc-10 and clang-10. Closes #2188.	2021-04-01 00:08:37 +00:00
David Tellenbach	ae95b74af9	Add CMake infrastructure for smoke testing Necessary CMake changes to implement pre-merge smoke tests running via CI.	2021-03-31 22:09:00 +00:00
Rasmus Munk Larsen	5bbc9cea93	Add an info() method to the SVDBase class to make it possible to tell the user that the computation failed, possibly due to invalid input. Make Jacobi and divide-and-conquer fail fast and return info() == InvalidInput if the matrix contains NaN or +/-Inf.	2021-03-31 21:09:19 +00:00
Guoqiang QI	b5a926a0f6	Add GitLab templates for issues and merge requests This patch adds GitLab templates for bug reports, feature and merge requests. This closes #2117.	2021-03-31 16:01:12 +00:00
Antonio Sanchez	78ee3d6261	Fix CUDA constexpr issues for numeric_limits. Some CUDA/HIP constants fail on device with `constexpr` since they internally rely on non-constexpr functions, e.g. ``` #define CUDART_INF_F __int_as_float(0x7f800000) ``` This fails for cuda-clang (though passes with nvcc). These constants are currently used by `device::numeric_limits`. For portability, we need to remove `constexpr` from the affected functions. For C++11 or higher, we should be able to rely on the `std::numeric_limits` versions anyways, since the methods themselves are now `constexpr`, so should be supported on device (clang/hipcc natively, nvcc with `--expt-relaxed-constexpr`).	2021-03-30 18:01:27 +00:00
Antonio Sanchez	af1247fbc1	Use Index type in loop over coefficients. Previously was `int`. Brought up by Kyle Snow (Polaris Geospatial Services) on the mailing list.	2021-03-29 17:40:55 +00:00
Antonio Sanchez	87729ea39f	Eliminate `round_impl` double-promotion warnings for c++03.	2021-03-25 16:52:19 +00:00
Deven Desai	748489ef9c	Un-defining EIGEN_HAS_CONSTEXPR on the HIP platform The Eigen unit-tests started failing on the HIP/ROCm platform, after the following commit `e7b8643d70` ``` In file included from /home/rocm-user/eigen/test/main.h:360: In file included from /home/rocm-user/eigen/Eigen/QR:11: In file included from /home/rocm-user/eigen/Eigen/Core:162: /home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:300:17: error: constexpr function never produces a constant expression [-Winvalid-constexpr] static float (max)() { ^ /home/rocm-user/eigen/Eigen/src/Core/util/Meta.h:304:12: note: non-constexpr function '__int_as_float' cannot be used in a constant expression return HIPRT_MAX_NORMAL_F; ^ /home/rocm-user/eigen/Eigen/src/Core/arch/HIP/hcc/math_constants.h:14:28: note: expanded from macro 'HIPRT_MAX_NORMAL_F' #define HIPRT_MAX_NORMAL_F __int_as_float(0x7f7fffff) ^ /opt/rocm/hip/include/hip/hcc_detail/device_functions.h:913:32: note: declared here __device__ static inline float __int_as_float(int x) { ^ ``` The problem seems to be that some of the constants defined in the HIP `math_constants.h` have a call to `__int_as_float` routine which is not declared `constexpr` in the HIP runtime header file. Working around this issue for now, by skipping the constexpr support (enabled via the above commit) on HIP	2021-03-25 13:45:52 +00:00
Chip Kerchner	d59ef212e1	Fixed performance issues for complex VSX and P10 MMA in gebp_kernel (level 3).	2021-03-25 11:08:19 +00:00
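
Commit 1c013be2cc in the list above says `complex_divide_stable` was changed to use Smith's algorithm. As a rough illustration of that textbook algorithm (not the code from the commit), here is a minimal host-side sketch. The function name `smith_divide` and the use of `std::complex<double>` are assumptions made for this example; the actual change targets CUDA device code. The idea is to divide through by the larger component of the divisor, so the intermediate `c*c + d*d` of the naive formula, which can underflow to zero for subnormal divisors, is never formed.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Hypothetical helper for illustration only; this is not an Eigen function.
// Computes (a + bi) / (c + di) with Smith's algorithm.
std::complex<double> smith_divide(const std::complex<double>& x,
                                  const std::complex<double>& y) {
  const double a = x.real(), b = x.imag();
  const double c = y.real(), d = y.imag();
  // This sketch does not handle division by exact zero.
  assert(c != 0.0 || d != 0.0);
  if (std::abs(c) >= std::abs(d)) {
    // |c| is the larger component: scale by r = d / c, so |r| <= 1.
    const double r = d / c;
    const double den = c + d * r;
    return {(a + b * r) / den, (b - a * r) / den};
  }
  // |d| is the larger component: scale by r = c / d, so |r| < 1.
  const double r = c / d;
  const double den = c * r + d;
  return {(a * r + b) / den, (b * r - a) / den};
}
```

With a zero numerator and a subnormal real divisor such as `1e-310`, this sketch returns zero, whereas the naive formula would compute `0 / 0`.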


11612 Commits