eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Antonio Sánchez	185ad0e610	Revert "Avoid integer overflow in EigenMetaKernel indexing". This reverts commit `100d7caf92`.	2021-10-27 14:55:25 +00:00
Rasmus Munk Larsen	68e0d023c0	Remove license column in tables for builtin sparse solvers since all are MPL2 now.	2021-10-26 18:09:22 +00:00
Andreas Krebbel	8faafc3aaa	ZVector: Move alignas qualifier to come first. We currently have plenty of type definitions with the alignment qualifier coming after the type. The compiler warns about ignoring them: int EIGEN_ALIGN16 ai[4]; Turn this into: EIGEN_ALIGN16 int ai[4];	2021-10-26 15:33:47 +02:00
Ben Barsdell	100d7caf92	Avoid integer overflow in EigenMetaKernel indexing - The current implementation computes `size + total_threads`, which can overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to the maximum representable value. - The num_blocks calculation can also overflow due to the implementation of divup(). - This patch prevents these overflows and allows the kernel to work correctly for the full representable range of tensor sizes. - Also adds relevant tests.	2021-10-26 00:04:28 +00:00
Alex Druinsky	d0e3791b1a	Fix vectorized reductions for Eigen::half. Fixes compiler errors in expressions that look like Eigen::Matrix<Eigen::half, 3, 1>::Random().maxCoeff(). The error comes from the code that creates the initial value for vectorized reductions. The fix is to specify the scalar type of the reduction's initial value. The change is necessary for Eigen::half because, unlike other types, Eigen::half scalars cannot be implicitly created from integers.	2021-10-25 14:44:33 -07:00
Maxiwell S. Garcia	99600bd1a6	test: fix boostmultiprec test to compile with older Boost versions. Eigen boostmultiprec test redefines a symbol that is already defined inside Boost Math [1]. Boost has fixed it recently [2], but this patch avoids errors if Boost version was less than 1.77. [1] https://github.com/boostorg/math/blob/boost-1.76.0/include/boost/math/policies/policy.hpp#L18 [2] `6830712302 (diff-c7a8e5911c2e6be4138e1a966d762200f147792ac16ad96fdcc724313d11f839)`	2021-10-25 20:32:33 +00:00
Yann Billeter	6c3206152a	fix(CommaInitializer): pass dims at compile-time	2021-10-25 19:53:38 +00:00
Antonio Sanchez	a500da1dc0	Fix broadcasting oob error. For vectorized 1-dimensional inputs that do not take the special blocking path (e.g. `std::complex<...>`), there was an index-out-of-bounds error causing the broadcast size to be computed incorrectly. Here we fix this, and make other minor cleanup changes. Fixes #2351.	2021-10-25 19:31:12 +00:00
Antonio Sanchez	0578feaabc	Remove const from visitor return type. This seems to interfere with `pload`/`ploadu`, since `pload<const Packet**>` are not defined. This should unbreak the arm/ppc builds.	2021-10-25 19:09:50 +00:00
benardp	b63c096fbb	Extend EIGEN_QT_SUPPORT to Qt6	2021-10-23 23:43:06 +00:00
Lennart Steffen	163f11e24a	Included note on inner stride for compile-time vectors. See https://gitlab.com/libeigen/eigen/-/issues/2355#note_711078126	2021-10-22 09:46:43 +00:00
Nico	b17bcddbca	Fix -Wbitwise-instead-of-logical clang warning. & and | don't short-circuit; && and || do. When both arguments to those are boolean, the short-circuiting version is usually the desired one, so clang warns on this. Here, it is inconsequential, so switch to && and || to suppress the warning.	2021-10-21 23:32:45 -04:00
Rasmus Munk Larsen	2d3fec8ff6	Add nan-propagation options to matrix and array plugins.	2021-10-21 19:40:11 +00:00
Antonio Sanchez	b86e013321	Revert bit_cast to use memcpy for CUDA. To elide the memcpy, we need to first load the `src` value into registers by making a local copy. This avoids the need to resort to potential UB by using `reinterpret_cast`. This change doesn't seem to affect CPU (at least not with gcc/clang). With optimizations on, the copy is also elided.	2021-10-21 08:14:11 -07:00
Antonio Sanchez	45e67a6fda	Use reinterpret_cast on GPU for bit_cast. This seems to be the recommended approach for doing type punning in CUDA. See for example - https://stackoverflow.com/questions/47037104/cuda-type-punning-memcpy-vs-ub-union - https://developer.nvidia.com/blog/faster-parallel-reductions-kepler/ (the latter puns a double to an `int2`). The issue is that for CUDA, the `memcpy` is not elided, and ends up being an expensive operation. We already have similar `reinterpret_cast`s across the Eigen codebase for GPU (as does TensorFlow).	2021-10-20 21:34:40 +00:00
Antonio Sanchez	24ebb37f38	Disable Tree reduction for GPU. For moderately sized inputs, running the Tree reduction quickly fills/overflows the GPU thread stack space, leading to memory errors. This was happening in the `cxx11_tensor_complex_gpu` test, for example. Disabling tree reduction on GPU fixes this.	2021-10-20 20:42:37 +00:00
Rasmus Munk Larsen	360290fc42	Improve accuracy of full tensor reduction for half and bfloat16 by reducing leaf size in tree reduction. Add more unit tests for summation accuracy.	2021-10-20 19:54:06 +00:00
Antonio Sanchez	95bb645e92	Fix MSVC+NVCC EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR compilation. Looks like we need to update the `EIGEN_INHERIT_ASSIGNMENT_EQUAL_OPERATOR` for newer versions of MSVC as well when compiling with NVCC. Fixes build issues for VS 2017.	2021-10-20 19:38:14 +00:00
Antonio Sanchez	fd5f48e465	Fix tuple compilation for VS2017. VS2017 doesn't like deducing alias types, leading to a bunch of compile errors for functions involving the `tuple` alias. Replacing with `TupleImpl` seems to solve this, allowing the test to compile/pass.	2021-10-20 19:18:34 +00:00
Antonio Sanchez	d0d34524a1	Move CUDA/Complex.h to GPU/Complex.h, remove TensorReductionCuda.h. The `Complex.h` file applies equally to HIP/CUDA, so placing under the generic `GPU` folder. The `TensorReductionCuda.h` has already been deprecated, now removing for the next Eigen version.	2021-10-20 12:00:19 -07:00
Rasmus Munk Larsen	f2c9c2d2f7	Vectorize Visitor.h.	2021-10-20 16:58:01 +00:00
Antonio Sanchez	2bf07fa5b5	Fix Windows CMake compiler/OS detection. Replaced deprecated `DetermineVSServicePack` macro with recommended `CMAKE_CXX_COMPILER_VERSION`. Deleted custom `OSVersion` detection. The windows-specific code is highly outdated, and on other systems simply returns `CMAKE_SYSTEM`. We will get values like `windows-10.0.17763`, but this is preferable to `unknownwin`, and saves us needing to maintain a separate cmake file.	2021-10-02 16:30:38 +00:00
Rasmus Munk Larsen	1d75fab368	Speed up tensor reduction	2021-10-02 14:58:23 +00:00
Antonio Sanchez	be9e7d205f	Reduce tensor_contract_gpu test. The original test times out after 60 minutes on Windows, even when setting flags to optimize for speed. Reducing the number of contractions performed from 3600 to 27 for subtests 8 and 9 allows the two to run in just over a minute each.	2021-10-02 04:36:15 +00:00
Antonio Sanchez	701f5d1c91	Fix gpu special function tests. Some checks used incorrect values, partly from copy-paste errors, partly from the change in behaviour introduced in !398. Modified results to match scipy, simplified tests by updating `VERIFY_IS_CWISE_APPROX` to work for scalars.	2021-10-01 10:20:50 -07:00
Antonio Sanchez	f0f1d7938b	Disable testing of complex compound assignment operators for MSVC. MSVC does not support specializing compound assignments for `std::complex`, since it already specializes them (contrary to the standard). Trying to use one of these on device will currently lead to a duplicate definition error. This is still probably preferable to no error though. If we remove the definitions for MSVC, then it will compile, but the kernel will fail silently. The only proper solution would be to define our own custom `Complex` type.	2021-09-27 15:15:11 -07:00
Kolja Brix	51a0b4e2d2	Reorganize test main file	2021-09-27 18:30:47 +00:00
Antonio Sanchez	21640612be	Disable more CUDA warnings. For cuda 9.2 and 11.4, they changed the numbers again. Fixes #2331.	2021-09-24 21:31:14 -07:00
Antonio Sanchez	de218b471d	Add -arch=<arch> argument for nvcc. Without this flag, when compiling with nvcc, if the compute architecture of a card does not exactly match any of those listed for `-gencode arch=compute_<arch>,code=sm_<arch>`, then the kernel will fail to run with: ``` cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device. ``` This can happen, for example, when compiling with an older cuda version that does not support a newer architecture (e.g. T4 is `sm_75`, but cuda 9.2 only supports up to `sm_70`). With the `-arch=<arch>` flag, the code will compile and run at the supplied architecture.	2021-09-24 20:48:01 -07:00
Antonio Sanchez	846d34384a	Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS. Also add a missing space for clang.	2021-09-24 20:15:55 -07:00
Antonio Sanchez	7b00e8b186	Clean up CUDA CMake files. - Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt - Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed to the cuda compiler (nvcc or clang). The latter is to support passing custom flags (e.g. `-arch=` to nvcc, or to disable cuda-specific warnings).	2021-09-24 14:43:59 -07:00
Antonio Sanchez	e9e90892fe	Disable another device warning	2021-09-23 13:43:18 -07:00
Antonio Sanchez	86c0decc48	Disable more NVCC warnings. The 2979 warning is yet another "calling a `__host__` function from a `__host__ __device__` function". Although we probably should eventually address these, they are flooding the logs. Most of these are harmless since we only call the original from the host. In cases where these are actually called from device, an error is generated instead anyways. The 2977 warning is a bit strange - although the warning suggests the `__device__` annotation is ignored, this doesn't actually seem to be the case. Without the `__device__` declarations, the kernel actually fails to run when attempting to construct such objects. Again, these warnings are flooding the logs, so disabling for now.	2021-09-23 10:52:39 -07:00
Kolja Brix	afa616bc9e	Fix some typos found	2021-09-23 15:22:00 +00:00
Antonio Sanchez	76bb29c0c2	Add -mfma for AVX512DQ tests.	2021-09-22 14:06:29 -07:00
sciencewhiz	4b6036e276	fix various typos	2021-09-22 16:15:06 +00:00
Antonio Sanchez	3753e6a2b3	Add AVX512 test job to CI.	2021-09-21 15:11:31 -07:00
Antonio Sanchez	343847273d	Enable AVX512 testing.	2021-09-21 15:00:36 -07:00
Alexander Grund	b5eaa42695	Fix alias violation in BFloat16. reinterpret_cast between unrelated types is undefined behavior and leads to misoptimizations on some platforms. Use the safer (and faster) version via bit_cast.	2021-09-20 10:37:50 +02:00
Alexander Karatarakis	4d622be118	[AutodiffScalar] Remove const when returning by value. clang-tidy: Return type 'const T' is 'const'-qualified at the top level, which may reduce code readability without improving const correctness. The types are somewhat long, but the affected return types are of the form: ``` const T my_func() { // } ``` Change to: ``` T my_func() { // } ```	2021-09-18 21:23:32 +00:00
Antonio Sanchez	f49217e52b	Fix implicit conversion warnings in tuple_test. Fixes #2329.	2021-09-17 19:40:22 -07:00
Rasmus Munk Larsen	5595cfd194	Run CI tests in parallel on available cores.	2021-09-17 22:35:22 +00:00
Antonio Sanchez	3c724c44cf	Fix strict aliasing bug causing product_small failure. Packet loading is skipped due to aliasing violation, leading to nullopt matrix multiplication. Fixes #2327.	2021-09-17 21:09:34 +00:00
Antonio Sanchez	9882aec279	Silence string overflow warning for GCC in initializer_list_construction test. This looks to be a GCC bug. It doesn't seem to reproduce in a smaller example, making it hard to isolate.	2021-09-17 18:33:50 +00:00
Rasmus Munk Larsen	5dac69ff0b	Added a macro to pass arguments to ctest, e.g. to run tests in parallel.	2021-09-17 18:33:12 +00:00
Antonio Sanchez	5dac0b53c9	Move Eigen::all,last,lastp1,lastN to Eigen::placeholders::. These names are so common, IMO they should not exist directly in the `Eigen::` namespace. This prevents us from using the `last` or `all` names for any parameters or local variables, otherwise spitting out warnings about shadowing or hiding the global values. Many external projects (and our own examples) also heavily use ``` using namespace Eigen; ``` which means these conflict with external libraries as well, e.g. `std::fill(first,last,value)`. It seems originally these were placed in a separate namespace `Eigen::placeholders`, which has since been deprecated. I propose to un-deprecate this, and restore the original locations. These symbols are also imported into `Eigen::indexing`, which additionally imports `fix` and `seq`. An alternative is to remove the `placeholders` namespace and stick with `indexing`. NOTE: this is an API-breaking change. Fixes #2321.	2021-09-17 10:21:42 -07:00
Rohit Santhanam	44da7a3b9d	Disable specific subtests that fail on HIP due to non-functional device side malloc/free (on HIP).	2021-09-17 16:19:03 +00:00
Antonio Sanchez	16f9a20a6f	Add buildtests_gpu and check_gpu to simplify GPU testing. This is in preparation of adding GPU tests to the CI, allowing us to limit building/testing of GPU-specific tests for a given GPU-capable runner. GPU tests are tagged with the label "gpu". The new targets ``` make buildtests_gpu make check_gpu ``` allow building and running only the gpu tests.	2021-09-17 00:48:57 +00:00
Rasmus Munk Larsen	1239adfcab	Remove -fabi-version=6 flag from AVX512 builds. It was added to fix builds with gcc 4.9, but these don't even work today, and the flag breaks compilation with newer versions of gcc.	2021-09-16 16:16:47 -07:00
Rasmus Munk Larsen	6cadab6896	Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert.	2021-09-16 20:43:54 +00:00

11599 Commits