eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Antonio Sanchez	f6c8cc0e99	Fix TensorReduction warnings and error bound for sum accuracy test. The sum accuracy test currently uses the default test precision for the given scalar type. However, scalars are generated via a normal distribution, and given a large enough count and strong enough random generator, the expected sum is zero. This causes the test to periodically fail. Here we estimate an upper-bound for the error as `sqrt(N) * prec` for summing N values, with each having an approximate epsilon of `prec`. Also fixed a few warnings generated by MSVC when compiling the reduction test.	2021-10-30 14:59:00 -07:00
Rohit Santhanam	48e40b22bf	Preliminary HIP bfloat16 GPU support.	2021-10-27 18:36:45 +00:00
Antonio Sánchez	185ad0e610	Revert "Avoid integer overflow in EigenMetaKernel indexing" This reverts commit `100d7caf92`	2021-10-27 14:55:25 +00:00
Ben Barsdell	100d7caf92	Avoid integer overflow in EigenMetaKernel indexing - The current implementation computes `size + total_threads`, which can overflow and cause CUDA_ERROR_ILLEGAL_ADDRESS when size is close to the maximum representable value. - The num_blocks calculation can also overflow due to the implementation of divup(). - This patch prevents these overflows and allows the kernel to work correctly for the full representable range of tensor sizes. - Also adds relevant tests.	2021-10-26 00:04:28 +00:00
Antonio Sanchez	a500da1dc0	Fix broadcasting oob error. For vectorized 1-dimensional inputs that do not take the special blocking path (e.g. `std::complex<...>`), there was an index-out-of-bounds error causing the broadcast size to be computed incorrectly. Here we fix this, and make other minor cleanup changes. Fixes #2351.	2021-10-25 19:31:12 +00:00
Rasmus Munk Larsen	360290fc42	Improve accuracy of full tensor reduction for half and bfloat16 by reducing leaf size in tree reduction. Add more unit tests for summation accuracy.	2021-10-20 19:54:06 +00:00
Antonio Sanchez	be9e7d205f	Reduce tensor_contract_gpu test. The original test times out after 60 minutes on Windows, even when setting flags to optimize for speed. Reducing the number of contractions performed from 3600->27 for subtests 8,9 allow the two to run in just over a minute each.	2021-10-02 04:36:15 +00:00
Antonio Sanchez	701f5d1c91	Fix gpu special function tests. Some checks used incorrect values, partly from copy-paste errors, partly from the change in behaviour introduced in !398. Modified results to match scipy, simplified tests by updating `VERIFY_IS_CWISE_APPROX` to work for scalars.	2021-10-01 10:20:50 -07:00
Antonio Sanchez	de218b471d	Add -arch=<arch> argument for nvcc. Without this flag, when compiling with nvcc, if the compute architecture of a card does not exactly match any of those listed for `-gencode arch=compute_<arch>,code=sm_<arch>`, then the kernel will fail to run with: ``` cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device. ``` This can happen, for example, when compiling with an older cuda version that does not support a newer architecture (e.g. T4 is `sm_75`, but cuda 9.2 only supports up to `sm_70`). With the `-arch=<arch>` flag, the code will compile and run at the supplied architecture.	2021-09-24 20:48:01 -07:00
Antonio Sanchez	846d34384a	Rename EIGEN_CUDA_FLAGS to EIGEN_CUDA_CXX_FLAGS Also add a missing space for clang.	2021-09-24 20:15:55 -07:00
Antonio Sanchez	7b00e8b186	Clean up CUDA CMake files. - Unify test/CMakeLists.txt and unsupported/test/CMakeLists.txt - Added `EIGEN_CUDA_FLAGS` that are appended to the set of flags passed to the cuda compiler (nvcc or clang). The latter is to support passing custom flags (e.g. `-arch=` to nvcc, or to disable cuda-specific warnings).	2021-09-24 14:43:59 -07:00
Kolja Brix	afa616bc9e	Fix some typos found	2021-09-23 15:22:00 +00:00
Rasmus Munk Larsen	6cadab6896	Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert.	2021-09-16 20:43:54 +00:00
Antonio Sanchez	eea2a3385c	Remove more DynamicSparseMatrix references. Also fixed some typos in SparseExtra/MarketIO.h.	2021-09-02 15:36:47 -07:00
Jens Wehner	8286073c73	Matrixmarket extension	2021-09-02 17:23:33 +00:00
Antonio Sanchez	74da2e6821	Rename Tuple -> Pair. This is to make way for a new `Tuple` class that mimics `std::tuple`, but can be reliably used on device and with aligned Eigen types. The existing Tuple has very few references, and is actually an analogue of `std::pair`.	2021-09-02 02:20:54 +00:00
Maxiwell S. Garcia	09fc0f97b5	Rename 'vec_all_nan' of cxx11_tensor_expr test because this symbol is used by altivec.h	2021-09-01 16:42:51 +00:00
Jens Wehner	53ad9c75b4	included unordered_map header	2021-08-27 16:53:28 +00:00
jenswehner	9abf4d0bec	made RandomSetter C++11 compatible	2021-08-25 20:24:55 +00:00
Antonio Sanchez	eeacbd26c8	Bump CMake files to at least c++11. Removed all configurations that explicitly test or set the c++ standard flags. The only place the standard is now configured is at the top of the main `CMakeLists.txt` file, which can easily be updated (e.g. if we decide to move to c++14+). This can also be set via command-line using ``` > cmake -DCMAKE_CXX_STANDARD 14 ``` Kept the `EIGEN_TEST_CXX11` flag for now - that still controls whether to build/run the `cxx11_*` tests. We will likely end up renaming these tests and removing the `CXX11` subfolder.	2021-08-25 20:07:48 +00:00
jenswehner	d85de1ef56	removed sparse dynamic matrix	2021-08-24 10:33:00 +02:00
jenswehner	e3e74001f7	added includes for unordered_map	2021-08-10 13:34:57 +02:00
Alexander Karatarakis	4ba872bd75	Avoid leading underscore followed by cap in template identifiers	2021-08-04 22:41:52 +00:00
Antonio Sanchez	31f796ebef	Fix MPReal detection and support. The latest version of `mpreal` has a bug that breaks `min`/`max`. It also breaks with the latest dev version of `mpfr`. Here we add `FindMPREAL.cmake` which searches for the library and tests if compilation works. Removed our internal copy of `mpreal.h` under `unsupported/test`, as it is out-of-sync with the latest, and similarly breaks with the latest `mpfr`. It would be best to use the installed version of `mpreal` anyways, since that's what we actually want to test. Fixes #2282.	2021-08-03 17:55:03 +00:00
Antonio Sanchez	8cf6cb27ba	Fix TriSycl CMake files. This is to enable compiling with the latest trisycl. `FindTriSYCL.cmake` was broken by commit `00f32752`, which modified `add_sycl_to_target` for ComputeCPP. This makes the corresponding modifications for trisycl to make them consistent. Also, trisycl now requires c++17.	2021-08-03 16:44:29 +00:00
Antonio Sanchez	9c22795d65	Put attach/detach buffer back in for TensorDeviceSycl. Also added a test to verify the original buffer is updated correctly.	2021-07-09 10:00:05 -07:00
Antonio Sanchez	1e6c6c1576	Replace memset with fill to work for non-trivial scalars. For custom scalars, zero is not necessarily represented by a zeroed-out memory block (e.g. gnu MPFR). We therefore cannot rely on `memset` if we want to fill a matrix or tensor with zeroes. Instead, we should rely on `fill`, which for trivial types does end up getting converted to a `memset` under-the-hood (at least with gcc/clang). Requires adding a `fill(begin, end, v)` to `TensorDevice`. Replaced all potentially bad instances of memset with fill. Fixes #2245.	2021-07-08 18:34:41 +00:00
Jonas Harsch	aab747021b	Don't crash when attempting to shuffle an empty tensor.	2021-07-02 20:33:52 +00:00
Rohit Santhanam	c8d40a7bf1	Removed dead code from GPU float16 unit test.	2021-05-28 20:06:48 +00:00
Antonio Sanchez	69adf26aa3	Modify googlehash use to account for namespace issues. The namespace declaration for googlehash is a configurable macro that can be disabled. In particular, it is disabled within google, causing compile errors since `dense_hash_map`/`sparse_hash_map` are then in the global namespace instead of in `::google`. Here we play a bit of gynastics to allow for both `google::_hash_map` and `_hash_map`, while limiting namespace polution. Symbols within the `::google` namespace are imported into `Eigen::google`. We also remove checks based on `_SPARSE_HASH_MAP_H_`, as this is fragile, and instead require `EIGEN_GOOGLEHASH_SUPPORT` to be defined.	2021-04-12 19:00:39 -07:00
Rohit Santhanam	dfd6720d82	Fix for float16 GPU unit test.	2021-04-12 10:19:06 +00:00
Jens Wehner	c0a889890f	Fixed output of complex matrices	2021-03-15 21:51:55 +00:00
Antonio Sanchez	543e34ab9d	Re-implement move assignments. The original swap approach leads to potential undefined behavior (reading uninitialized memory) and results in unnecessary copying of data for static storage. Here we pass down the move assignment to the underlying storage. Static storage does a one-way copy, dynamic storage does a swap. Modified the tests to no longer read from the moved-from matrix/tensor, since that can lead to UB. Added a test to ensure we do not access uninitialized memory in a move. Fixes: #2119	2021-03-10 16:55:20 +00:00
Antonio Sanchez	2468253c9a	Define EIGEN_CPLUSPLUS and replace most __cplusplus checks. The macro `__cplusplus` is not defined correctly in MSVC unless building with the the `/Zc:__cplusplus` flag. Instead, it defines `_MSVC_LANG` to the specified c++ standard version number. Here we introduce `EIGEN_CPLUSPLUS` which will contain the c++ version number both for MSVC and otherwise. This simplifies checks for supported features. Also replaced most instances of standard version checking via `__cplusplus` with the existing `EIGEN_COMP_CXXVER` macro for better clarity. Fixes: #2170	2021-03-05 18:33:18 +00:00
Jens Wehner	4bfcee47b9	Idrs iterative linear solver	2021-02-27 12:09:33 +00:00
Rasmus Munk Larsen	f284c8592b	Don't crash when attempting to slice an empty tensor.	2021-02-24 18:12:51 -08:00
Antonio Sanchez	119763cf38	Eliminate CMake FindPackageHandleStandardArgs warnings. CMake complains that the package name does not match when the case differs, e.g.: ``` CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message): The package name passed to `find_package_handle_standard_args` (UMFPACK) does not match the name of the calling package (Umfpack). This can lead to problems in calling code that expects `find_package` result variables (e.g., `_FOUND`) to follow a certain pattern. Call Stack (most recent call first): cmake/FindUmfpack.cmake:50 (find_package_handle_standard_args) bench/spbench/CMakeLists.txt:24 (find_package) This warning is for project developers. Use -Wno-dev to suppress it. ``` Here we rename the libraries to match their true cases.	2021-02-24 09:52:05 +00:00
Antonio Sanchez	5f9cfb2529	Add missing adolc isinf/isnan. Also modified cmake/FindAdolc.cmake to eliminate warnings, and added search paths to match install layout. Fixed: #2157	2021-02-19 22:26:56 +00:00
frgossen	33e0af0130	Return nan at poles of polygamma, digamma, and zeta if limit is not defined	2021-02-19 16:35:11 +00:00
Gmc2	a4edb1079c	fix test of ExtractVolumePatchesOp	2021-01-25 03:23:46 +00:00
Alexander Grund	cf0b5b0344	Remove code checking for CMake < 3.5 As the CMake version is at least 3.5 the code checking for earlier versions can be removed.	2020-12-14 09:57:44 +00:00
Antonio Sanchez	e2f21465fe	Special function implementations for half/bfloat16 packets. Current implementations fail to consider half-float packets, only half-float scalars. Added specializations for packets on AVX, AVX512 and NEON. Added tests to `special_packetmath`. The current `special_functions` tests would fail for half and bfloat16 due to lack of precision. The NEON tests also fail with precision issues and due to different handling of `sqrt(inf)`, so special functions bessel, ndtri have been disabled. Tested with AVX, AVX512.	2020-12-04 10:16:29 -08:00
Rasmus Munk Larsen	71c85df4c1	Clean up the Tensor header and get rid of the EIGEN_SLEEP macro.	2020-12-02 11:04:04 -08:00
Antonio Sanchez	22f67b5958	Fix boolean float conversion and product warnings. This fixes some gcc warnings such as: ``` Eigen/src/Core/GenericPacketMath.h:655:63: warning: implicit conversion turns floating-point number into bool: 'typename __gnu_cxx::__enable_if<__is_integer<bool>::__value, double>::__type' (aka 'double') to 'bool' [-Wimplicit-conversion-floating-point-to-bool] Packet psqrt(const Packet& a) { EIGEN_USING_STD(sqrt); return sqrt(a); } ``` Details: - Added `scalar_sqrt_op<bool>` (`-Wimplicit-conversion-floating-point-to-bool`). - Added `scalar_square_op<bool>` and `scalar_cube_op<bool>` specializations (`-Wint-in-bool-context`) - Deprecated above specialized ops for bool. - Modified `cxx11_tensor_block_eval` to specialize generator for booleans (`-Wint-in-bool-context`) and to use `abs` instead of `square` to avoid deprecated bool ops.	2020-11-24 20:20:36 +00:00
Antonio Sanchez	a8fdcae55d	Fix sparse_extra_3, disable counting temporaries for testing DynamicSparseMatrix. Multiplication of column-major `DynamicSparseMatrix`es involves three temporaries: - two for transposing twice to sort the coefficients (`ConservativeSparseSparseProduct.h`, L160-161) - one for a final copy assignment (`SparseAssign.h`, L108) The latter is avoided in an optimization for `SparseMatrix`. Since `DynamicSparseMatrix` is deprecated in favor of `SparseMatrix`, it's not worth the effort to optimize further, so I simply disabled counting temporaries via a macro. Note that due to the inclusion of `sparse_product.cpp`, the `sparse_extra` tests actually re-run all the original `sparse_product` tests as well. We may want to simply drop the `DynamicSparseMatrix` tests altogether, which would eliminate the test duplication. Related to #2048	2020-11-18 23:15:33 +00:00
Antonio Sanchez	17268b155d	Add bit_cast for half/bfloat to/from uint16_t, fix TensorRandom The existing `TensorRandom.h` implementation makes the assumption that `half` (`bfloat16`) has a `uint16_t` member `x` (`value`), which is not always true. This currently fails on arm64, where `x` has type `__fp16`. Added `bit_cast` specializations to allow casting to/from `uint16_t` for both `half` and `bfloat16`. Also added tests in `half_float`, `bfloat16_float`, and `cxx11_tensor_random` to catch these errors in the future.	2020-11-18 20:32:35 +00:00
Antonio Sanchez	852513e7a6	Disable testing of OpenGL by default. The `OpenGLSupport` module contains mostly deprecated features, and the test is highly GL context-dependent, relies on deprecated GLUT, and requires a display. Until the module is updated to support modern OpenGL and the test to use newer windowing frameworks (e.g. GLFW) it's probably best to disable the test by default. The test can be enabled with `cmake -DEIGEN_TEST_OPENGL=ON`. See #2053 for more details.	2020-11-12 16:15:40 -08:00
Antonio Sanchez	6961468915	Address issues with `openglsupport` test. The existing test fails on several systems due to GL runtime version mismatches, the use of deprecated features, and memory errors due to improper use of GLUT. The test was modified to: - Run within a display function, allowing proper GLUT cleanup. - Generate dynamic shaders with a supported GLSL version string and output variables. - Report shader compilation errors. - Check GL context version before launching version-specific tests. Note that most of the existing `OpenGLSupport` module and tests rely on deprecated features (e.g. fixed-function pipeline). The test was modified to allow it to pass on various systems. We might want to consider removing the module or re-writing it entirely to support modern OpenGL. This is beyond the scope of this patch. Testing of legacy GL (for platforms that support it) can be enabled by defining `EIGEN_LEGACY_OPENGL`. Otherwise, the test will try to create a modern context. Tested on - MacBook Air (2019), macOS Catalina 10.15.7 (OpenGL 2.1, 4.1) - Debian 10.6, NVidia Quadro K1200 (OpenGL 3.1, 3.3)	2020-11-11 15:54:43 -08:00
Deven Desai	9d11e2c03e	CMakefile update for ROCm 4.0 Starting with ROCm 4.0, the `hipconfig --platform` command will return `amd` (prior return value was `hcc`). Updating the CMakeLists.txt files in the test dirs to account for this change.	2020-10-29 18:06:31 +00:00
Rasmus Munk Larsen	274ef12b61	Remove leftover debug print statement in cxx11_tensor_expr.cpp	2020-10-14 22:59:51 +00:00

1 2 3 4 5 ...

1175 Commits