eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Christoph Hertzberg	2660d01fa7	Inherit from `no_assignment_operator` to avoid implicit copy constructor warnings (cherry picked from commit 9bbb7ea4b54b1f307863be4ed8d105c38cdefe50)	2021-02-27 18:44:26 +01:00
Christoph Hertzberg	a3521d743c	Fix some enum-enum conversion warnings (cherry picked from commit 838f3d8ce22a5549ef10c7386fb03040721749a0)	2021-02-27 18:44:26 +01:00
Christoph Hertzberg	ca528593f4	Fixed/masked more implicit copy constructor warnings (cherry picked from commit 2883e91ce5a99c391fbf28e20160176b70854992)	2021-02-27 18:44:26 +01:00
Christoph Hertzberg	81b5fe2f0a	ReturnByValue is already non-copyable (cherry picked from commit abbf95045009619f37bd92b45433eedbfcbe41cf)	2021-02-27 18:44:26 +01:00
Christoph Hertzberg	4fb3459a23	Fix double-promotion warnings (cherry picked from commit c22c103e932e511e96645186831363585a44b7a3)	2021-02-27 18:44:26 +01:00
Jens Wehner	4bfcee47b9	Idrs iterative linear solver	2021-02-27 12:09:33 +00:00
Antonio Sanchez	29ebd84cb7	Fix NEON sqrt for 32-bit, add prsqrt. With !406, we accidentally broke arm 32-bit NEON builds, since `vsqrt_f32` is only available for 64-bit. Here we add back the `rsqrt` implementation for 32-bit, relying on a `prsqrt` implementation with better handling of edge cases. Note that several of the 32-bit NEON packet tests are currently failing - either due to denormal handling (NEON versions flush to zero, but scalar paths don't) or due to accuracy (e.g. sin/cos).	2021-02-26 14:08:40 -08:00
Rasmus Munk Larsen	fe19714f80	Merge branch 'rmlarsen1/eigen-nan_prop'	2021-02-26 09:21:24 -08:00
Rasmus Munk Larsen	e67672024d	Merge branch 'nan_prop' of https://gitlab.com/rmlarsen1/eigen into nan_prop	2021-02-26 09:12:44 -08:00
Rasmus Munk Larsen	5e7d4c33d6	Add TODO.	2021-02-26 09:08:45 -08:00
Rasmus Munk Larsen	fb5b59641a	Defer default for minCoeff/maxCoeff to templated variant.	2021-02-26 09:07:00 -08:00
Antonio Sanchez	e19829c3b0	Fix floor/ceil for NEON fp16. Forgot to test this. Fixes bug introduced in !416.	2021-02-25 20:39:56 -08:00
Antonio Sanchez	5529db7524	Fix SSE/NEON pfloor/pceil for saturated values. The original will saturate if the input does not fit into an integer type. Here we fix this, returning the input if it doesn't have enough precision to have a fractional part. Also added `pceil` for NEON. Fixes #1969.	2021-02-25 14:39:26 -08:00
Rasmus Munk Larsen	51eba8c3e2	Fix indentation.	2021-02-25 18:21:21 +00:00
Rasmus Munk Larsen	5297b7162a	Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions.	2021-02-25 18:21:21 +00:00
Antonio Sanchez	ecb7b19dfa	Disable new/delete test for HIP	2021-02-25 08:04:05 -08:00
Chip-Kerchner	6eebe97bab	Fix clang compile when no MMA flags are set. Simplify MMA compiler detection.	2021-02-24 20:43:23 -06:00
Rasmus Munk Larsen	f284c8592b	Don't crash when attempting to slice an empty tensor.	2021-02-24 18:12:51 -08:00
Rasmus Munk Larsen	4cb0592af7	Fix indentation.	2021-02-24 17:59:36 -08:00
Rasmus Munk Larsen	6b34568c74	Merge branch 'nan_prop' of https://gitlab.com/rmlarsen1/eigen into nan_prop	2021-02-24 17:54:58 -08:00
Rasmus Munk Larsen	0065f9d322	Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions.	2021-02-25 01:54:36 +00:00
Rasmus Munk Larsen	841c8986f8	Make it possible to specify NaN propagation strategy for maxCoeff/minCoeff reductions.	2021-02-24 17:49:20 -08:00
Rasmus Munk Larsen	113e61f364	Remove unused function scalar_cmp_with_cast.	2021-02-24 23:59:35 +00:00
Rasmus Munk Larsen	98ca58b02c	Cast anonymous enums to int when used in expressions.	2021-02-24 23:50:06 +00:00
Chip-Kerchner	c31ead8a15	Having forward template function declarations in a P10 file causes bad code in certain situations.	2021-02-24 23:43:30 +00:00
Guoqiang QI	f44197fabd	Some improvements for kissfft from Martin Reinecke(pocketfft author): 1.Only computing about half of the factors and use complex conjugate symmetry for the rest instead of all to save time. 2.All twiddles are calculated in double because that gives the maximum achievable precision when doing float transforms. 3.Reducing all angles to the range 0<angle<pi/4 which gives even more precision.	2021-02-24 21:36:47 +00:00
Antonio Sanchez	a31effc3bc	Add `invoke_result` and eliminate `result_of` warnings for C++17+. The `std::result_of` meta struct is deprecated in C++17 and removed in C++20. It was still slipping through due to a faulty definition of `EIGEN_HAS_STD_RESULT_OF`. Added a new macro `EIGEN_HAS_STD_INVOKE_RESULT` and `Eigen::internal::invoke_result` implementation with fallback for pre C++17. Replaces the `result_of` definition with one based on `std::invoke_result` for C++17 and higher. For completeness, added nullary op support for c++03. Fixes #1850.	2021-02-24 21:36:14 +00:00
Chip-Kerchner	8523d447a1	Fixes to support old and new versions of the compilers for built-ins. Cast to non-const when using vector_pair with certain built-ins.	2021-02-24 20:49:15 +00:00
Antonio Sanchez	5908aeeaba	Fix CUDA device new and delete, and add test. HIP does not support new/delete on device, so test is skipped.	2021-02-24 11:31:41 -08:00
Antonio Sanchez	119763cf38	Eliminate CMake FindPackageHandleStandardArgs warnings. CMake complains that the package name does not match when the case differs, e.g.: ``` CMake Warning (dev) at /usr/share/cmake-3.18/Modules/FindPackageHandleStandardArgs.cmake:273 (message): The package name passed to `find_package_handle_standard_args` (UMFPACK) does not match the name of the calling package (Umfpack). This can lead to problems in calling code that expects `find_package` result variables (e.g., `_FOUND`) to follow a certain pattern. Call Stack (most recent call first): cmake/FindUmfpack.cmake:50 (find_package_handle_standard_args) bench/spbench/CMakeLists.txt:24 (find_package) This warning is for project developers. Use -Wno-dev to suppress it. ``` Here we rename the libraries to match their true cases.	2021-02-24 09:52:05 +00:00
Antonio Sanchez	6cf0ab5e99	Disable fast psqrt for NEON. Accuracy is too poor - requires at least two Newton iterations, but then it is no longer significantly faster than `vsqrt`. Fixes #2094.	2021-02-23 19:52:55 -08:00
Antonio Sanchez	aba3998278	Fix check if GPU compile phase for std::hash	2021-02-23 19:52:08 -08:00
Antonio Sanchez	db5691ff2b	Fix some CUDA warnings. Added `EIGEN_HAS_STD_HASH` macro, checking for C++11 support and not running on GPU. `std::hash<float>` is not a device function, so cannot be used by `std::hash<bfloat16>`. Removed `EIGEN_DEVICE_FUNC` and only define if `EIGEN_HAS_STD_HASH`. Same for `half`. Added `EIGEN_CUDA_HAS_FP16_ARITHMETIC` to improve readability, eliminate warnings about `EIGEN_CUDA_ARCH` not being defined. Replaced a couple C-style casts with `reinterpret_cast` for aligned loading of `half` to `half2`. This eliminates `-Wcast-align` warnings in clang. Although not ideal due to potential type aliasing, this is how CUDA handles these conversions internally.	2021-02-24 00:16:31 +00:00
Rasmus Munk Larsen	88d4c6d4c8	Accurate pow, part 2. This change adds specializations of log2 and exp2 for double that make pow<double> accurate the 1 ULP. Speed for AVX-512 is within 0.5% of the currect implementation.	2021-02-23 23:11:03 +00:00
Adam Shapiro	2ac0b78739	Fixed sparse conservativeResize() when both num cols and rows decreased. The previous implementation caused a buffer overflow trying to calculate non- zero counts for columns that no longer exist.	2021-02-23 21:32:39 +00:00
Chip-Kerchner	10c77b0ff4	Fix compilation errors with later versions of GCC and use of MMA.	2021-02-22 15:01:47 -06:00
Christoph Hertzberg	73922b0174	Fixes Bug #1925 . Packets should be passed by const reference, even to inline functions.	2021-02-20 18:56:42 +01:00
Antonio Sanchez	5f9cfb2529	Add missing adolc isinf/isnan. Also modified cmake/FindAdolc.cmake to eliminate warnings, and added search paths to match install layout. Fixed: #2157	2021-02-19 22:26:56 +00:00
Christoph Hertzberg	ce4af0b38f	Missing change regarding #1910	2021-02-19 20:51:35 +01:00
Christoph Hertzberg	a7749c09bc	Bug #1910 : Make SparseCholesky work for RowMajor matrices	2021-02-19 19:36:18 +01:00
Antonio Sánchez	128eebf05e	Revert "add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if not HIPCC)." This reverts commit `12fd3dd655`	2021-02-19 17:09:16 +00:00
frgossen	33e0af0130	Return nan at poles of polygamma, digamma, and zeta if limit is not defined	2021-02-19 16:35:11 +00:00
Rasmus Munk Larsen	7f09d3487d	Use the Cephes double subtraction trick in pexp<float> even when FMA is available. Otherwise the accuracy drops from 1 ulp to 3 ulp.	2021-02-18 20:49:18 +00:00
Masaki Murooka	12fd3dd655	add EIGEN_DEVICE_FUNC to EIGEN_MAKE_ALIGNED_OPERATOR_NEW_IF macros (only if not HIPCC).	2021-02-17 22:55:47 +00:00
David Tellenbach	aa8b22e776	Bump to 3.4.99	2021-02-17 23:23:17 +01:00
David Tellenbach	5336ad8591	Define internal::make_unsigned for [unsigned]long long on macOS. macOS defines int64_t as long long even for C++03 and therefore expects a template specialization internal::make_unsigned<long long>, for C++03. Since other platforms define int64_t as long for C++03 we cannot add the specialization for all cases.	2021-02-17 23:03:10 +01:00
Antonio Sanchez	0845df7f77	Fix uninitialized warning on AVX.	2021-02-17 13:13:39 -08:00
Chip Kerchner	9b51dc7972	Fixed performance issues for VSX and P10 MMA in general_matrix_matrix_product	2021-02-17 17:49:23 +00:00
Rasmus Munk Larsen	be0574e215	New accurate algorithm for pow(x,y). This version is accurate to 1.4 ulps for float, while still being 10x faster than std::pow for AVX512. A future change will introduce a specialization for double.	2021-02-17 02:50:32 +00:00
Antonio Sanchez	7ff0b7a980	Updated pfrexp implementation. The original implementation fails for 0, denormals, inf, and NaN. See #2150	2021-02-17 02:23:24 +00:00

1 2 3 4 5 ...

11317 Commits