eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
sciencewhiz	4b6036e276	fix various typos	2021-09-22 16:15:06 +00:00
Alexander Grund	b5eaa42695	Fix alias violation in BFloat16 reinterpret_cast between unrelated types is undefined behavior and leads to misoptimizations on some platforms. Use the safer (and faster) version via bit_cast	2021-09-20 10:37:50 +02:00
Antonio Sanchez	3c724c44cf	Fix strict aliasing bug causing product_small failure. Packet loading is skipped due to aliasing violation, leading to nullopt matrix multiplication. Fixes #2327.	2021-09-17 21:09:34 +00:00
Antonio Sanchez	5dac0b53c9	Move Eigen::all,last,lastp1,lastN to Eigen::placeholders::. These names are so common, IMO they should not exist directly in the `Eigen::` namespace. This prevents us from using the `last` or `all` names for any parameters or local variables, otherwise spitting out warnings about shadowing or hiding the global values. Many external projects (and our own examples) also heavily use ``` using namespace Eigen; ``` which means these conflict with external libraries as well, e.g. `std::fill(first,last,value)`. It seems originally these were placed in a separate namespace `Eigen::placeholders`, which has since been deprecated. I propose to un-deprecate this, and restore the original locations. These symbols are also imported into `Eigen::indexing`, which additionally imports `fix` and `seq`. An alternative is to remove the `placeholders` namespace and stick with `indexing`. NOTE: this is an API-breaking change. Fixes #2321.	2021-09-17 10:21:42 -07:00
Rasmus Munk Larsen	6cadab6896	Clean up EIGEN_STATIC_ASSERT to only use standard c++11 static_assert.	2021-09-16 20:43:54 +00:00
Rasmus Munk Larsen	7b975acb1f	Remove unused variable.	2021-09-16 20:27:13 +00:00
Rasmus Munk Larsen	92849d814b	Remove unused variable.	2021-09-16 20:21:31 +00:00
Rasmus Munk Larsen	da027fa20a	Remove unused variable.	2021-09-16 20:02:42 +00:00
Antonio Sanchez	cb50730993	Default eigen_packet_wrapper constructor. This makes it trivial, allowing use of `memcpy`. Fixes #2326	2021-09-14 10:57:22 -07:00
Rasmus Munk Larsen	d7d0bf832d	Issue an error in case of direct inclusion of internal headers.	2021-09-10 19:12:26 +00:00
Antonio Sanchez	26e5beb8cb	Device-compatible Tuple implementation. An analogue of `std::tuple` that works on device. Context: I've tried `std::tuple` in various versions of NVCC and clang, and although code seems to compile, it often fails to run - generating "illegal memory access" errors, or "illegal instruction" errors. This replacement does work on device.	2021-09-08 13:34:19 -07:00
Antonio Sanchez	fcd73b4884	Add a simple serialization mechanism. The `Serializer<T>` class implements a binary serialization that can write to (`serialize`) and read from (`deserialize`) a byte buffer. Also added convenience routines for serializing a list of arguments. This will mainly be for testing, specifically to transfer data to and from the GPU.	2021-09-08 09:38:59 -07:00
Antonio Sanchez	7792b1e909	Fix AVX2 PacketMath.h. There were a couple typos ps -> epi32, and an unaligned load issue.	2021-09-03 19:47:57 +00:00
Antonio Sanchez	5bf35383e0	Disable MSVC constant condition warning. We make extensive use of `if (CONSTANT)`, and cannot use c++17's `if constexpr`.	2021-09-03 11:07:18 -07:00
Antonio Sanchez	def145547f	Add missing packet types in pset1 call. Oops, introduced this when "fixing" integer packets.	2021-09-02 16:21:07 -07:00
Antonio Sanchez	3b48a3b964	Remove stray DynamicSparseMatrix references. DynamicSparseMatrix has been removed. These shouldn't be here anymore.	2021-09-02 19:47:26 +00:00
Antonio Sanchez	ebd4b17d2f	Fix tridiagonalization_inplace_selector. The `Options` of the new `hCoeffs` vector do not necessarily match those of the `MatrixType`, leading to build errors. Having the `CoeffVectorType` be a template parameter relieves this restriction.	2021-09-02 12:23:27 -07:00
Antonio Sanchez	998bab4b04	Missing EIGEN_DEVICE_FUNCs to get `gpu_basic` passing with CUDA 9. CUDA 9 seems to require labelling defaulted constructors as `EIGEN_DEVICE_FUNC`, despite giving warnings that such labels are ignored. Without these labels, the `gpu_basic` test fails to compile, with errors about calling `__host__` functions from `__host__ __device__` functions.	2021-09-01 19:49:53 -07:00
Antonio Sanchez	3d4ba855e0	Fix AVX integer packet issues. Most are instances of AVX2 functions not protected by `EIGEN_VECTORIZE_AVX2`. There was also a missing semi-colon for AVX512.	2021-09-01 14:14:43 -07:00
Antonio Sanchez	3a6296d4f1	Fix EIGEN_OPTIMIZATION_BARRIER for arm-clang. Clang doesn't like !621, needs the "g" constraint back. The "g" constraint also works for GCC >= 5. This fixes our gitlab CI.	2021-09-01 09:19:55 -07:00
Antonio Sanchez	ff07a8a639	GCC 4.8 arm EIGEN_OPTIMIZATION_BARRIER fix (#2315). GCC 4.8 doesn't seem to like the `g` register constraint, failing to compile with "error: 'asm' operand requires impossible reload". Tested `r` instead, and that seems to work, even with latest compilers. Also fixed some minor macro issues to eliminate warnings on armv7. Fixes #2315.	2021-08-31 20:20:47 +00:00
Antonio Sanchez	cc3573ab44	Disable cuda Eigen::half vectorization on host. All cuda `__half` functions are device-only in CUDA 9, including conversions. Host-side conversions were added in CUDA 10. The existing code doesn't build prior to 10.0. All arithmetic functions are always device-only, so there's no reason to use vectorization on the host at all. Modified the code to disable vectorization for `__half` on host, which required also updating the `TensorReductionGpu` implementation which previously made assumptions about available packets.	2021-08-31 19:13:12 +00:00
Adam Kallai	1415817d8d	win: include intrin header in Windows on ARM. The intrin header is needed for _BitScanReverse and _BitScanReverse64.	2021-08-31 10:57:34 +02:00
Antonio Sanchez	5db9e5c779	Fix fix<N> when variable templates are not supported. There were some typos that checked `EIGEN_HAS_CXX14` that should have checked `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES`, causing a mismatch in some of the `Eigen::fix<N>` assumptions. Also fixed the `symbolic_index` test when `EIGEN_HAS_CXX14_VARIABLE_TEMPLATES` is 0. Fixes #2308	2021-08-30 08:06:55 -07:00
Jakub Lichman	dc5b1f7d75	AVX512 and AVX2 support for Packet16i and Packet8i added	2021-08-25 19:38:23 +00:00
Han-Kuan Chen	ab28419298	optimize predux if architecture is aarch64	2021-08-25 19:18:54 +00:00
Rasmus Munk Larsen	82dd3710da	Update version of master branch to 3.4.90.	2021-08-18 13:46:05 -07:00
Antonio Sanchez	2b410ecbef	Workaround VS 2017 arg bug. In VS 2017, `std::arg` for real inputs always returns 0, even for negative inputs. It should return `PI` for negative real values. This seems to be fixed in VS 2019 (MSVC 1920).	2021-08-18 18:39:18 +00:00
Jakob Struye	53a29c7e35	Clearer doc for squaredNorm	2021-08-18 15:11:15 +00:00
Antonio Sanchez	fc9d352432	Renamed shift_left/shift_right to shiftLeft/shiftRight. For naming consistency. Also moved to ArrayCwiseUnaryOps, and added test.	2021-08-17 20:04:48 -07:00
Antonio Sanchez	2cc6ee0d2e	Add missing PPC packet comparisons. This is to fix the packetmath tests on the ppc pipeline.	2021-08-17 07:42:04 -07:00
Chip-Kerchner	8dcf3e38ba	Fix unaligned loads in ploadLhs & ploadRhs for P8.	2021-08-16 20:28:22 -05:00
andiwand	5c6b3efead	minor doc fix in Map.h	2021-08-16 12:02:33 +00:00
Chip-Kerchner	e07227c411	Reverse compare logic in F32ToBf16 since vec_cmpne is not available in Power8 - now compiles for clang10 default (P8).	2021-08-13 11:21:28 -05:00
Chip Kerchner	66499f0f17	Get rid of used uninitialized warnings for EIGEN_UNUSED_VARIABLE in gcc11+	2021-08-12 21:38:54 +00:00
Rasmus Munk Larsen	8ce341caf2	Revise the meta_least_common_multiple function template: add a bool variable to check whether A is larger than B. This reduces compile time when A is smaller than B and avoids a compilation failure when A is small and B is large. Authored by @awoniu.	2021-08-11 18:10:01 +00:00
ChipKerchner	413bc491f1	Fix errors on older compilers (gcc 7.5 - lack of vec_neg, clang10 - can not use const pointers with vec_xl).	2021-08-10 15:03:18 -05:00
Gauri Deshpande	e6a5a594a7	remove denormal flushing in fp32tobf16 for avx & avx512	2021-08-09 22:15:21 +00:00
Rasmus Munk Larsen	a5a7faeb45	Avoid memory allocation in tridiagonalization_inplace_selector::run.	2021-08-06 20:48:10 +00:00
Alexander Karatarakis	4ba872bd75	Avoid leading underscore followed by cap in template identifiers	2021-08-04 22:41:52 +00:00
Antonio Sanchez	5ad8b9bfe2	Make inverse 3x3 faster and avoid gcc bug. There seems to be a gcc 4.7 bug that incorrectly flags the current 3x3 inverse as using uninitialized memory. I'm pretty sure it's a false positive, but it's hard to trigger. The same warning does not trigger with clang or later compiler versions. In trying to find a work-around, this implementation turns out to be faster anyways for static-sized matrices. ``` name old cpu/op new cpu/op delta BM_Inverse3x3<DynamicMatrix3T<float>> 423ns ± 2% 433ns ± 3% +2.32% (p=0.000 n=98+96) BM_Inverse3x3<DynamicMatrix3T<double>> 425ns ± 2% 427ns ± 3% +0.48% (p=0.003 n=99+96) BM_Inverse3x3<StaticMatrix3T<float>> 7.10ns ± 2% 0.80ns ± 1% -88.67% (p=0.000 n=114+112) BM_Inverse3x3<StaticMatrix3T<double>> 7.45ns ± 2% 1.34ns ± 1% -82.01% (p=0.000 n=105+111) BM_AliasedInverse3x3<DynamicMatrix3T<float>> 409ns ± 3% 419ns ± 3% +2.40% (p=0.000 n=100+98) BM_AliasedInverse3x3<DynamicMatrix3T<double>> 414ns ± 3% 413ns ± 2% ~ (p=0.322 n=98+98) BM_AliasedInverse3x3<StaticMatrix3T<float>> 7.57ns ± 1% 0.80ns ± 1% -89.37% (p=0.000 n=111+114) BM_AliasedInverse3x3<StaticMatrix3T<double>> 9.09ns ± 1% 2.58ns ±41% -71.60% (p=0.000 n=113+116) ```	2021-08-04 21:18:44 +00:00
Antonio Sanchez	3d98a6ef5c	Modify scalar pzero, ptrue, pselect, and p<binary> operations to avoid memset. The `memset` function and bitwise manipulation only apply to POD types that do not require initialization, otherwise resulting in UB. We currently violate this in `ptrue` and `pzero`, we assume bitmasks for `pselect`, and bitwise operations are applied byte-by-byte in the generic implementations. This is causing issues for scalar types that do require initialization or that contain non-POD info such as pointers (#2201). We either break them, or force specializations of these functions for custom scalars, even if they are not vectorized. Here we modify these functions for scalars only - instead using only scalar operations: - `pzero`: `Scalar(0)` for all scalars. - `ptrue`: `Scalar(1)` for non-trivial scalars, bitset to one bits for trivial scalars. - `pselect`: ternary select comparing mask to `Scalar(0)` for all scalars - `pand`, `por`, `pxor`, `pnot`: use operators `&`, `\|`, `^`, `~` for all integer or non-trivial scalars, otherwise apply bytewise. For non-scalar types, the original implementations are used to maintain compatibility and minimize the number of changes. Fixes #2201.	2021-08-03 08:44:28 -07:00
Antonio Sanchez	7880f10526	Enable equality comparisons on GPU. Since `std::equal_to::operator()` is not a device function, it fails on GPU. On my device, I seem to get a silent crash in the kernel (no reported error, but the kernel does not complete). Replacing this with a portable version enables comparisons on device. Addresses #2292 - would need to be cherry-picked. The 3.3 branch also requires adding `EIGEN_DEVICE_FUNC` in `BooleanRedux.h` to get fully working.	2021-08-03 01:53:31 +00:00
hyunggi-sv	02a0e79c70	fix:typo in dox (has->have)	2021-08-03 00:45:00 +00:00
Antonio Sanchez	9816fe59b4	Fix assignment operator issue for latest MSVC+NVCC. Details are scattered across #920, #1000, #1324, #2291. Summary: some MSVC versions have a bug that requires omitting explicit `operator=` definitions (leads to duplicate definition errors), and some MSVC versions require adding explicit `operator=` definitions (otherwise implicitly deleted errors). This mess tries to cover all the cases encountered. Fixes #2291.	2021-08-03 00:26:10 +00:00
Antonio Sanchez	de2e62c62d	Disable vectorization of comparisons except for bool. Packet input/output types must currently be the same, and since these have a return type of `bool`, vectorization will only work if input is bool.	2021-07-25 13:39:50 -07:00
derekjchow	66ca41bd47	Add support for vectorizing logical comparisons.	2021-07-23 20:07:48 +00:00
arthurfeeney	a77638387d	Fixes #1387 for compilation error in JacobiSVD with HouseholderQRPreconditioner that occurs when input is a compile-time row vector.	2021-07-20 20:11:22 +00:00
Antonio Sanchez	297f0f563d	Fix explicit default cache size typo.	2021-07-20 11:40:17 -07:00
Rohit Santhanam	beea14a18f	Enable extract et. al. for HIP GPU.	2021-07-09 14:58:07 +00:00
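
Commit b5eaa42695 above replaces a `reinterpret_cast` between unrelated types with a `bit_cast`. The following is a minimal sketch of that general technique, not Eigen's actual code: `bit_cast_sketch`, `float_to_bf16_bits_truncate`, and `bf16_bits_to_float` are hypothetical names introduced here for illustration, and the conversion simply truncates (no rounding), which is an assumption made to keep the example short.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not Eigen's internal API): copies the object
// representation of `src` into a value of type To. Unlike
// reinterpret_cast between unrelated types, std::memcpy does not
// violate the strict aliasing rule.
template <typename To, typename From>
To bit_cast_sketch(const From& src) {
  static_assert(sizeof(To) == sizeof(From), "types must have the same size");
  To dst;
  std::memcpy(&dst, &src, sizeof(To));
  return dst;
}

// Assumption for illustration: keep only the upper 16 bits of the
// float's bit pattern (truncation, no round-to-nearest).
std::uint16_t float_to_bf16_bits_truncate(float f) {
  const std::uint32_t bits = bit_cast_sketch<std::uint32_t>(f);
  return static_cast<std::uint16_t>(bits >> 16);
}

// Widen 16 stored bits back to a float by placing them in the upper
// half of a 32-bit pattern and zero-filling the lower half.
float bf16_bits_to_float(std::uint16_t h) {
  const std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
  return bit_cast_sketch<float>(bits);
}
```

For a value such as `1.5f`, which needs only a few mantissa bits, converting with `float_to_bf16_bits_truncate` and back with `bf16_bits_to_float` returns the original value. Compilers typically lower the fixed-size `memcpy` to a plain register move, which is consistent with the commit's remark that the `bit_cast` route is not slower.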

6614 Commits