eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-06 14:14:46 +08:00

Author	SHA1	Message	Date
Rasmus Munk Larsen	c4b99f78c7	Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86. If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.	2020-09-18 18:13:20 -07:00
guoqiangqi	9aad16b443	Fix undefined reference to pset1frombits bug on different platforms	2020-09-19 00:53:21 +00:00
David Tellenbach	c4aa8e0db2	Rename variable to avoid shadowing of a previously declared one	2020-09-18 22:53:15 +02:00
Rasmus Munk Larsen	e55182ac09	Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr. Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.	2020-09-18 17:38:58 +00:00
Rasmus Munk Larsen	14022f5eb5	Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h. 'vmvnq_u64' does not exist for some reason.	2020-09-18 04:14:13 +00:00
Rasmus Munk Larsen	a5b226920f	Fix typo in PacketMath.h	2020-09-18 01:22:23 +00:00
Rasmus Munk Larsen	3af744b023	Add missing packet op pcmp_lt_or_nan for Packet2d on ARM.	2020-09-18 01:07:01 +00:00
Rasmus Munk Larsen	31a6b88ff3	Disable double version of compute_inverse_size4 on Inverse_NEON.h if Packet2d is not supported.	2020-09-17 23:51:06 +00:00
Brad King	880fa43b2b	Add support for CastXML on ARM aarch64. CastXML simulates the preprocessors of other compilers, but actually parses the translation unit with an internal Clang compiler. Use the same `vld1q_u64` workaround that we do for Clang. Fixes: #1979	2020-09-16 13:40:23 -04:00
daravi	6f0f6f792e	Fix compiler error due to c++20 operator== generation rules	2020-09-16 02:06:53 +00:00
Benoit Jacob	cc0c38ace8	Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.	2020-09-15 20:54:14 -04:00
Tim Shen	bb56a62582	Make bfloat16(float(-nan)) produce -nan, not nan.	2020-09-15 13:24:23 -07:00
Guoqiang QI	3012e755e9	Add plog support for Packet2d on NEON	2020-09-15 17:10:35 +00:00
Rasmus Munk Larsen	e4fb0ddf78	Add EIGEN_UNUSED_VARIABLE to unused variable in Memory.h	2020-09-15 01:18:55 +00:00
Pedro Caldeira	65e400896b	Fix bfloat16 round on gcc 4.8	2020-09-14 10:43:59 -03:00
Rasmus Munk Larsen	5636f80d11	Fix issue #1968. Don't discard return value from "new" in C++17.	2020-09-13 17:38:45 +00:00
Guoqiang QI	7c5d48f313	Unify the SSE pldexp_double API	2020-09-12 10:56:55 +00:00
Rasmus Munk Larsen	71e08c702b	Make blueNorm threadsafe if C++11 atomics are available.	2020-09-12 01:23:29 +00:00
David Tellenbach	adc861cabd	New CI infrastructure, including AArch64 runners	2020-09-11 18:11:49 +00:00
Niels Dekker	5328c9be43	Fix half_impl::float_to_half_rtne(float) warning: '<<' causes overflow. Fixed Visual Studio 2019 Code Analysis (C++ Core Guidelines) warning C26450 from inside `half_impl::float_to_half_rtne(float)`: "Arithmetic overflow: '<<' operation causes overflow at compile time."	2020-09-10 16:22:28 +02:00
Pedro Caldeira	35d149e34c	Add missing functions for Packet8bf in Altivec architecture. Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.	2020-09-08 09:22:11 -05:00
Guoqiang QI	85428a3440	Add Neon psqrt<Packet2d> and pexp<Packet2d>	2020-09-08 09:04:03 +00:00
Alexander Neumann	5272106826	remove semi triggering -Wextra-semi-stmt	2020-09-07 11:42:30 +02:00
Stephen Zheng	5f25bcf7d6	Add Inverse_NEON.h. Implemented fast size-4 matrix inverse (mimicking Inverse_SSE.h) using NEON intrinsics. Benchmark columns: Time, CPU, Time Old, Time New, CPU Old, CPU New. BM_float: -0.1285, -0.1275, 568, 495, 572, 499. BM_double: -0.2265, -0.2254, 638, 494, 641, 496.	2020-09-04 10:55:47 +00:00
Everton Constantino	6fe88a3c9d	MatrixProduct enhancements: - Changes to Altivec/MatrixProduct: Adapting code to gcc 10. Generic code style and performance enhancements. Adding PanelMode support. Adding stride/offset support. Enabling float64, std::complex<float> and std::complex<double>. Fixing lack of symm_pack. Enabling mixedtypes. - Adding std::complex tests to blasutil. - Adding an implementation of storePacketBlock when Incr != 1.	2020-09-02 18:21:36 -03:00
Everton Constantino	6568856275	Changing u/int8_t to un/signed char because clang does not understand it. Implementing pcmp_eq to Packet8 and Packet16.	2020-09-02 17:02:15 -03:00
Gael Guennebaud	27e6648074	Fix #1901: warning in Mode==(Upper|Lower)	2020-09-02 15:43:58 +02:00
Hans Johnson	5b9bfc892a	BUG: cmake_minimum_required must be the first command https://cmake.org/cmake/help/v3.5/command/project.html Note: Call the cmake_minimum_required() command at the beginning of the top-level CMakeLists.txt file even before calling the project() command. It is important to establish version and policy settings before invoking other commands whose behavior they may affect. See also policy CMP0000.	2020-08-28 22:57:16 +00:00
Chip Kerchner	e5886457c8	Change Packet8s and Packet8us to use vector commands on Power for pmadd, pmul and psub.	2020-08-28 19:27:32 +00:00
Gael Guennebaud	25424d91f6	Fix #1974: assertion when reserving an empty sparse matrix	2020-08-26 12:32:20 +02:00
Guoqiang QI	8bb0febaf9	Add psqrt support for Packet2f/Packet4f on NEON	2020-08-21 03:17:15 +00:00
Georg Jäger	1b1082334b	adding attributes to constructors to support hip-clang on ROCm 3.5	2020-08-20 16:48:11 +02:00
Deven Desai	603e213d13	Fixing a CUDA / P100 regression introduced by PR 181. PR 181 ( https://gitlab.com/libeigen/eigen/-/merge_requests/181 ) adds a `__launch_bounds__(1024)` attribute to GPU kernels that did not have that attribute explicitly specified. That PR seems to cause regressions on the CUDA platform. This PR/commit restricts the changes in PR 181 so that they apply to HIP only.	2020-08-20 00:29:57 +00:00
David Tellenbach	c060114a25	Fix nightly CI configuration	2020-08-19 20:52:34 +02:00
David Tellenbach	fe8c3ef3cb	Add possibility to split test suite build targets and improve CI configuration. - Introduce CMake option `EIGEN_SPLIT_TESTSUITE` that allows dividing the single test build target into several subtargets - Add CI pipeline for merge requests that can be run by GitLab's shared runners - Add nightly CI pipeline	2020-08-19 18:27:45 +00:00
Rasmus Munk Larsen	d10b27fe37	Add missing inline keyword in Quaternion.h.	2020-08-14 17:51:04 +00:00
David Tellenbach	d4a727d092	Disable min/max NaN propagation in test cxx11_tensor_expr. The current pmin/pmax implementations for Arm Neon propagate NaNs differently than std::min/std::max. See issue https://gitlab.com/libeigen/eigen/-/issues/1937	2020-08-14 16:16:27 +00:00
David Tellenbach	d2bb6cf396	Fix compilation error in blasutil test	2020-08-14 18:15:18 +02:00
David Tellenbach	c6820a6316	Replace the call to int64_t in the blasutil test by explicit types. Some platforms define int64_t to be long long even for C++03. If this is the case we miss the definition of internal::make_unsigned for this type. If we just define the template we get duplicate definition errors on platforms that define int64_t as signed long for C++03. We need to find a way to distinguish both cases at compile time.	2020-08-14 17:24:37 +02:00
David Tellenbach	8ba1b0f41a	bfloat16 packetmath for Arm Neon backend	2020-08-13 15:48:40 +00:00
Pedro Caldeira	704798d1df	Add support for Bfloat16 to use vector instructions on Altivec architecture	2020-08-10 13:22:01 -05:00
Deven Desai	46f8a18567	Adding an explicit `__launch_bounds__(1024)` attribute for GPU kernels. Starting with ROCm 3.5, the HIP compiler will change from HCC to hip-clang. This compiler change introduces a change in the default value of the `__launch_bounds__` attribute associated with a GPU kernel. (The default value is the value the compiler assumes for the `__launch_bounds__` attribute when the user does not specify it explicitly.) Currently (i.e. for HIP with ROCm 3.3 and older), the default value is 1024. That changes to 256 with ROCm 3.5 (i.e. the hip-clang compiler). As a consequence of this change, if a GPU kernel with a `__launch_bounds__` attribute of 256 is launched at runtime with a threads_per_block value > 256, it leads to a runtime error. This is leading to a couple of Eigen unit test failures with ROCm 3.5. This commit adds an explicit `__launch_bounds__(1024)` attribute to every GPU kernel that currently does not have it explicitly specified (and hence would end up getting the default value of 256 with the change to hip-clang).	2020-08-05 01:46:34 +00:00
Zachary Garrett	21122498ec	Temporarily turn off the NEON implementation of pfloor as it does not work for large values. The NEON implementation mimics the SSE implementation, but did not mention the caveat that, because of the intermediate integer conversions, not all values representable in the original floating-point type are supported.	2020-08-04 16:28:23 +00:00
David Tellenbach	23b7f0572b	Disable CI buildstage again	2020-08-03 15:41:43 +02:00
Gael Guennebaud	d0f5d4bc50	add a banner to advertise the survey	2020-07-29 19:01:38 +02:00
David Tellenbach	5e484fa11d	Fix StlDeque for GCC 10. StlDeque extends std::deque by accessing some of its internal members. Since GCC 10 these are not accessible anymore.	2020-07-29 12:31:13 +00:00
Teng Lu	3ec4f0b641	Fix undefined BF16 union behavior in AVX512.	2020-07-29 02:20:21 +00:00
Rasmus Munk Larsen	b92206676c	Inherit alignment trait from argument in TensorBroadcasting to avoid segfault when the argument is unaligned.	2020-07-28 19:19:37 +00:00
David Tellenbach	99da2e1a8d	Fix clang-tidy warnings in generic bfloat16 implementation. See !172 for related discussions.	2020-07-27 16:00:24 +02:00
qxxxb	649fd1c2ae	Fix CMake install command	2020-07-25 16:35:13 -04:00


11215 Commits