eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-27 07:29:52 +08:00

Author	SHA1	Message	Date
Rasmus Munk Larsen	4e4d3f32d1	Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512.	2020-10-09 20:05:49 +00:00
David Tellenbach	7a8d3d5b81	Disable test exceptions when using OpenMP.	2020-10-09 17:49:07 +02:00
Rasmus Munk Larsen	b431024404	Don't make assumptions about NaN-propagation for pmin/pmax - it varies across platforms. Change test to only test for NaN-propagation for pfmin/pfmax.	2020-10-07 19:05:18 +00:00
Rasmus Munk Larsen	3b445d9bf2	Add generic packet ops corresponding to {std}::fmin and {std}::fmax. The nonsensical NaN-propagation rules for std::min and std::max implemented by pmin and pmax in Eigen are a longstanding source of confusion and bug reports. This change is a first step towards addressing it, as discussed in issue #564.	2020-10-01 16:54:31 +00:00
Antonio Sanchez	d5a0d89491	Fix alignedbox 32-bit precision test failure. The current `test/geo_alignedbox` tests fail on 32-bit arm due to small floating-point errors. In particular, the following is not guaranteed to hold: ``` IsometryTransform identity = IsometryTransform::Identity(); BoxType transformedC; transformedC.extend(c.transformed(identity)); VERIFY(transformedC.contains(c)); ``` since `c.transformed(identity)` is ever-so-slightly different from `c`. Instead, we replace this test with one that checks an identity transform is within floating-point precision of `c`. Also updated the condition on `AlignedBox::transform(...)` to only accept `Affine`, `AffineCompact`, and `Isometry` modes explicitly. Otherwise, invalid combinations of modes would also incorrectly pass the assertion.	2020-09-30 08:42:03 -07:00
Martin Pecka	6425e875a1	Added AlignedBox::transform(AffineTransform).	2020-09-28 18:06:23 +00:00
David Tellenbach	493a7c773c	Remove EIGEN_CONSTEXPR from NumTraits<boost::multiprecision::number<...>>	2020-09-21 12:43:41 +02:00
Rasmus Munk Larsen	e55182ac09	Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr. Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.	2020-09-18 17:38:58 +00:00
Tim Shen	bb56a62582	Make bfloat16(float(-nan)) produce -nan, not nan.	2020-09-15 13:24:23 -07:00
Pedro Caldeira	35d149e34c	Add missing functions for Packet8bf in Altivec architecture. Including new tests for bfloat16 Packets. Fix prsqrt on GenericPacketMath.	2020-09-08 09:22:11 -05:00
Everton Constantino	6fe88a3c9d	MatrixProduct enhancements: - Changes to Altivec/MatrixProduct Adapting code to gcc 10. Generic code style and performance enhancements. Adding PanelMode support. Adding stride/offset support. Enabling float64, std::complex and std::complex. Fixing lack of symm_pack. Enabling mixedtypes. - Adding std::complex tests to blasutil. - Adding an implementation of storePacketBlock when Incr != 1.	2020-09-02 18:21:36 -03:00
Gael Guennebaud	25424d91f6	Fix #1974 : assertion when reserving an empty sparse matrix	2020-08-26 12:32:20 +02:00
Deven Desai	603e213d13	Fixing a CUDA / P100 regression introduced by PR 181. PR 181 ( https://gitlab.com/libeigen/eigen/-/merge_requests/181 ) adds the `__launch_bounds__(1024)` attribute to GPU kernels that did not have that attribute explicitly specified. That PR seems to cause regressions on the CUDA platform. This PR/commit makes the changes in PR 181 applicable to HIP only.	2020-08-20 00:29:57 +00:00
David Tellenbach	fe8c3ef3cb	Add possibility to split test suite build targets and improve CI configuration - Introduce CMake option `EIGEN_SPLIT_TESTSUITE` that allows dividing the single test build target into several subtargets - Add CI pipeline for merge requests that can be run by GitLab's shared runners - Add nightly CI pipeline	2020-08-19 18:27:45 +00:00
David Tellenbach	d2bb6cf396	Fix compilation error in blasutil test	2020-08-14 18:15:18 +02:00
David Tellenbach	c6820a6316	Replace the call to int64_t in the blasutil test by explicit types. Some platforms define int64_t to be long long even for C++03. If this is the case we miss the definition of internal::make_unsigned for this type. If we just define the template we get duplicate definition errors for platforms defining int64_t as signed long for C++03. We need to find a way to distinguish both cases at compile-time.	2020-08-14 17:24:37 +02:00
Pedro Caldeira	704798d1df	Add support for Bfloat16 to use vector instructions on Altivec architecture	2020-08-10 13:22:01 -05:00
Deven Desai	46f8a18567	Adding an explicit launch_bounds(1024) attribute for GPU kernels. Starting with ROCm 3.5, the HIP compiler will change from HCC to hip-clang. This compiler change introduces a change in the default value of the `__launch_bounds__` attribute associated with a GPU kernel. (default value means the value assumed by the compiler as the `__launch_bounds__` attribute value, when it is not explicitly specified by the user) Currently (i.e. for HIP with ROCm 3.3 and older), the default value is 1024. That changes to 256 with ROCm 3.5 (i.e. hip-clang compiler). As a consequence of this change, if a GPU kernel with a `__launch_bounds__` attribute of 256 is launched at runtime with a threads_per_block value > 256, it leads to a runtime error. This is leading to a couple of Eigen unit test failures with ROCm 3.5. This commit adds an explicit `__launch_bounds__(1024)` attribute to every GPU kernel that currently does not have it explicitly specified (and hence will end up getting the default value of 256 with the change to hip-clang)	2020-08-05 01:46:34 +00:00
David Tellenbach	c1ffe452fc	Fix bfloat16 casts. If we have explicit conversion operators available (C++11) we define explicit casts from bfloat16 to other types. If not (C++03), we don't define conversion operators but rely on implicit conversion chains from bfloat16 over float to other types.	2020-07-23 20:55:06 +00:00
Rasmus Munk Larsen	1b84f21e32	Revert change that made conversion from bfloat16 to {float, double} implicit. Add roundtrip tests for casting between bfloat16 and complex types.	2020-07-22 18:09:00 -07:00
Niels Dekker	0e1a33a461	Faster conversion from integer types to bfloat16. Specialized `bfloat16_impl::float_to_bfloat16_rtne(float)` for normal floating point numbers, infinity and zero, in order to improve the performance of `bfloat16::bfloat16(const T&)` for integer argument types. A reduction of more than 20% of the runtime duration of conversion from int to bfloat16 was observed, using Visual C++ 2019 on Windows 10.	2020-07-22 19:25:49 +02:00
Niels Dekker	4ab32e2de2	Allow implicit conversion from bfloat16 to float and double. Conversion from `bfloat16` to `float` and `double` is lossless. It seems natural to allow the conversion to be implicit, as the C++ language also supports implicit conversion from a smaller to a larger floating point type. Intel's oneDNN bfloat16 implementation also has an implicit `operator float()`: https://github.com/oneapi-src/oneDNN/blob/v1.5/src/common/bfloat16.hpp	2020-07-11 13:32:28 +02:00
Rasmus Munk Larsen	dcf7655b3d	Guard operator<< test by EIGEN_NO_IO.	2020-07-09 19:54:48 +00:00
Rasmus Munk Larsen	fb77b7288c	Add operator<< to print a quaternion.	2020-07-09 12:49:58 -07:00
David Tellenbach	ee4715ff48	Fix test basic stuff - Guard fundamental types that are not available pre C++11 - Separate subsequent angle brackets >> by spaces - Allow casting of Eigen::half and Eigen::bfloat16 to complex types	2020-07-09 17:24:00 +00:00
Rasmus Munk Larsen	6964ae8d52	Change the sign operator in Eigen to return NaN for NaN arguments, not zero.	2020-07-07 01:54:04 +00:00
David Tellenbach	cb63153183	Make test packetmath C++98 compliant	2020-07-01 20:41:59 +02:00
Kan Chen	8731452b97	Delete duplicate test cases in vectorization_logic.cpp	2020-07-01 00:51:15 +00:00
Antonio Sanchez	9cb8771e9c	Fix tensor casts for large packets and casts to/from std::complex. The original tensor casts were only defined for `SrcCoeffRatio`:`TgtCoeffRatio` 1:1, 1:2, 2:1, 4:1. Here we add the missing 1:N and 8:1. We also add casting `Eigen::half` to/from `std::complex<T>`, which was missing to make it consistent with `Eigen::bfloat16`, and generalize the overload to work for any complex type. Tests were added to `basicstuff`, `packetmath`, and `cxx11_tensor_casts` to test all cast configurations.	2020-06-30 18:53:55 +00:00
Antonio Sanchez	145e51516f	Fix denormal check pre c++11. `float_denorm_style` is an old-style `enum`, so the `denorm_present` symbol only exists in the `std` namespace prior to c++11.	2020-06-30 17:28:30 +00:00
David Tellenbach	689b57070d	Report custom C++ flags in CMake testing summary	2020-06-30 17:18:54 +00:00
Antonio Sanchez	7222f0b6b5	Fix packetmath_1 float tests for arm/aarch64. Added missing `pmadd<Packet2f>` for NEON. This significantly improves precision over the previous `pmul+padd`, which was causing the `pcos` tests to fail. Also added an approx test with `std::sin`/`std::cos` since otherwise returning any `a^2+b^2=1` would pass. Modified `log(denorm)` tests. Denorms are not always supported by all systems (returns `::min`), are always flushed to zero on 32-bit arm, and configurably flush to zero on sse/avx/aarch64. This leads to inconsistent results across different systems (i.e. `-inf` vs `nan`). Added a check for existence and exclude ARM. Removed logistic exactness test, since scalar and vectorized versions follow different code-paths due to differences in `pexp` and `pmadd`, which result in slightly different values. For example, exactness always fails on arm, aarch64, and altivec.	2020-06-24 14:03:35 -07:00
Antonio Sanchez	03ebdf6acb	Added missing NEON pcasts, update packetmath tests. The NEON `pcast` operators are all implemented and tested for existing packets. This requires adding a `pcast(a,b,c,d,e,f,g,h)` for casting between `int64_t` and `int8_t` in `GenericPacketMath.h`. Removed incorrect `HasHalfPacket` definition for NEON's `Packet2l`/`Packet2ul`. Adjustments were also made to the `packetmath` tests. These include - minor bug fixes for cast tests (i.e. 4:1 casts, only casting for packets that are vectorizable) - added 8:1 cast tests - random number generation - original had uninteresting 0 to 0 casts for many casts between floating-point and integers, and exhibited signed overflow undefined behavior Tested: ``` $ aarch64-linux-gnu-g++ -static -I./ '-DEIGEN_TEST_PART_ALL=1' test/packetmath.cpp -o packetmath $ adb push packetmath /data/local/tmp/ $ adb shell "/data/local/tmp/packetmath" ```	2020-06-21 09:32:31 -07:00
Teng Lu	386d809bde	Support BFloat16 in Eigen	2020-06-20 19:16:24 +00:00
Sebastien Boisvert	39cbd6578f	Fix #1911 : add benchmark for move semantics with fixed-size matrix $ clang++ -O3 bench/bench_move_semantics.cpp -I. -std=c++11 \ -o bench_move_semantics $ ./bench_move_semantics float copy semantics: 1755.97 ms float move semantics: 55.063 ms double copy semantics: 2457.65 ms double move semantics: 55.034 ms	2020-06-11 23:43:25 +00:00
Antonio Sanchez	a7d2552af8	Remove HasCast and fix packetmath cast tests. The use of the `packet_traits<>::HasCast` field is currently inconsistent with `type_casting_traits<>`, and is unused apart from within `test/packetmath.cpp`. In addition, those packetmath cast tests do not currently reflect how casts are performed in practice: they ignore the `SrcCoeffRatio` and `TgtCoeffRatio` fields, assuming a 1:1 ratio. Here we remove the unused `HasCast`, and modify the packet cast tests to better reflect their usage.	2020-06-11 17:26:56 +00:00
Sebastien Boisvert	463ec86648	Fix #1757 : remove the word 'suicide'	2020-06-11 00:56:54 +00:00
Rasmus Munk Larsen	c2ab36f47a	Fix broken packetmath test for logistic on Arm.	2020-06-04 16:24:47 -07:00
Gael Guennebaud	029a76e115	Bug #1777 : make the scalar and packet path consistent for the logistic function + respective unit test	2020-05-31 00:53:37 +02:00
Gael Guennebaud	ab615e4114	Save one extra temporary when assigning a sparse product to a row-major sparse matrix	2020-05-30 23:15:12 +02:00
David Tellenbach	5328cd62b3	Guard usage of decltype since it's a C++11 feature. This fixes https://gitlab.com/libeigen/eigen/-/issues/1897	2020-05-20 16:04:16 +02:00
Rasmus Munk Larsen	cc86a31e20	Add guard around specialization for bool, which is only currently implemented for SSE.	2020-05-19 16:21:56 -07:00
Everton Constantino	8a7f360ec3	- Vectorizing MMA packing. - Optimizing MMA kernel. - Adding PacketBlock store to blas_data_mapper.	2020-05-19 19:24:11 +00:00
Rasmus Munk Larsen	9b411757ab	Add missing packet ops for bool, and make it pass the same packet op unit tests as other arithmetic types. This change also contains a few minor cleanups: 1. Remove packet op pnot, which is not needed for anything other than pcmp_le_or_nan, which can be done in other ways. 2. Remove the "HasInsert" enum, which is no longer needed since we removed the corresponding packet ops. 3. Add faster pselect op for Packet4i when SSE4.1 is supported. Among other things, this makes the fast transposeInPlace() method available for Matrix<bool>. Run on ************** (72 X 2994 MHz CPUs); 2020-05-09T10:51:02.372347913-07:00 CPU: Intel Skylake Xeon with HyperThreading (36 cores) dL1:32KB dL2:1024KB dL3:24MB Benchmark Time(ns) CPU(ns) Iterations ----------------------------------------------------------------------- BM_TransposeInPlace<float>/4 9.77 9.77 71670320 BM_TransposeInPlace<float>/8 21.9 21.9 31929525 BM_TransposeInPlace<float>/16 66.6 66.6 10000000 BM_TransposeInPlace<float>/32 243 243 2879561 BM_TransposeInPlace<float>/59 844 844 829767 BM_TransposeInPlace<float>/64 933 933 750567 BM_TransposeInPlace<float>/128 3944 3945 177405 BM_TransposeInPlace<float>/256 16853 16853 41457 BM_TransposeInPlace<float>/512 204952 204968 3448 BM_TransposeInPlace<float>/1k 1053889 1053861 664 BM_TransposeInPlace<bool>/4 14.4 14.4 48637301 BM_TransposeInPlace<bool>/8 36.0 36.0 19370222 BM_TransposeInPlace<bool>/16 31.5 31.5 22178902 BM_TransposeInPlace<bool>/32 111 111 6272048 BM_TransposeInPlace<bool>/59 626 626 1000000 BM_TransposeInPlace<bool>/64 428 428 1632689 BM_TransposeInPlace<bool>/128 1677 1677 417377 BM_TransposeInPlace<bool>/256 7126 7126 96264 BM_TransposeInPlace<bool>/512 29021 29024 24165 BM_TransposeInPlace<bool>/1k 116321 116330 6068	2020-05-14 22:39:13 +00:00
Felipe Attanasio	d640276d31	Added support for reverse iterators for Vectorwise operations.	2020-05-14 22:38:20 +00:00
Christopher Moore	fa8fd4b4d5	Indexed view should have RowMajorBit when there is statically a single row	2020-05-14 22:11:19 +00:00
Christopher Moore	a187ffea28	Resolve "IndexedView of a vector should allow linear access"	2020-05-13 19:24:42 +00:00
Rasmus Munk Larsen	c1d944dd91	Remove packet ops pinsertfirst and pinsertlast that are only used in a single place, and can be replaced by other ops when constructing the first/final packet in linspaced_op_impl::packetOp. I cannot measure any performance changes for SSE, AVX, or AVX512. name old time/op new time/op delta BM_LinSpace<float>/1 1.63ns ± 0% 1.63ns ± 0% ~ (p=0.762 n=5+5) BM_LinSpace<float>/8 4.92ns ± 3% 4.89ns ± 3% ~ (p=0.421 n=5+5) BM_LinSpace<float>/64 34.6ns ± 0% 34.6ns ± 0% ~ (p=0.841 n=5+5) BM_LinSpace<float>/512 217ns ± 0% 217ns ± 0% ~ (p=0.421 n=5+5) BM_LinSpace<float>/4k 1.68µs ± 0% 1.68µs ± 0% ~ (p=1.000 n=5+5) BM_LinSpace<float>/32k 13.3µs ± 0% 13.3µs ± 0% ~ (p=0.905 n=5+4) BM_LinSpace<float>/256k 107µs ± 0% 107µs ± 0% ~ (p=0.841 n=5+5) BM_LinSpace<float>/1M 427µs ± 0% 427µs ± 0% ~ (p=0.690 n=5+5)	2020-05-08 15:41:50 -07:00
Rasmus Munk Larsen	225ab040e0	Remove unused packet op "palign". Clean up a compiler warning in c++03 mode in AVX512/Complex.h.	2020-05-07 17:14:26 -07:00
Rasmus Munk Larsen	74ec8e6618	Make size odd for transposeInPlace test to make sure we hit the scalar path.	2020-05-07 17:29:56 +00:00

2387 Commits