eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
guoqiangqi	28aef8e816	Improve polynomial evaluation with instruction-level parallelism for pexp_float and pexp<Packet16f>	2020-10-20 11:37:09 +08:00
guoqiangqi	4a77eda1fd	remove unnecessary specialize template of pexp for scale float/double	2020-10-19 00:51:42 +00:00
Antonio Sanchez	d9f0d9eb76	Fix missing `pfirst<Packet16b>` for MSVC. It was only defined under one `#ifdef` case. This fixes the `packetmath_14` test for MSVC.	2020-10-16 16:22:00 -07:00
Rasmus Munk Larsen	21edea5edd	Fix the specialization of pfrexp for AVX to be faster when AVX2/AVX512DQ is not available, and avoid undefined behavior in C++. Also mask off the sign bit when extracting the exponent.	2020-10-15 18:39:58 -07:00
Deven Desai	011e0db31d	Fix for ROCm/HIP breakage - 201013 The following commit seems to have introduced regressions in ROCm/HIP support. `183a208212` It causes some unit-tests to fail with the following error ``` ... Eigen/src/Core/GenericPacketMath.h:322:3: error: no member named 'bit_and' in the global namespace; did you mean 'std::bit_and'? ... Eigen/src/Core/GenericPacketMath.h:329:3: error: no member named 'bit_or' in the global namespace; did you mean 'std::bit_or'? ... Eigen/src/Core/GenericPacketMath.h:336:3: error: no member named 'bit_xor' in the global namespace; did you mean 'std::bit_xor'? ... ``` The error occurs because, when compiling the device code in HIP/CUDA, the compiler will pick up some of the std functions (whose calls are prefixed by EIGEN_USING_STD) from the global namespace (i.e. use ::bit_xor instead of std::bit_xor). For this to work, those functions must be declared in the global namespace in the HIP/CUDA header files. The `bit_and`, `bit_or` and `bit_xor` routines are not declared in the HIP header file that contains the decls for the std math functions ( `math_functions.h` ), and this is the cause of the error above. It seems that the newer HIP compilers do support the calling of `std::` math routines within device code, and the ideal fix here would have been to change all calls to std math functions in EIGEN to use the `std::` namespace (instead of the global namespace), when compiling with HIP compiler. However it seems there was a recent commit to remove the EIGEN_USING_STD_MATH macro and collapse its uses into the EIGEN_USING_STD macro ( `4091f6b25c` ). Replacing all std math calls will essentially require resurrecting the EIGEN_USING_STD_MATH macro, so not choosing that option. Also HIP compilers only support std math calls within device code, and not all std functions (specifically not malloc/free, which are prefixed via EIGEN_USING_STD). So modifying the EIGEN_USING_STD implementation to use the std:: namespace for HIP will not work either. Hence going for the ugly solution of special-casing the three calls that break the HIP compile, to explicitly use the std:: namespace	2020-10-15 12:17:35 +00:00
Rasmus Munk Larsen	6ea8091705	Revert change from `4e4d3f32d1` that broke BFloat16.h build with older compilers.	2020-10-15 01:20:08 +00:00
Guoqiang QI	4700713faf	Add AVX plog<Packet4d> and AVX512 plog<Packet8d> ops, also unify AVX512 plog<Packet16f> op with generic API	2020-10-15 00:54:45 +00:00
Rasmus Munk Larsen	af6f43d7ff	Add specializations for pmin/pmax with prescribed NaN propagation semantics for SSE/AVX/AVX512.	2020-10-14 23:11:24 +00:00
Rasmus Munk Larsen	274ef12b61	Remove leftover debug print statement in cxx11_tensor_expr.cpp	2020-10-14 22:59:51 +00:00
Rasmus Munk Larsen	208b3626d1	Revert generic implementation of `predux`, since it breaks compilation of `predux_any` with MSVC.	2020-10-14 21:41:28 +00:00
David Tellenbach	e3e2cf9d24	Add MatrixBase::cwiseArg()	2020-10-14 01:56:42 +00:00
Rasmus Munk Larsen	61fc78bbda	Get rid of nested template specialization in TensorReductionGpu.h, which was broken by `c6953f799b`.	2020-10-13 23:53:11 +00:00
Rasmus Munk Larsen	c6953f799b	Add packet generic ops `predux_fmin`, `predux_fmin_nan`, `predux_fmax`, and `predux_fmax_nan` that implement reductions with `PropagateNaN`, and `PropagateNumbers` semantics. Add (slow) generic implementations for most reductions.	2020-10-13 21:48:31 +00:00
acxz	807e51528d	undefine EIGEN_CONSTEXPR before redefinition	2020-10-12 20:28:56 -04:00
Rasmus Munk Larsen	9a4d04c05f	Make bitwise_helper a device function to unbreak GPU builds.	2020-10-10 01:45:20 +00:00
Rasmus Munk Larsen	4e4d3f32d1	Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512.	2020-10-09 20:05:49 +00:00
David Tellenbach	7a8d3d5b81	Disable test exceptions when using OpenMP.	2020-10-09 17:49:07 +02:00
David Tellenbach	9022f5aa8a	Mention problems when using potentially throwing scalars and OpenMP	2020-10-09 17:04:25 +02:00
Karl Ljungkvist	d199c17b14	Fix typo in Tutorial_BlockOperations_block_assignment.cpp	2020-10-09 07:51:36 +00:00
David Tellenbach	4091f6b25c	Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD	2020-10-09 02:05:05 +02:00
Rasmus Munk Larsen	183a208212	Implement generic bitwise logical packet ops that work for all types.	2020-10-08 22:45:20 +00:00
David Tellenbach	8f8d77b516	Add EIGEN prefix for HAS_LGAMMA_R	2020-10-08 18:32:19 +02:00
Eugene Zhulenev	2279f2c62f	Use lgamma_r if it is available (update check for glibc 2.19+)	2020-10-08 00:26:45 +00:00
Rasmus Munk Larsen	b431024404	Don't make assumptions about NaN-propagation for pmin/pmax - it varies across platforms. Change test to only test for NaN-propagation for pfmin/pfmax.	2020-10-07 19:05:18 +00:00
David Tellenbach	f66f3393e3	Use reinterpret_cast instead of C-style cast in Inverse_NEON.h	2020-10-04 00:35:09 +02:00
Rasmus Munk Larsen	22c971a225	Don't cast away const in Inverse_NEON.h.	2020-10-02 15:06:34 -07:00
Rasmus Munk Larsen	f93841b53e	Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h.	2020-10-02 14:47:15 -07:00
Rasmus Munk Larsen	ee714f79f7	Fix CUDA build breakage and incorrect result for absdiff on HIP with long double arguments.	2020-10-02 21:05:35 +00:00
janos	f7b185a8b1	Don't use =*, it might not return a Scalar	2020-10-02 14:36:51 +02:00
Rasmus Munk Larsen	9078f47cd6	Fix build breakage with MSVC 2019, which does not support MMX intrinsics for 64 bit builds, see: https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c Instead use the equivalent SSE2 intrinsics.	2020-10-01 12:37:55 -07:00
Rasmus Munk Larsen	3b445d9bf2	Add generic packet ops corresponding to {std}::fmin and {std}::fmax. The nonsensical NaN-propagation rules for std::min and std::max implemented by pmin and pmax in Eigen are a longstanding source of confusion and bug reports. This change is a first step towards addressing it, as discussed in issue #564.	2020-10-01 16:54:31 +00:00
Rasmus Munk Larsen	44b9d4e412	Specialize pldexp_double and pfrexp_double and get rid of Packet2l definition for SSE. SSE does not support conversion between 64 bit integers and double and the existing implementation of casting between Packet2d and Packet2l results in undefined behavior when casting NaN to int. Since pldexp and pfrexp only manipulate exponent fields that fit in 32 bit, this change provides specializations that use existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead.	2020-09-30 13:33:44 -07:00
Antonio Sanchez	d5a0d89491	Fix alignedbox 32-bit precision test failure. The current `test/geo_alignedbox` tests fail on 32-bit arm due to small floating-point errors. In particular, the following is not guaranteed to hold: ``` IsometryTransform identity = IsometryTransform::Identity(); BoxType transformedC; transformedC.extend(c.transformed(identity)); VERIFY(transformedC.contains(c)); ``` since `c.transformed(identity)` is ever-so-slightly different from `c`. Instead, we replace this test with one that checks an identity transform is within floating-point precision of `c`. Also updated the condition on `AlignedBox::transform(...)` to only accept `Affine`, `AffineCompact`, and `Isometry` modes explicitly. Otherwise, invalid combinations of modes would also incorrectly pass the assertion.	2020-09-30 08:42:03 -07:00
David Tellenbach	30960d485e	Fix failure in GEBP kernel when compiling with OpenMP and FMA Fixes #1995	2020-09-30 01:26:07 +02:00
Rasmus Munk Larsen	f9d1500f74	Revert !182 .	2020-09-29 13:56:17 -07:00
Rasmus Munk Larsen	068121ec02	Add missing newline at the end of Inverse_NEON.h	2020-09-29 15:32:52 +00:00
Rasmus Munk Larsen	74ff5719b3	Fix compilation of 64 bit constant arguments to pset1frombits in TypeCasting.h on platforms where uint64_t != unsigned long.	2020-09-28 22:47:11 +00:00
Rasmus Munk Larsen	3a0b23e473	Fix compilation of pset1frombits calls on iOS.	2020-09-28 22:30:36 +00:00
Christoph Hertzberg	6b0c0b587e	Provide a more efficient Packet2l->Packet2d cast method	2020-09-28 22:14:02 +00:00
Martin Pecka	6425e875a1	Added AlignedBox::transform(AffineTransform).	2020-09-28 18:06:23 +00:00
Alexander Grund	a967fadb21	Make relative path variables of type STRING. When the type is PATH an absolute path is expected and user-defined values are converted into absolute paths relative to the current directory. Fixes #1990	2020-09-28 16:39:48 +00:00
Zhuyie	e4b24e7fb2	Fix Eigen::ThreadPool::CurrentThreadId returning wrong thread id when EIGEN_AVOID_THREAD_LOCAL and NDEBUG are defined	2020-09-25 09:36:43 +00:00
Deven Desai	ce5c59729d	Fix for ROCm/HIP breakage - 200921 The following commit causes regressions in the ROCm/HIP support for Eigen `e55182ac09` I suspect the same breakages occur on the CUDA side too. The above commit puts the EIGEN_CONSTEXPR attribute on `half_base` constructor. `half_base` is derived from `__half_raw`. When compiling with GPU support, the definition of `__half_raw` gets picked up from the GPU Compiler specific header files (`hip_fp16.h`, `cuda_fp16.h`). Properly supporting the above commit would require adding the `constexpr` attribute to the `__half_raw` constructor (and other `half` routines) in those header files. While that is something we can explore in the future, for now we need to undo the above commit when compiling with GPU support, which is what this commit does. This commit also reverts a small change in the `raw_uint16_to_half` routine made by the above commit. Similar to the case above, that change was leading to compile errors due to the fact that `__half_raw` has a different definition when compiling with GPU support.	2020-09-22 22:26:45 +00:00
David Tellenbach	b8a13f13ca	Add CI configuration for ppc64le	2020-09-22 00:26:23 +00:00
Guoqiang QI	821702e771	Fix the #issue1997 and #issue1991 bug triggered by unsupported a[index](type a: __i28d) ops with MSVC compiler	2020-09-21 15:49:00 +00:00
David Tellenbach	493a7c773c	Remove EIGEN_CONSTEXPR from NumTraits<boost::multiprecision::number<...>>	2020-09-21 12:43:41 +02:00
Павел Мацула	38e4a67394	Fix using FindStandardMathLibrary.cmake with -Wall (-Wunused-value) added to CMAKE_CXX_FLAG	2020-09-19 16:13:16 +00:00
Rasmus Munk Larsen	c4b99f78c7	Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86. If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.	2020-09-18 18:13:20 -07:00
guoqiangqi	9aad16b443	Fix undefined reference to pset1frombits bug on different platforms	2020-09-19 00:53:21 +00:00
David Tellenbach	c4aa8e0db2	Rename variable to avoid shadowing of a previously declared one	2020-09-18 22:53:15 +02:00

11112 Commits