David Tellenbach
e3e2cf9d24
Add MatrixBase::cwiseArg()
2020-10-14 01:56:42 +00:00
Rasmus Munk Larsen
61fc78bbda
Get rid of nested template specialization in TensorReductionGpu.h, which was broken by c6953f799b.
2020-10-13 23:53:11 +00:00
Rasmus Munk Larsen
c6953f799b
Add packet generic ops predux_fmin, predux_fmin_nan, predux_fmax, and predux_fmax_nan that implement reductions with PropagateNaN and PropagateNumbers semantics. Add (slow) generic implementations for most reductions.
2020-10-13 21:48:31 +00:00
acxz
807e51528d
undefine EIGEN_CONSTEXPR before redefinition
2020-10-12 20:28:56 -04:00
Rasmus Munk Larsen
9a4d04c05f
Make bitwise_helper a device function to unbreak GPU builds.
2020-10-10 01:45:20 +00:00
Rasmus Munk Larsen
4e4d3f32d1
Clean up packetmath tests and fix various bugs to make bfloat16 pass (almost) all packetmath tests with SSE, AVX, and AVX512.
2020-10-09 20:05:49 +00:00
David Tellenbach
7a8d3d5b81
Disable test exceptions when using OpenMP.
2020-10-09 17:49:07 +02:00
David Tellenbach
9022f5aa8a
Mention problems when using potentially throwing scalars and OpenMP
2020-10-09 17:04:25 +02:00
Karl Ljungkvist
d199c17b14
Fix typo in Tutorial_BlockOperations_block_assignment.cpp
2020-10-09 07:51:36 +00:00
David Tellenbach
4091f6b25c
Drop EIGEN_USING_STD_MATH in favour of EIGEN_USING_STD
2020-10-09 02:05:05 +02:00
Rasmus Munk Larsen
183a208212
Implement generic bitwise logical packet ops that work for all types.
2020-10-08 22:45:20 +00:00
David Tellenbach
8f8d77b516
Add EIGEN prefix for HAS_LGAMMA_R
2020-10-08 18:32:19 +02:00
Eugene Zhulenev
2279f2c62f
Use lgamma_r if it is available (update check for glibc 2.19+)
2020-10-08 00:26:45 +00:00
Rasmus Munk Larsen
b431024404
Don't make assumptions about NaN-propagation for pmin/pmax - it varies across platforms.
Change test to only test for NaN-propagation for pfmin/pfmax.
2020-10-07 19:05:18 +00:00
David Tellenbach
f66f3393e3
Use reinterpret_cast instead of C-style cast in Inverse_NEON.h
2020-10-04 00:35:09 +02:00
Rasmus Munk Larsen
22c971a225
Don't cast away const in Inverse_NEON.h.
2020-10-02 15:06:34 -07:00
Rasmus Munk Larsen
f93841b53e
Use EIGEN_USING_STD to fix CUDA compilation error on BFloat16.h.
2020-10-02 14:47:15 -07:00
Rasmus Munk Larsen
ee714f79f7
Fix CUDA build breakage and incorrect result for absdiff on HIP with long double arguments.
2020-10-02 21:05:35 +00:00
janos
f7b185a8b1
Don't use =*, since it might not return a Scalar
2020-10-02 14:36:51 +02:00
Rasmus Munk Larsen
9078f47cd6
Fix build breakage with MSVC 2019, which does not support MMX intrinsics for 64 bit builds, see:
https://stackoverflow.com/questions/60933486/mmx-intrinsics-like-mm-cvtpd-pi32-not-found-with-msvc-2019-for-64bit-targets-c
Instead use the equivalent SSE2 intrinsics.
2020-10-01 12:37:55 -07:00
Rasmus Munk Larsen
3b445d9bf2
Add generic packet ops corresponding to {std}::fmin and {std}::fmax. The nonsensical NaN-propagation rules for std::min and std::max implemented by pmin and pmax in Eigen are a longstanding source of confusion and bug reports. This change is a first step towards addressing it, as discussed in issue #564.
2020-10-01 16:54:31 +00:00
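The difference between the two NaN conventions mentioned in this entry can be illustrated with plain scalars. This is a minimal sketch using only the C++ standard library; `min_like_std` and `min_like_fmin` are hypothetical names chosen for illustration and are not Eigen packet ops. It assumes the usual `(b < a) ? b : a` behaviour of `std::min` found in mainstream standard library implementations.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>

// Hypothetical scalar stand-ins (not Eigen API) for the two behaviours.

// std::min(a, b) behaves like (b < a) ? b : a. Any comparison with NaN is
// false, so the first argument is returned whenever either input is NaN.
// The result therefore depends on argument order.
inline double min_like_std(double a, double b) { return std::min(a, b); }

// std::fmin returns the non-NaN argument when exactly one argument is NaN,
// independent of argument order.
inline double min_like_fmin(double a, double b) { return std::fmin(a, b); }
```

With `nan = std::numeric_limits<double>::quiet_NaN()`, `min_like_std(nan, 1.0)` yields NaN while `min_like_std(1.0, nan)` yields 1.0, whereas `min_like_fmin` yields 1.0 in both orders.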
Rasmus Munk Larsen
44b9d4e412
Specialize pldexp_double and pfdexp_double and get rid of Packet2l definition for SSE. SSE does not support conversion between 64 bit integers and double, and the existing implementation of casting between Packet2d and Packet2l results in undefined behavior when casting NaN to int. Since pldexp and pfdexp only manipulate exponent fields that fit in 32 bit, this change provides specializations that use existing instructions _mm_cvtpd_pi32 and _mm_cvtsi32_pd instead.
2020-09-30 13:33:44 -07:00
Antonio Sanchez
d5a0d89491
Fix alignedbox 32-bit precision test failure.
The current `test/geo_alignedbox` tests fail on 32-bit arm due to small floating-point errors.
In particular, the following is not guaranteed to hold:
```
IsometryTransform identity = IsometryTransform::Identity();
BoxType transformedC;
transformedC.extend(c.transformed(identity));
VERIFY(transformedC.contains(c));
```
since `c.transformed(identity)` is ever-so-slightly different from `c`. Instead, we replace this test with one that checks an identity transform is within floating-point precision of `c`.
Also updated the condition on `AlignedBox::transform(...)` to only accept `Affine`, `AffineCompact`, and `Isometry` modes explicitly. Otherwise, invalid combinations of modes would also incorrectly pass the assertion.
2020-09-30 08:42:03 -07:00
David Tellenbach
30960d485e
Fix failure in GEBP kernel when compiling with OpenMP and FMA
Fixes #1995
2020-09-30 01:26:07 +02:00
Rasmus Munk Larsen
f9d1500f74
Revert !182.
2020-09-29 13:56:17 -07:00
Rasmus Munk Larsen
068121ec02
Add missing newline at the end of Inverse_NEON.h
2020-09-29 15:32:52 +00:00
Rasmus Munk Larsen
74ff5719b3
Fix compilation of 64 bit constant arguments to pset1frombits in TypeCasting.h on platforms where uint64_t != unsigned long.
2020-09-28 22:47:11 +00:00
Rasmus Munk Larsen
3a0b23e473
Fix compilation of pset1frombits calls on iOS.
2020-09-28 22:30:36 +00:00
Christoph Hertzberg
6b0c0b587e
Provide a more efficient Packet2l->Packet2d cast method
2020-09-28 22:14:02 +00:00
Martin Pecka
6425e875a1
Added AlignedBox::transform(AffineTransform).
2020-09-28 18:06:23 +00:00
Alexander Grund
a967fadb21
Make relative path variables of type STRING
When the type is PATH an absolute path is expected and user-defined
values are converted into absolute paths relative to the current directory.
Fixes #1990
2020-09-28 16:39:48 +00:00
Zhuyie
e4b24e7fb2
Fix Eigen::ThreadPool::CurrentThreadId returning wrong thread id when EIGEN_AVOID_THREAD_LOCAL and NDEBUG are defined
2020-09-25 09:36:43 +00:00
Deven Desai
ce5c59729d
Fix for ROCm/HIP breakage - 200921
The following commit causes regressions in the ROCm/HIP support for Eigen
e55182ac09
I suspect the same breakages occur on the CUDA side too.
The above commit puts the EIGEN_CONSTEXPR attribute on `half_base` constructor. `half_base` is derived from `__half_raw`.
When compiling with GPU support, the definition of `__half_raw` gets picked up from the GPU Compiler specific header files (`hip_fp16.h`, `cuda_fp16.h`). Properly supporting the above commit would require adding the `constexpr` attribute to the `__half_raw` constructor (and other `*half*` routines) in those header files. While that is something we can explore in the future, for now we need to undo the above commit when compiling with GPU support, which is what this commit does.
This commit also reverts a small change in the `raw_uint16_to_half` routine made by the above commit. Similar to the case above, that change was leading to compile errors due to the fact that `__half_raw` has a different definition when compiling with GPU support.
2020-09-22 22:26:45 +00:00
David Tellenbach
b8a13f13ca
Add CI configuration for ppc64le
2020-09-22 00:26:23 +00:00
Guoqiang QI
821702e771
Fix issues #1997 and #1991, triggered by unsupported a[index] ops (type of a: __i28d) with the MSVC compiler
2020-09-21 15:49:00 +00:00
David Tellenbach
493a7c773c
Remove EIGEN_CONSTEXPR from NumTraits<boost::multiprecision::number<...>>
2020-09-21 12:43:41 +02:00
Павел Мацула
38e4a67394
Fix using FindStandardMathLibrary.cmake with -Wall (-Wunused-value) added to CMAKE_CXX_FLAGS
2020-09-19 16:13:16 +00:00
Rasmus Munk Larsen
c4b99f78c7
Fix breakage in pcast<Packet2l, Packet2d> due to _mm_cvtsi128_si64 not being available on 32 bit x86.
If SSE 4.1 is available use the faster _mm_extract_epi64 intrinsic.
2020-09-18 18:13:20 -07:00
guoqiangqi
9aad16b443
Fix undefined reference to pset1frombits bug on different platforms
2020-09-19 00:53:21 +00:00
David Tellenbach
c4aa8e0db2
Rename variable to avoid shadowing of a previously declared one
2020-09-18 22:53:15 +02:00
Rasmus Munk Larsen
e55182ac09
Get rid of initialization logic for blueNorm by making the computed constants static const or constexpr.
Move macro definition EIGEN_CONSTEXPR to Core and make all methods in NumTraits constexpr when EIGEN_HAS_CONSTEXPR is 1.
2020-09-18 17:38:58 +00:00
Rasmus Munk Larsen
14022f5eb5
Fix more mildly embarrassing typos in ARM intrinsics in PacketMath.h.
'vmvnq_u64' does not exist for some reason.
2020-09-18 04:14:13 +00:00
Rasmus Munk Larsen
a5b226920f
Fix typo in PacketMath.h
2020-09-18 01:22:23 +00:00
Rasmus Munk Larsen
3af744b023
Add missing packet op pcmp_lt_or_nan for Packet2d on ARM.
2020-09-18 01:07:01 +00:00
Rasmus Munk Larsen
31a6b88ff3
Disable double version of compute_inverse_size4 on Inverse_NEON.h if Packet2d is not supported.
2020-09-17 23:51:06 +00:00
Brad King
880fa43b2b
Add support for CastXML on ARM aarch64
CastXML simulates the preprocessors of other compilers, but actually
parses the translation unit with an internal Clang compiler.
Use the same `vld1q_u64` workaround that we do for Clang.
Fixes: #1979
2020-09-16 13:40:23 -04:00
daravi
6f0f6f792e
Fix compiler error due to c++20 operator== generation rules
2020-09-16 02:06:53 +00:00
Benoit Jacob
cc0c38ace8
Remove old Clang compiler bug work-arounds. The two LLVM bugs referenced in the comments here have long been fixed. The workarounds were now detrimental because (1) they prevented using fused mul-add on Clang/ARM32 and (2) the unnecessary 'volatile' in 'asm volatile' prevented legitimate reordering by the compiler.
2020-09-15 20:54:14 -04:00
Tim Shen
bb56a62582
Make bfloat16(float(-nan)) produce -nan, not nan.
2020-09-15 13:24:23 -07:00
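As an illustration of the sign-preserving NaN conversion this entry describes, here is a minimal sketch of a float to bfloat16 bit conversion. `float_to_bfloat16_bits` is a hypothetical helper written for this note and is not Eigen's implementation; the quiet-NaN bit patterns 0x7FC0 and 0xFFC0 and the round-to-nearest-even step are assumptions of the sketch.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not Eigen's code): convert a float to a 16-bit
// bfloat16 bit pattern, keeping the sign of NaN inputs.
inline std::uint16_t float_to_bfloat16_bits(float f) {
  if (std::isnan(f)) {
    // Emit a quiet NaN whose sign bit matches the input's sign bit.
    return static_cast<std::uint16_t>(std::signbit(f) ? 0xFFC0 : 0x7FC0);
  }
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));  // well-defined way to read the bits
  // Round to nearest even on the 16 low bits that are about to be dropped.
  bits += 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<std::uint16_t>(bits >> 16);
}
```

Under these assumptions, a negative quiet NaN maps to 0xFFC0 and a positive one to 0x7FC0, so the sign survives the narrowing conversion.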
Guoqiang QI
3012e755e9
Add plog support for Packet2d on NEON
2020-09-15 17:10:35 +00:00