Rasmus Munk Larsen | f19a6803c8 | Refactor special case handling in pow(x,y) and revert to repeated squaring for <float,int> | 2024-11-27 00:24:21 +00:00
Rasmus Munk Larsen | 5064cb7d5e | Add test for using pcast on scalars. | 2024-11-25 22:27:26 -08:00
Rasmus Munk Larsen | 1ea61a5d26 | Improve pow(x,y): 25% speedup, increase accuracy for integer exponents. | 2024-11-26 06:13:48 +00:00
Charles Schlosser | 8ad4344ca7 | optimize setConstant, setZero | 2024-11-22 03:39:19 +00:00
Rasmus Munk Larsen | 5610a13b77 | Simplify and speed up pow() by 5-6% | 2024-11-20 12:45:00 +00:00
Rasmus Munk Larsen | 6c6ce9d06b | Enable vectorized erf<double>(x) for SSE and AVX, which was accidentally removed in merge request 1750. | 2024-11-19 22:14:29 +00:00
Rasmus Munk Larsen | e7c799b7c9 | Prevent premature overflow to infinity in exp(x). The changes also provide a 3-4% speedup. | 2024-11-19 13:08:18 -08:00
Rasmus Munk Larsen | 00af47102d | Revert 040180078d | 2024-11-19 10:25:16 -08:00
Rasmus Munk Larsen | 8ee6f8475a | Speed up exp(x). | 2024-11-19 17:50:34 +00:00
Charles Schlosser | 93ec5450cb | disable fill_n optimization for msvc | 2024-11-19 01:38:48 +00:00
Rasmus Munk Larsen | 0af6ab4b76 | Remove unnecessary check for HasBlend trait. | 2024-11-18 21:16:45 +00:00
Rasmus Munk Larsen | d5eec781b7 | Get rid of redundant computation for large arguments to erf(x). | 2024-11-18 10:51:58 -08:00
Tyler Veness | 2fc63808e4 | Fix C++20 constexpr test compilation failures | 2024-11-18 01:56:55 +00:00
Rasmus Munk Larsen | 5133c836c0 | Vectorize erf(x) for double. | 2024-11-16 19:05:16 +00:00
Conrad Poelman | d6e3b528b2 | Update Assign_MKL.h to cast disparate enum type to int, so it can be compared... | 2024-11-15 20:00:29 +00:00
breathe1 | 040180078d | Ensure that destructor's needed by lldb make it into binary in non-inlined fashion | 2024-11-15 17:15:09 +00:00
Tyler Veness | 0fb2ed140d | Make element accessors constexpr | 2024-11-14 01:05:29 +00:00
Charles Schlosser | 8b4efc8ed8 | check_size_for_overflow: use numeric limits instead of c99 macro | 2024-11-13 00:35:35 +00:00
Charles Schlosser | 489dbbc651 | make fixed_size matrices conform to std::is_standard_layout | 2024-11-12 23:34:26 +00:00
Rasmus Munk Larsen | 283d871a3f | Add missing EIGEN_DEVICE_FUNCTION decorations. | 2024-11-08 14:25:57 -08:00
Rasmus Munk Larsen | 0d366f6532 | Vectorize erfc(x) for double and improve erfc(x) for float. | 2024-11-08 17:21:11 +00:00
Charles Schlosser | 8adf43640e | more avx predux_any | 2024-11-07 19:58:48 +00:00
Charles Schlosser | bc424f617a | add missing avx predux_any functions | 2024-11-07 19:11:29 +00:00
Charles Schlosser | e52ac76ca3 | use EIGEN_CPLUSPLUS instead of checking cpp version | 2024-11-06 17:25:22 +00:00
Rasmus Munk Larsen | 122be167cd | Revert "make fixed-size objects trivially move assignable" | 2024-11-06 01:09:38 +00:00
Tobias Wood | d49021212b | Tensor Roll / Circular Shift / Rotate | 2024-11-05 14:10:19 +00:00
Charles Schlosser | bb73be8a2e | make fixed-size objects trivially move assignable | 2024-11-04 17:55:27 +00:00
Antonio Sánchez | 7fd305ecae | Fix GPU builds. | 2024-11-01 04:50:03 +00:00
Morris Hafner | c8267654f2 | Don't use __builtin_alloca_with_align with nvc++ | 2024-10-30 18:02:08 +00:00
Tyler Veness | 84c446df2c | Fix macro redefinition warning in FFTW test | 2024-10-30 17:18:42 +00:00
Antonio Sánchez | a9584d8e3c | Fix clang6 failures. | 2024-10-30 14:41:50 +00:00
Antonio Sánchez | dd4c2805d9 | Fix clang6 failures. | 2024-10-29 22:18:30 +00:00
Antonio Sánchez | 9e962d9c54 | Fix OOB access in triangular matrix multiplication. | 2024-10-29 19:07:07 +00:00
Antonio Sánchez | 695e49d1bd | Fix NVCC builds for CUDA 10+. | 2024-10-29 18:38:14 +00:00
Antonio Sánchez | dae09773fc | Don't pass matrices by value. | 2024-10-29 18:19:02 +00:00
Rasmus Munk Larsen | c23ec3420e | Add tests for sizeof() with one dynamic dimension. | 2024-10-28 13:48:53 -07:00
Rasmus Munk Larsen | 58b252e5b3 | Fix typo in PacketMath.h | 2024-10-28 18:19:52 +00:00
Rasmus Munk Larsen | 6c04d0cd68 | Add missing exp2 definition for Altivec. | 2024-10-28 18:12:36 +00:00
Peter Gavin | b15ebb1c2d | add nextafter for bfloat16 | 2024-10-26 00:08:25 +00:00
Rasmus Munk Larsen | 53b83cddf9 | Include <type_traits> in main.h for std::is_trivial* | 2024-10-25 20:55:51 +00:00
Charles Schlosser | 37563856c9 | Fix stack allocation assert | 2024-10-25 17:02:43 +00:00
Rasmus Munk Larsen | 3f067c4850 | Add exp2() as a packet op and array method. | 2024-10-22 22:09:34 +00:00
Charles Schlosser | 4e5136d239 | make fixed size matrices and arrays trivially_default_constructible | 2024-10-21 17:10:15 +00:00
Antonio Sánchez | b396a6fbb2 | Add free-function swap. | 2024-10-14 15:51:40 +00:00
Charles Schlosser | 820e8a45fb | add compile time info to reverse in place | 2024-10-13 17:55:56 +00:00
Charles Schlosser | b55dab7f21 | Fix DenseBase::tail for Dynamic template argument | 2024-10-12 21:03:30 +00:00
Charles Schlosser | e0cbc55d92 | Update README.md | 2024-10-10 01:54:30 +00:00
Rasmus Munk Larsen | 7eea0a9213 | Vectorize erfc() for float | 2024-10-09 18:38:05 +00:00
Rasmus Munk Larsen | 78f3c654ee | Don't use constexpr with half. | 2024-10-08 16:44:40 +00:00
Antonio Sánchez | 6d7af238fa | Adjust array_cwise for 32-bit arm. | 2024-10-07 23:15:24 +00:00