Charles Schlosser | 5330960900 | Enable packet segment in partial redux (nightly) | 2025-04-14 17:44:53 +00:00
Charles Schlosser | 6266d430cc | packet segment: also check DiagonalWrapper | 2025-04-12 19:34:11 +00:00
Charles Schlosser | e39ad8badc | fix constexpr in CoreEvaluators.h | 2025-04-12 18:54:09 +00:00
Charles Schlosser | 7aefb9f4d9 | fix memset optimization for std::complex types | 2025-04-12 16:20:09 +00:00
Charles Schlosser | 73ca849a68 | fix packetSegment for ArrayWrapper / MatrixWrapper | 2025-04-12 12:12:48 +00:00
Charles Schlosser | 28c3b26d53 | masked load/store framework | 2025-04-12 00:31:10 +00:00
Eugene Zhulenev | cebe09110c | Fix a potential deadlock because of Eigen thread pool | 2025-04-11 23:43:14 +00:00
William Kong | 11fd34cc1c | Fix the typing of the Tasks in ForkJoin.h | 2025-04-09 17:21:36 +00:00
Hunter Belanger | 2cd47d743e | Fixe Conversion Warning in Parallelizer | 2025-04-08 07:39:01 +00:00
Antonio Sánchez | b860042263 | Add postream for ostream-ing packets more reliably. | 2025-04-01 22:12:00 +00:00
Antonio Sánchez | 02d9e1138a | Add missing pmadd for Packet16bf. | 2025-03-31 04:17:17 +00:00
Antonio Sánchez | 9cc9209b9b | Fix cmake warning and default to j0. | 2025-03-29 16:09:40 +00:00
Rasmus Munk Larsen | e0c99a8dd6 | By default, run ctests on all available cores in parallel. | 2025-03-28 04:28:10 +00:00
Rasmus Munk Larsen | 63a40ffb95 | Use fma<float> for fma<half> and fma<bfloat16> if native fma is not available on the platform. | 2025-03-28 04:26:04 +00:00
Antonio Sanchez | 44fb6422be | All triggering full CI if MR label containts all-tests | 2025-03-27 08:37:24 -07:00
Rasmus Munk Larsen | 3866cbfbe8 | Fix test for TensorRef of trace. | 2025-03-25 23:01:46 +00:00
Antonio Sanchez | 6579e36eb4 | Allow Tensor trace to be passed to a TensorRef. | 2025-03-25 08:26:23 -07:00
Antonio Sanchez | 8e32cbf7da | Reduce flakiness of test for Eigen::half. | 2025-03-23 22:31:25 -07:00
Antonio Sánchez | d935916ac6 | Add numext::fma and missing pmadd implementations. | 2025-03-23 01:05:53 +00:00
Charles Schlosser | 754bd24f5e | fix 2828 | 2025-03-22 17:19:44 +00:00
Charles Schlosser | ac2165c11f | fix allFinite | 2025-03-20 16:04:46 +00:00
William Kong | 3143968195 | Generalize the Eigen ForkJoin scheduler to use any ThreadPool interface. | 2025-03-19 19:56:21 +00:00
Antonio Sánchez | 70f2aead9a | Use native _Float16 for AVX512FP16 and update vectorization. | 2025-03-19 19:55:26 +00:00
Markus Vieth | 0259a52b0e | Use more .noalias() | 2025-03-17 19:41:00 +01:00
Antonio Sánchez | 14f845a1a8 | Fix givens rotation. | 2025-03-14 17:15:57 +00:00
Guilhem Saurel | 33b04fe518 | CMake: add install-doc target | 2025-03-14 00:35:00 +00:00
Charles Schlosser | 10e62ccd22 | Fix x86 complex vectorized fma | 2025-03-12 17:06:32 +00:00
Rasmus Munk Larsen | 464c1d0978 | Format TensorDeviceThreadPool.h & use if constexpr for c++20. | 2025-03-08 01:09:36 +00:00
Rasmus Munk Larsen | 21223f6bb6 | Fix addition of different enum types. | 2025-03-07 22:18:00 +00:00
Rasmus Munk Larsen | 350544eb01 | Clean up TensorDeviceThreadPool.h | 2025-03-07 18:14:17 +00:00
Kevin | 43810fc1be | Fix extra semicolon in DeviceWrapper | 2025-03-07 01:07:23 +00:00
Charles Schlosser | d28041ed5a | refactor AssignmentFunctors.h, unify with existing scalar_op | 2025-03-06 01:28:39 +00:00
Gopinath Vasalamarri | 9a86214039 | Optimize division operations in TensorVolumePatch.h | 2025-02-28 22:34:13 +00:00
Antonio Sánchez | be5147b090 | Fix STL feature detection for c++20. | 2025-02-28 19:52:37 +00:00
Antonio Sanchez | 179a49684a | Fix CMake BOOST warning | 2025-02-28 07:33:26 -08:00
Antonio Sanchez | dd56367554 | Fix docs job for nightlies | 2025-02-26 16:01:33 +00:00
Antonio Sánchez | d79bac0d3c | Fix boolean scatter and random generation for tensors. | 2025-02-25 21:37:09 +00:00
Tyler Veness | 9935396b15 | Specify constructor template arguments for ConstexprTest struct | 2025-02-25 19:38:47 +00:00
Rasmus Munk Larsen | 72adf891d5 | Slightly simplify ForkJoin code, and make sure the test is actually run. | 2025-02-25 17:22:43 +00:00
Antonio Sanchez | 6aebfa9acc | Build docs on push, and don't expire | 2025-02-24 08:29:21 -08:00
Markus Vieth | bddaa99e15 | Fix bitwise operation error when compiling as C++26 | 2025-02-23 02:30:55 +00:00
C. Antonio Sanchez | e42dceb3a1 | Fix implicit copy-constructor warning in TensorRef. | 2025-02-22 08:37:56 -08:00
Antonio Sanchez | 5fc6fc9881 | Initialize matrix in bicgstab test | 2025-02-21 10:27:29 -08:00
Tyler Veness | 0ae7b59018 | Make assignment constexpr | 2025-02-21 18:16:46 +00:00
Charles Schlosser | 4dda5b927a | fix Warray-bounds in inner product | 2025-02-20 22:40:55 +00:00
C. Antonio Sanchez | 66f7f51b7e | Disable fno-check-new on clang. | 2025-02-18 21:24:47 -08:00
Charles Schlosser | 151f6127df | Fix Warray-bounds warning for fixed-size assignments | 2025-02-18 19:23:14 +00:00
C. Antonio Sanchez | 1d8b82b074 | Fix power builds for no VSX and no POWER8. | 2025-02-15 13:56:47 -08:00
Charles Schlosser | eb3f9f443d | refactor AssignmentEvaluator | 2025-02-15 00:39:41 +00:00
Antonio Sánchez | 9c211430b5 | Fix TensorRef details | 2025-02-14 18:33:26 +00:00