mirror of
https://gitlab.com/libeigen/eigen.git
synced 2025-01-24 14:45:14 +08:00
1024a70e82
The patch works by altering the gebp lhs packing routines to also
consider ½ and ¼ packet length rows when packing, besides the original
whole packet and row-by-row attempts. Finally, gebp itself will try
to fit a fraction of a packet at a time if:
i) ½ and/or ¼ packets are available for the current context (e.g. AVX2
and SSE-sized SIMD registers for x86)
ii) The matrix's height is favorable to it (it may be too small
in that dimension to take full advantage of the current/maximum
packet width, or the last rows may be able to take
advantage of smaller packets before gebp goes row-by-row)
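The row-splitting idea described above can be sketched as follows. This is a minimal illustration, not Eigen's actual packing code: `RowBlock`, `partition_rows`, and the `has_half`/`has_quarter` flags are hypothetical names, and the sketch ignores the multi-packet peeling the real routines may perform. It only shows how a row count could be consumed by full, half, and quarter packet widths before falling back to single rows.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical illustration only: none of these names come from Eigen.
struct RowBlock {
  std::size_t start;  // first row covered by this block
  std::size_t width;  // rows consumed per step (full, half, quarter packet, or 1)
  std::size_t count;  // number of steps of that width
};

// Split `rows` into blocks of decreasing width: full packet, then half
// packet, then quarter packet (when available), and finally single rows.
// Assumes packet_size >= 1.
std::vector<RowBlock> partition_rows(std::size_t rows, std::size_t packet_size,
                                     bool has_half, bool has_quarter) {
  std::vector<RowBlock> blocks;
  std::size_t start = 0;

  // Consume as many steps of `width` rows as still fit.
  auto take = [&](std::size_t width) {
    std::size_t count = (rows - start) / width;
    if (count > 0) {
      blocks.push_back({start, width, count});
      start += count * width;
    }
  };

  take(packet_size);
  if (has_half && packet_size >= 2) take(packet_size / 2);
  if (has_quarter && packet_size >= 4) take(packet_size / 4);
  take(1);  // leftover rows are handled one at a time
  return blocks;
}
```

Under these assumptions, 23 rows with a 16-wide packet split into one full packet, one quarter packet (4 rows), and 3 single rows; without the fractional packets, the last 7 rows would all be handled row-by-row.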
This helps mitigate the huge slowdowns seen on AVX512 builds when
compared to AVX2 ones, for some dimensions. Gains top out at an extra 1x
in throughput. This patch is a complement to changeset
src
Cholesky
CholmodSupport
CMakeLists.txt
Core
Dense
Eigen
Eigenvalues
Geometry
Householder
IterativeLinearSolvers
Jacobi
KLUSupport
LU
MetisSupport
OrderingMethods
PardisoSupport
PaStiXSupport
QR
QtAlignedMalloc
Sparse
SparseCholesky
SparseCore
SparseLU
SparseQR
SPQRSupport
StdDeque
StdList
StdVector
SuperLUSupport
SVD
UmfPackSupport