eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-24 14:45:14 +08:00

Antonio Sanchez f85038b7f3 Fix excessive GEBP register spilling for 32-bit NEON.	2021-02-03 09:01:48 -08:00

Clang does a poor job of optimizing the GEBP microkernel on 32-bit ARM, leading to excessive 16-byte register spills, slowing down basic f32 matrix multiplication by approx 50%.

By specializing `gebp_traits`, we can eliminate the register spills. Volatile inline ASM both acts as a barrier to prevent reordering and enforces strict register use. In a simple f32 matrix multiply example, this modification reduces 16-byte spills from 109 instances to zero, leading to a 1.5x speed increase (search for `16-byte Spill` in the assembly in https://godbolt.org/z/chsPbE).

This is a replacement of !379. See there for further discussion.

Also moved `gebp_traits` specializations for NEON to `Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h` to be alongside other NEON-specific code.

Fixes #2138.
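The commit message above relies on volatile inline ASM to keep an accumulator in a register and to stop the compiler from reordering around it. The following is a minimal sketch of that general idea only, not the code from this commit: `keep_in_register` and `dot_pinned` are hypothetical names, the ASM body is deliberately empty, and the register constraints (`"+w"` on NEON targets, `"+x"` on SSE targets) assume a GCC- or Clang-compatible compiler. On other compilers the helper compiles to nothing.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Eigen). The empty volatile asm statement
// names `v` as a read-write operand that must live in a floating-point/SIMD
// register at this point. The compiler may not delete the statement, and the
// data dependency on `v` orders it after the preceding update of `v`.
static inline void keep_in_register(float& v) {
#if (defined(__GNUC__) || defined(__clang__)) && defined(__ARM_NEON)
  asm volatile("" : "+w"(v));  // "w": VFP/NEON register on ARM targets
#elif (defined(__GNUC__) || defined(__clang__)) && defined(__SSE__)
  asm volatile("" : "+x"(v));  // "x": SSE register on x86 targets
#else
  (void)v;                     // unknown toolchain: no-op fallback
#endif
}

// Hypothetical example kernel: a scalar dot product whose accumulator is
// pinned to a register after every multiply-add.
float dot_pinned(const float* a, const float* b, std::size_t n) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < n; ++i) {
    acc += a[i] * b[i];
    keep_in_register(acc);
  }
  return acc;
}
```

Whether a barrier like this actually reduces spills depends on the compiler, target, and surrounding code, so the generated assembly should be inspected, as the commit does with its `16-byte Spill` search.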
src	Fix excessive GEBP register spilling for 32-bit NEON.	2021-02-03 09:01:48 -08:00
Cholesky
CholmodSupport
Core	Fix excessive GEBP register spilling for 32-bit NEON.	2021-02-03 09:01:48 -08:00
Dense
Eigen
Eigenvalues
Geometry	1)provide a better generic paddsub op implementation	2021-01-13 22:54:03 +00:00
Householder
IterativeLinearSolvers
Jacobi
KLUSupport
LU	Unify Inverse_SSE.h and Inverse_NEON.h into a single generic implementation using PacketMath.	2020-11-17 12:27:01 +00:00
MetisSupport
OrderingMethods
PardisoSupport
PaStiXSupport
QR
QtAlignedMalloc
Sparse
SparseCholesky
SparseCore
SparseLU
SparseQR
SPQRSupport
StdDeque
StdList
StdVector
SuperLUSupport
SVD
UmfPackSupport