Antonio Sanchez f85038b7f3 Fix excessive GEBP register spilling for 32-bit NEON.
Clang does a poor job of optimizing the GEBP microkernel on 32-bit ARM,
leading to excessive 16-byte register spills that slow down basic f32
matrix multiplication by approximately 50%.

By specializing `gebp_traits`, we can eliminate the register spills.
Volatile inline ASM both acts as a barrier to prevent reordering and
enforces strict register use. In a simple f32 matrix multiply example,
this modification reduces 16-byte spills from 109 instances to zero,
leading to a 1.5x speed increase (search for `16-byte Spill` in the
assembly in https://godbolt.org/z/chsPbE).
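
To illustrate the general idea (not the actual patch), here is a minimal
sketch of using volatile inline ASM to keep accumulators in registers. It
assumes a GCC- or Clang-compatible compiler. `PIN_IN_REGISTER` and
`micro_kernel_1x4` are hypothetical names invented for this example and are
not part of Eigen. The real specialization works on NEON packets inside
`gebp_traits`; this scalar stand-in only shows the barrier pattern.

```cpp
#include <cstddef>

// Hypothetical helper (not Eigen's API): an empty volatile asm statement that
// takes `x` as a read-write operand constrained to a floating-point/SIMD
// register. The compiler must have `x` in such a register at this point and
// may not assume its value is unchanged afterwards, so the statement acts as
// an ordering barrier for that value.
#if defined(__GNUC__) && defined(__SSE__)
#define PIN_IN_REGISTER(x) __asm__ __volatile__("" : "+x"(x))  // x86 SSE register
#elif defined(__GNUC__) && defined(__ARM_NEON)
#define PIN_IN_REGISTER(x) __asm__ __volatile__("" : "+w"(x))  // ARM FP/SIMD register
#else
#define PIN_IN_REGISTER(x) ((void)(x))  // fallback: no barrier, plain C++
#endif

// Scalar stand-in for one row of a GEBP-style micro-kernel:
//   out[j] = sum over k of a[k] * b[4 * k + j], for j = 0..3.
// The four accumulators are pinned after every multiply-add so the compiler
// keeps them live in registers across the loop body.
inline void micro_kernel_1x4(const float* a, const float* b, std::size_t depth,
                             float out[4]) {
  float c0 = 0.0f, c1 = 0.0f, c2 = 0.0f, c3 = 0.0f;
  for (std::size_t k = 0; k < depth; ++k) {
    const float ak = a[k];
    const float* bk = b + 4 * k;
    c0 += ak * bk[0]; PIN_IN_REGISTER(c0);
    c1 += ak * bk[1]; PIN_IN_REGISTER(c1);
    c2 += ak * bk[2]; PIN_IN_REGISTER(c2);
    c3 += ak * bk[3]; PIN_IN_REGISTER(c3);
  }
  out[0] = c0; out[1] = c1; out[2] = c2; out[3] = c3;
}
```

Calling `micro_kernel_1x4(a, b, 3, out)` with `a = {1, 2, 3}` and `b` holding
the rows `{1, 2, 3, 4}`, `{5, 6, 7, 8}`, `{9, 10, 11, 12}` computes the same
sums as an unpinned loop. Whether the barrier actually removes spills depends
on the compiler and target, so the generated assembly should be inspected.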

This replaces !379; see that merge request for further discussion.

Also moved `gebp_traits` specializations for NEON to
`Eigen/src/Core/arch/NEON/GeneralBlockPanelKernel.h` to be alongside
other NEON-specific code.

Fixes #2138.
2021-02-03 09:01:48 -08:00
bench Fix #1911: add benchmark for move semantics with fixed-size matrix 2020-06-11 23:43:25 +00:00
blas Replace language_support module with builtin CheckLanguage 2021-01-27 13:26:40 +00:00
ci Add CI configuration for ppc64le 2020-09-22 00:26:23 +00:00
cmake Replace language_support module with builtin CheckLanguage 2021-01-27 13:26:40 +00:00
debug MIsc. source and comment typos 2018-03-11 10:01:44 -04:00
demos Make file formatting comply with POSIX and Unix standards 2020-03-23 18:09:02 +00:00
doc Fix typo in doc 2020-11-30 10:53:29 +00:00
Eigen Fix excessive GEBP register spilling for 32-bit NEON. 2021-02-03 09:01:48 -08:00
failtest Make file formatting comply with POSIX and Unix standards 2020-03-23 18:09:02 +00:00
lapack Replace language_support module with builtin CheckLanguage 2021-01-27 13:26:40 +00:00
scripts Replace calls to "hg" by calls to "git" 2019-12-04 11:24:06 +01:00
test Allow for negative strides. 2021-01-27 23:32:12 +01:00
unsupported Include <cstdint> in one place, remove custom typedefs 2021-01-26 14:23:05 -08:00
.gitignore New CI infrastructure, including AArch64 runners 2020-09-11 18:11:49 +00:00
.gitlab-ci.yml New CI infrastructure, including AArch64 runners 2020-09-11 18:11:49 +00:00
.hgeol Added a pattern which forces LF line endings for *.sh files. 2013-07-31 18:20:58 +02:00
CMakeLists.txt Remove code checking for CMake < 3.5 2020-12-14 09:57:44 +00:00
COPYING.APACHE Add Apache 2.0 license text in COPYING.APACHE. 2020-06-18 12:45:27 -07:00
COPYING.BSD Make file formatting comply with POSIX and Unix standards 2020-03-23 18:09:02 +00:00
COPYING.GPL
COPYING.LGPL
COPYING.MINPACK Make file formatting comply with POSIX and Unix standards 2020-03-23 18:09:02 +00:00
COPYING.MPL2
COPYING.README
CTestConfig.cmake STYLE: Convert CMake-language commands to lower case 2019-10-31 11:36:37 -05:00
CTestCustom.cmake.in Allow to filter out build-error messages 2018-07-24 20:12:49 +02:00
eigen3.pc.in Further fixes for CMAKE_INSTALL_PREFIX correctness 2015-11-07 21:29:24 -05:00
INSTALL
README.md Update old links to bitbucket to point to gitlab.com 2019-12-04 10:57:07 +01:00
signature_of_eigen3_matrix_library

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.

For pull requests, bug reports, and feature requests, go to https://gitlab.com/libeigen/eigen.