This change also adds checks for a non-increasing diagonal in R11 to the existing unit tests, and adds a new unit test with the Kahan matrix, which consistently fails with the original code.
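For readers unfamiliar with the test case, here is a minimal sketch of the Kahan matrix and of a non-increasing-diagonal check. It uses the textbook definition K = diag(1, s, ..., s^(n-1)) * T, with T unit upper triangular and -c above the diagonal; the parameters used by the actual unit test may differ. `Dense`, `makeKahan` and `diagonalIsNonIncreasing` are hypothetical names for illustration, not part of Eigen.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Row-major dense n x n matrix, kept deliberately simple for illustration.
struct Dense {
  std::size_t n;
  std::vector<double> a;
  double& operator()(std::size_t i, std::size_t j) { return a[i * n + j]; }
  double operator()(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

// Kahan matrix: row i is scaled by s^i, has 1 on the diagonal and -c to the
// right of it, with c = cos(theta) and s = sin(theta).
Dense makeKahan(std::size_t n, double theta) {
  assert(n > 0);
  const double c = std::cos(theta);
  const double s = std::sin(theta);
  Dense k{n, std::vector<double>(n * n, 0.0)};
  double scale = 1.0;  // holds s^i for the current row i
  for (std::size_t i = 0; i < n; ++i) {
    k(i, i) = scale;
    for (std::size_t j = i + 1; j < n; ++j) k(i, j) = -c * scale;
    scale *= s;
  }
  return k;
}

// True if |R(0,0)| >= |R(1,1)| >= ... over the leading `rank` diagonal entries.
bool diagonalIsNonIncreasing(const Dense& r, std::size_t rank) {
  assert(rank <= r.n);
  for (std::size_t i = 1; i < rank; ++i) {
    if (std::abs(r(i, i)) > std::abs(r(i - 1, i - 1))) return false;
  }
  return true;
}
```

In a real test the check would be applied to the R factor returned by the decomposition rather than to the input matrix.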
Benchmark timings on Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz. Code compiled with AVX & FMA. I just ran on square matrices of 3 different sizes.
Before:
Benchmark                 Time(ns)      CPU(ns)  Iterations
------------------------------------------------------------
BM_EigencolPivQR/64          53677        53627       12890
BM_EigencolPivQR/512      15265408     15250784          46
BM_EigencolPivQR/4k    15403556228  15388788368           2

After (non-vectorized version):
Benchmark                 Time(ns)      CPU(ns)  Iterations  Degradation
-------------------------------------------------------------------------
BM_EigencolPivQR/64          63736        63669       10844        18.5%
BM_EigencolPivQR/512      16052546     16037381          43         5.1%
BM_EigencolPivQR/4k    15149263620  15132025316           2        -2.0%
Performance-wise there seems to be a ~18.5% degradation for small (64x64) matrices, probably due to the cost of more O(min(m,n)^2) sqrt operations that are not needed for the unstable formula.
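To illustrate where the extra sqrt calls could come from, here is a sketch of two ways to downdate a column norm after one component has been eliminated. The exact formulas used before and after the patch are not shown in this message, so both functions (and their names) are assumptions modeled on the usual textbook and LAPACK-style updates.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Squared-norm downdate: no sqrt per step, but the subtraction can cancel
// catastrophically when |r| is close to the current norm.
double downdateSquared(double normSq, double r) {
  return normSq - r * r;
}

// Norm downdate with one sqrt per call: scales the current norm by
// sqrt(1 - (r/norm)^2), with the radicand clamped at zero.
double downdateNorm(double norm, double r) {
  assert(norm > 0.0);
  const double ratio = std::abs(r) / norm;
  const double t = std::max(0.0, (1.0 + ratio) * (1.0 - ratio));
  return norm * std::sqrt(t);
}
```

A careful implementation would also recompute the norm from scratch whenever the radicand becomes too small relative to the last explicitly computed norm; that safeguard is omitted here for brevity.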
- remove most of the metaprogramming kung fu in MathFunctions.h (only keep functions that differ from the std)
- remove the overloads for array expression that were in the std namespace
Renamed meta_{true|false} to {true|false}_type, meta_if to conditional, is_same_type to is_same, un{ref|pointer|const} to remove_{reference|pointer|const} and makeconst to add_const.
Changed boolean type 'ret' member to 'value'.
Changed 'ret' members referring to types to 'type'.
Adapted all code occurrences.
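The new names mirror the standard library's conventions. The sketch below uses the `std` traits from `<type_traits>` as stand-ins, on the assumption that Eigen's own helpers now behave the same way; `old_is_same_type` is a hypothetical reconstruction of the old 'ret' convention, not the actual Eigen code.

```cpp
#include <type_traits>

// Old-style trait (pre-rename convention): boolean result stored in `ret`.
template <typename T, typename U>
struct old_is_same_type { enum { ret = 0 }; };
template <typename T>
struct old_is_same_type<T, T> { enum { ret = 1 }; };

// New-style usage follows the std naming: `value` for booleans, `type` for types.
using Picked =
    std::conditional<std::is_same<int, int>::value, float, double>::type;
using Stripped =
    std::remove_const<std::remove_reference<const int&>::type>::type;
using Consted = std::add_const<int>::type;

static_assert(old_is_same_type<int, int>::ret == 1, "old convention");
static_assert(std::is_same<Picked, float>::value, "conditional picks first branch");
static_assert(std::is_same<Stripped, int>::value, "reference, then const removed");
static_assert(std::is_same<Consted, const int>::value, "const added");
```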
- Updated unit tests to check above constructor.
- In the compute() method of decompositions: Made temporary matrices/vectors class members to avoid heap allocations during compute() (when dynamic matrices are used, of course).
These changes can speed up decomposition computation time when a solver instance is used to solve multiple same-sized problems. An added benefit is that the compute() method can now be invoked in contexts where heap allocations are forbidden, such as in real-time control loops.
CAVEAT: Not all of the decompositions in the Eigenvalues module have a heap-allocation-free compute() method. A future patch may address this issue, but some required API changes need to be incorporated first.
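The pattern can be sketched without Eigen as follows. `ColumnSums` is a hypothetical toy class, not one of the decompositions; it only shows how keeping the temporary as a member lets repeated same-sized compute() calls reuse one buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical solver that keeps its scratch buffer as a class member.
class ColumnSums {
 public:
  // Allocate the workspace once, sized for problems with `cols` columns.
  explicit ColumnSums(std::size_t cols) : m_temp(cols) {}

  // `a` is a rows x cols row-major matrix. When cols matches the preallocated
  // size, resize() is a no-op and no heap allocation takes place.
  void compute(const std::vector<double>& a, std::size_t rows, std::size_t cols) {
    assert(a.size() == rows * cols);
    m_temp.resize(cols);
    std::fill(m_temp.begin(), m_temp.end(), 0.0);
    for (std::size_t i = 0; i < rows; ++i) {
      for (std::size_t j = 0; j < cols; ++j) m_temp[j] += a[i * cols + j];
    }
  }

  const std::vector<double>& result() const { return m_temp; }
  const double* workspace() const { return m_temp.data(); }

 private:
  std::vector<double> m_temp;  // reused across compute() calls
};
```

Constructing the object once outside a loop and calling compute() inside it keeps the allocation out of the hot path, which is the property the real-time use case depends on.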
Finally the createRandomMatrixOfRank() function is renamed to createRandomPIMatrixOfRank, where PI stands for 'partial isometry', that is, a matrix whose singular values are 0 or 1.
(fixes lu test failures when testing solve())
* LU test: set an appropriate threshold and limit the number of times that an especially tricky test
is run. (fixes lu test failures when testing rank()).
* Tests: rename createRandomMatrixOfRank to createRandomProjectionOfRank
* be aware of number of actual householder vectors
(optimization in non-full-rank case, no behavior change)
* fix applyThisOnTheRight, it was using k instead of actual_k
* QR: rename matrixQ() to householderQ() where applicable
* renaming, e.g. LU ---> FullPivLU
* split tests framework: more robust, e.g. don't generate empty tests if a number is skipped
* make all remaining tests use that splitting, as needed.
* Fix 4x4 inversion (see stable branch)
* Transform::inverse() and geo_transform test : adapt to new inverse() API, it was also trying to instantiate inverse() for 3x4 matrices.
* CMakeLists: more robust regexp to parse the version number
* misc fixes in unit tests
For ColPivHouseholderQR, that was just a matter of changing MatrixQType to MatrixType in the instantiation of HouseholderSequence.
For HouseholderQR, I also re-ported the solve method from ColPivHouseholderQR, as there were multiple issues.