replaced _mm_prefetch in GeneralBlockPanelKernel.h, with ei_prefetch() inline function.
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
Implemented NEON and AltiVec versions, copied SSE version over from GeneralBlockPanelKernel.h.
Also in GCC case (or rather !_MSC_VER) it's implemented using __builtin_prefetch().
NEON managed to give a small but welcome boost, 0.88GFLOPS -> 0.91GFLOPS.
* get rid of BlockReturnType: it was not needed, and code was not always using it consistently anyway
* add topRows(), leftCols(), bottomRows(), rightCols()
* add corners unit-test covering all of that
* adapt docs, expand "porting from eigen 2 to 3"
* adapt Eigen2Support
- Updated unit tests to check above constructor.
- In the compute() method of decompositions: Made temporary matrices/vectors class members to avoid heap allocations during compute() (when dynamic matrices are used, of course).
These changes can speed up decomposition computation time when a solver instance is used to solve multiple same-sized problems. An added benefit is that the compute() method can now be invoked in contexts were heap allocations are forbidden, such as in real-time control loops.
CAVEAT: Not all of the decompositions in the Eigenvalues module have a heap-allocation-free compute() method. A future patch may address this issue, but some required API changes need to be incorporated first.
* adapt Eigenvalues module to the new rule that the RowMajorBit must have the proper value for vectors
* Fix RowMajorBit in ei_traits<ProductBase>
* Fix vectorizability logic in CoeffBasedProduct
* Introduction of strides-at-compile-time so for example the optimized code really knows when it needs to evaluate to a temporary
* StorageKind / XprKind
* Quaternion::setFromTwoVectors: use JacobiSVD instead of SVD
* ComplexSchur: support the 1x1 case