Commit aa110e681b
optimised the multiplication of small dyanmically
sized matrices by restricting the packet size to a maximum of 4, increasing
the chances that SIMD instructions are used in the computation. However, it
introduced a mismatch between the packet size and the requestedAlignment. This
mismatch can lead to crashes when the destination is not aligned. This patch
fixes the issue by ensuring that the AssignmentTraits are correctly computed
when using a restricted packet size.
* * *
Bind LinearPacketType to MaxPacketSize
This commit applies any packet size limit specified when instantiating
copy_using_evaluator_traits to the LinearPacketType, providing that the
size of the destination is not known at compile time.
* * *
Add unit test for restricted packet assignment
A new unit test is added to check that multiplication of small dynamically
sized matrices works correctly when the packet size is restricted to 4 and
the destination is unaligned.
This provide several advantages:
- more flexibility in designing unit tests
- unit tests can be glued to speed up compilation
- unit tests are compiled with same predefined macros, which is a requirement for zapcc
* Make product EvalAtOnce in cases OuterProduct, GemmProduct and
GemvProduct
* Ensure that product evaluators are nested inside EvalToTemp
evaluator
* As temporary kludge, evaluate expression to temporary in AllAtOnce
traversal and pass expression operator to evalTo()
* Rename the old copy_using_evaluators to noalias_copy_using_evaluators.
* Write a new copy_using_evaluators which strips NoAlias expression, if present,
and calls noalias_copy_using_evaluators; in future, it will also take care of
aliasing in products.
* Add expression() getter to NoAlias.
* Copy implementation from CoeffBasedProduct.
* Copy implementation from GeneralProduct in InnerProduct case.
* For GeneralProduct in other cases, call the evalTo() member function with
expression objects in constructor of evaluator.