* Use them to write an unrolled path in echelon.cpp, as an
experiment before I do this LU module.
* For floating-point types, make ei_random() use an amplitude
of 1.
using a macro and _Pragma.
- use OpenMP also in cacheOptimalProduct and in the
vectorized paths as well
- kill the vector assignment unroller. implement in
operator= the logic for assigning a row-vector in
a col-vector.
- CMakeLists support for building tests/examples
with -fopenmp and/or -msse2
- updates in bench/, especially replace identity()
by ones() which prevents underflows from perturbing
bench results.