all per plot settings have been moved to a single file, go_mean now takes an
optional second argument "tiny" to generate plots for tiny matrices, and
output of comparison information wrt to previous benchs (if any).
- remove all invertibility checking, will be redundant with LU
- general case: adapt to matrix storage order for better perf
- size 4 case: handle corner cases without falling back to gen case.
- rationalize with selectors instead of compile time if
- add C-style computeInverse()
* update inverse test.
* in snippets, default cout precision to 3 decimal places
* add some cmake module from kdelibs to support btl with cmake 2.4
- removed the ugly X11 and PNG gnuplots terminals
- use enhanced postscript terminal
- use imagemagick to generate the png files (with compression)
- disable the fortran impl by default since it is as meaningless as a "C impl"
- update line settings
It basically performs 4 dot products at once reducing loads of the vector and improving
instructions scheduling. With 3 cache friendly algorithms, we now handle all product
configurations with outstanding perf for large matrices.
the modifications to initial code follow:
* changed build system from plain makefiles to cmake
* added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces
* added "transposed matrix * vector" product action
* updated blitz interface to use condensed products instead of hand coded loops
* removed some deprecated interfaces
* changed default storage order to column major for all libraries
* new generic bench timer strategy which is supposed to be more accurate
* various code clean-up