Commit Graph

13 Commits

Author SHA1 Message Date
Gael Guennebaud
93115619c2 * updated benchmark files according to recent renamings
* various improvements in BTL including trisolver and cholesky bench
2008-07-27 11:39:47 +00:00
Gael Guennebaud
8463b7d3f4 * fix compilation issue in Product
* added some tests for product and swap
* overload .swap() for dynamic-sized matrix of same size
2008-07-02 16:05:33 +00:00
Benoit Jacob
c905b31b42 * Big rework of Assign.h:
** Much better organization
** Fix a few bugs
** Add the ability to unroll only the inner loop
** Add an unrolled path to the Like1D vectorization. Not well tested.
** Add placeholder for sliced vectorization. Unimplemented.

* Rework of corrected_flags:
** improve rules determining vectorizability
** for vectors, the storage-order is indifferent, so we tweak it
   to allow vectorization of row-vectors.

* fix compilation in benchmark, and a warning in Transpose.
2008-06-16 10:49:44 +00:00
Benoit Jacob
8c6007f80e * Patch by Konstantinos Margaritis: AltiVec vectorization.
* Fix several warnings, temporarily disable determinant test.
2008-05-03 12:21:23 +00:00
Benoit Jacob
890a8de962 Make products always eval into expressions. Improves performance
in benchmark. Still not as fasts as explicit eval(), strangely.
2008-05-02 08:53:23 +00:00
Gael Guennebaud
1ec2d21ca5 Fixed a couple of issues introduced in previous commits.
Added a test for Triangular.
2008-04-26 20:28:27 +00:00
Benoit Jacob
2a86f052a5 - optimized determinant calculations for small matrices (size <= 4)
(only 30 muls for size 4)
- rework the matrix inversion: now using cofactor technique for size<=3,
  so the ugly unrolling is only used for size 4 anymore, and even there
  I'm looking to get rid of it.
2008-04-14 17:07:12 +00:00
Benoit Jacob
ab4046970b * Add fixed-size template versions of corner(), start(), end().
* Use them to write an unrolled path in echelon.cpp, as an
  experiment before I do this LU module.
* For floating-point types, make ei_random() use an amplitude
  of 1.
2008-04-12 17:37:27 +00:00
Benoit Jacob
dcebc46cdc - cleaner use of OpenMP (no code duplication anymore)
using a macro and _Pragma.
- use OpenMP also in cacheOptimalProduct and in the
  vectorized paths as well
- kill the vector assignment unroller. implement in
  operator= the logic for assigning a row-vector in
  a col-vector.
- CMakeLists support for building tests/examples
  with -fopenmp and/or -msse2
- updates in bench/, especially replace identity()
  by ones() which prevents underflows from perturbing
  bench results.
2008-04-11 14:28:42 +00:00
Benoit Jacob
9d8876ce82 * rename XprCopy -> Nested
* rename OperatorEquals -> Assign
* move Util.h and FwDecl.h to a util/ subdir
2008-04-10 09:01:28 +00:00
Benoit Jacob
371d302efb - merge ei_xpr_copy and ei_eval_if_needed_before_nesting
- make use of CoeffReadCost to determine when to unroll the loops,
  for now only in Product.h and in OperatorEquals.h
performance remains the same: generally still not as good as before the
big changes.
2008-04-06 18:01:03 +00:00
Benoit Jacob
fe569b060c get rid of MatrixRef, simplifications. 2008-03-13 20:36:01 +00:00
Gael Guennebaud
9d9d81ad71 * basic support for multicore CPU via a .evalOMP() which
internaly uses OpenMP if enabled at compile time.
 * added a bench/ folder with a couple benchmarks and benchmark tools.
2008-03-09 16:13:47 +00:00