Commit Graph

219 Commits

Author SHA1 Message Date
Gael Guennebaud
b970a9c8aa trivial fix in EulerAngles constructor 2008-07-15 22:42:55 +00:00
Gael Guennebaud
99a625243f Optimization: added super efficient rowmajor * vector product (and vector * colmajor).
It basically performs 4 dot products at once reducing loads of the vector and improving
instructions scheduling. With 3 cache friendly algorithms, we now handle all product
configurations with outstanding perf for large matrices.
2008-07-13 01:22:54 +00:00
Gael Guennebaud
861d18d553 * Optimization: added a specialization of Block for xpr with DirectAccessBit
* some simplifications and fixes in cache friendly products
2008-07-12 22:59:34 +00:00
Gael Guennebaud
b7bd1b3446 Add a *very efficient* evaluation path for both col-major matrix * vector
and vector * row-major products. Currently, it is enabled only is the matrix
has DirectAccessBit flag and the product is "large enough".
Added the respective unit tests in test/product/cpp.
2008-07-12 12:12:02 +00:00
Benoit Jacob
2b53fd4d53 some performance fixes in Assign.h reported by Gael. Some doc update in
Cwise.
2008-07-10 16:15:55 +00:00
Gael Guennebaud
c9b046d5d5 * added optimized paths for matrix-vector and vector-matrix products
(using either a cache friendly strategy or re-using dot-product
  vectorized implementation)
* add LinearAccessBit to Transpose
2008-07-09 22:30:18 +00:00
Gael Guennebaud
5f55ab524c * added a lazyAssign overload skipping .lazy() such that c = (<xpr>).lazy() such that
lazyAssign overloads of <xpr> are automatically called (this also reduces assign instansiations)
2008-07-09 13:54:21 +00:00
Gael Guennebaud
783eb6da9b I forgot that the previous commit needed minor changes outside the bench folder 2008-07-08 17:25:58 +00:00
Benoit Jacob
6f09d3a67d - many updates after Cwise change
- fix compilation in product.cpp with std::complex
- fix bug in MatrixBase::operator!=
2008-07-08 07:56:01 +00:00
Benoit Jacob
f5791eeb70 the big Array/Cwise rework as discussed on the mailing list. The new API
can be seen in Eigen/src/Core/Cwise.h.
2008-07-08 00:49:10 +00:00
Benoit Jacob
a9d319d44f * do the ActualPacketAccesBit change as discussed on list
* add comment in Product.h about CanVectorizeInner
* fix typo in test/product.cpp
2008-07-04 12:43:55 +00:00
Gael Guennebaud
8463b7d3f4 * fix compilation issue in Product
* added some tests for product and swap
* overload .swap() for dynamic-sized matrix of same size
2008-07-02 16:05:33 +00:00
Gael Guennebaud
9433df83a7 * resurected Flagged::_expression used to optimize m+=(a*b).lazy()
(equivalent to the GEMM blas routine)
* added a GEMM benchmark
2008-07-01 16:20:06 +00:00
Gael Guennebaud
37a50fa526 * added an in-place version of inverseProduct which
might be twice faster fot small fixed size matrix
* added a sparse triangular solver (sparse version
  of inverseProduct)
* various other improvements in the Sparse module
2008-06-29 21:29:12 +00:00
Gael Guennebaud
027818d739 * added innerSize / outerSize functions to MatrixBase
* added complete implementation of sparse matrix product
  (with a little glue in Eigen/Core)
* added an exhaustive bench of sparse products including GMM++ and MTL4
  => Eigen outperforms in all transposed/density configurations !
2008-06-28 23:07:14 +00:00
Benoit Jacob
55e08f7102 fix breakage from my last commit 2008-06-28 17:15:16 +00:00
Benoit Jacob
844f69e4a9 * update CMakeLists, only build instantiations if TEST_LIB is defined
* allow default Matrix constructor in dynamic size, defaulting to (1,
1), this is convenient in mandelbrot example.
2008-06-27 10:53:30 +00:00
Benoit Jacob
6de4871c8c fix a couple of issues in the new Map.h 2008-06-27 01:42:44 +00:00
Benoit Jacob
e27b2b95cf * rework Map, allow vectorization
* rework PacketMath and DummyPacketMath, make these actual template
specializations instead of just overriding by non-template inline
functions
* introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix
* remove Matrix::map() methods, use Map constructors instead.
2008-06-27 01:22:35 +00:00
Gael Guennebaud
e5d301dc96 various work on the Sparse module:
* added some glue to Eigen/Core (SparseBit, ei_eval, Matrix)
* add two new sparse matrix types:
   HashMatrix: based on std::map (for random writes)
   LinkedVectorMatrix: array of linked vectors
   (for outer coherent writes, e.g. to transpose a matrix)
* add a SparseSetter class to easily set/update any kind of matrices, e.g.:
   { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix);
     for (...) wrapper->coeffRef(rand(),rand()) = rand(); }
* automatic shallow copy for RValue
* and a lot of mess !
plus:
* remove the remaining ArrayBit related stuff
* don't use alloca in product for very large memory allocation
2008-06-26 23:22:26 +00:00
Benoit Jacob
c5bd1703cb change derived classes methods from "private:_method()"
to "public:method()" i.e. reimplementing the generic method()
from MatrixBase.
improves compilation speed by 7%, reduces almost by half the call depth
of trivial functions, making gcc errors and application backtraces
nicer...
2008-06-26 20:08:16 +00:00
Benoit Jacob
25ba9f377c * add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned)
* introduce packet(int), make use of it in linear vectorized paths
  --> completely fixes the slowdown noticed in benchVecAdd.
* generalize coeff(int) to linear-access xprs
* clarify the access flag bits
* rework api dox in Coeffs.h and util/Constants.h
* improve certain expressions's flags, allowing more vectorization
* fix bug in Block: start(int) and end(int) returned dyn*dyn size
* fix bug in Block: just because the Eval type has packet access
  doesn't imply the block xpr should have it too.
2008-06-26 16:06:41 +00:00
Benoit Jacob
5b0da4b778 make use of ei_pmadd in dot-product: will further improve performance
on architectures having a packed-mul-add assembly instruction.
2008-06-24 18:08:35 +00:00
Benoit Jacob
3b94436d2f * vectorize dot product, copying code from sum.
* make the conj functor vectorizable: it is just identity in real case,
  and complex doesn't use the vectorized path anyway.
* fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size)
  should not be vectorizable, since in fixed-size we are assuming
  the size to be a multiple of packet size. (Or would you prefer
  Vector3d to be flagged "packetaccess" even though no packet access
  is possible on vectors of that type?)
* rename:
  isOrtho for vectors ---> isOrthogonal
  isOrtho for matrices ---> isUnitary
* add normalize()
* reimplement normalized with quotient1 functor
2008-06-24 15:13:00 +00:00
Benoit Jacob
c9560df4a0 * add ei_pdiv intrinsic, make quotient functor vectorizable
* add vdw benchmark from Tim's real-world use case
2008-06-23 22:00:18 +00:00
Gael Guennebaud
ac9aa47bbc optimize linear vectorization both in Assign and Sum (optimal amortized perf) 2008-06-23 15:50:28 +00:00
Gael Guennebaud
ea1990ef3d add experimental code for sparse matrix:
- uses the common "Compressed Column Storage" scheme
 - supports every unary and binary operators with xpr template
   assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse
   matrix doesnot make sense)
 - this is the first commit, so of course, there are still several shorcommings !
2008-06-23 13:25:22 +00:00
Benoit Jacob
03d19f3bae quick temporary fix for a perf issue we just identified with
vectorization....
now the sum benchmark runs 3x faster with vectorization than without.
2008-06-23 11:23:05 +00:00
Benoit Jacob
dc9206cec5 split sum away from redux and vectorize it.
(could come back to redux after it has been vectorized,
and could serve as a starting point for that)
also make the abs2 functor vectorizable (for real types).
2008-06-23 10:32:48 +00:00
Benoit Jacob
8a967fb17c * implement slice vectorization. Because it uses unaligned
packet access, it is not certain that it will bring a performance
  improvement: benchmarking needed.
* improve logic choosing slice vectorization.
* fix typo in SSE packet math, causing crash in unaligned case.
* fix bug in Product, causing crash in unaligned case.
* add TEST_SSE3 CMake option.
2008-06-22 15:02:05 +00:00
Gael Guennebaud
32c5ea388e work on rotations in the Geometry module:
- convertions are done trough constructors and operator=
 - added a EulerAngles class
2008-06-21 15:01:49 +00:00
Benoit Jacob
574416b842 Override MatrixBase::eval() since matrices don't need
to be evaluated, it is enough to just read them.
2008-06-20 15:26:39 +00:00
Gael Guennebaud
54238961d6 * added a pseudo expression Array giving access to:
- matrix-scalar addition/subtraction operators, e.g.:
       m.array() += 0.5;
   - matrix/matrix comparison operators, e.g.:
      if (m1.array() < m2.array()) {}
* fix compilation issues with Transform and gcc < 4.1
2008-06-20 12:38:03 +00:00
Gael Guennebaud
e735692e37 move "enum" back to "const int" int ei_assign_impl: in fact, casting
enums to int is enough to get compile time constants with ICC.
2008-06-20 07:10:50 +00:00
Gael Guennebaud
fb4a151982 * more cleaning in Product
* make Matrix2f (and similar) vectorized using linear path
* fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4
  (cannot get Transform compiles with gcc 3.3/3.4, see the FIXME)
2008-06-19 23:00:51 +00:00
Gael Guennebaud
82c3cea1d5 * refactoring of Product:
* use ProductReturnType<>::Type to get the correct Product xpr type
  * Product is no longer instanciated for xpr types which are evaluated
  * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix)
  * some cleanning
* removed ArrayBase
2008-06-19 17:33:57 +00:00
Gael Guennebaud
5dbfed1902 fix two bugs dicovered by the previous commit. 2008-06-16 16:39:58 +00:00
Benoit Jacob
bb1f4e44f1 * Block: row and column expressions in the inner direction
now have the Like1D flag.

* Big renaming:
  packetCoeff ---> packet
  VectorizableBit ---> PacketAccessBit
  Like1DArrayBit ---> LinearAccessBit
2008-06-16 14:54:31 +00:00
Benoit Jacob
9857764ae7 aaargh. 2008-06-16 11:20:29 +00:00
Benoit Jacob
478bfaf228 fix bug in computation of unrolling limit: div instead of mul 2008-06-16 11:18:59 +00:00
Benoit Jacob
c905b31b42 * Big rework of Assign.h:
** Much better organization
** Fix a few bugs
** Add the ability to unroll only the inner loop
** Add an unrolled path to the Like1D vectorization. Not well tested.
** Add placeholder for sliced vectorization. Unimplemented.

* Rework of corrected_flags:
** improve rules determining vectorizability
** for vectors, the storage-order is indifferent, so we tweak it
   to allow vectorization of row-vectors.

* fix compilation in benchmark, and a warning in Transpose.
2008-06-16 10:49:44 +00:00
Gael Guennebaud
bc0c7c57ed Added an extensible mechanism to support any kind of rotation
representation in Transform via the template static class
ToRotationMatrix.
Added a lightweight AngleAxis class (similar to Rotation2D).
2008-06-15 17:22:41 +00:00
Gael Guennebaud
0ee6b08128 * split Product to a DiagonalProduct template specialization
to optimize matrix-diag and diag-matrix products without
  making Product over complicated.
* compilation fixes in Tridiagonalization and HessenbergDecomposition
  in the case of 2x2 matrices.
* added an Orientation2D small class with similar interface than Quaternion
  (used by Transform to handle 2D and 3D orientations seamlessly)
* added a couple of features in Transform.
2008-06-15 11:54:18 +00:00
Gael Guennebaud
fbbd8afe30 Started a Transform class in the Geometry module to represent
homography.
Fix indentation in Quaternion.h
2008-06-15 08:33:44 +00:00
Gael Guennebaud
4af7089ab8 * Added a generalized eigen solver for the selfadjoint case.
(as new members to SelfAdjointEigenSolver)
  The QR module now depends on Cholesky.
* Fix Transpose to correctly preserve the *TriangularBit.
2008-06-14 19:42:12 +00:00
Gael Guennebaud
f07f907810 Add QR and Cholesky module instantiations in the lib.
To try it with the unit tests set the cmake variable TEST_LIB to ON.
2008-06-14 13:02:41 +00:00
Benoit Jacob
53289a8b64 * even though the _Flags default to the corrected value, still correct
them in the ei_traits, so that they're guaranteed even if the user
  specified his own non-default flags (like before).

  Measured to not make compilation any slower.
2008-06-13 08:09:48 +00:00
Benoit Jacob
c90c77051f * make the _Flags template parameter of Matrix default to the corrected
flags. This ensures that unless explicitly messed up otherwise,
  a Matrix type is equal to its own Eval type. This seriously reduces
  the number of types instantiated. Measured +13% compile speed, -7%
  binary size.

* Improve doc of Matrix template parameters.
2008-06-13 07:53:45 +00:00
Gael Guennebaud
e3fac69f19 Added a Hessenberg decomposition class for both real and complex matrices.
This is the first step towards a non-selfadjoint eigen solver.
Notes:
 - We might consider merging Tridiagonalization and Hessenberg toghether ?
 - Or we could factorize some code into a Householder class (could also be shared with QR)
2008-06-08 15:03:23 +00:00
Gael Guennebaud
4dd57b585d * rewrite of the QR decomposition:
- works for complex
  - allows direct access to the matrix R
* removed the scale by the matrix dimensions in MatrixBase::isMuchSmallerThan(scalar)
2008-06-07 22:47:11 +00:00