Commit Graph

241 Commits

Author SHA1 Message Date
Gael Guennebaud
6d11a07e5e Added a ei_palign function align a packet from two others.
This allows much faster code dealing with unligned as
in the updated matrix-vector product functions.
2008-08-03 15:15:46 +00:00
Gael Guennebaud
55aeb1f83a Optimizations:
* faster matrix-matrix and matrix-vector products (especially for not aligned cases)
 * faster tridiagonalization (make it using our matrix-vector impl.)
Others:
 * fix Flags of Map
 * split the test_product to two smaller ones
2008-08-01 23:44:59 +00:00
Gael Guennebaud
b32b186c14 removed the packet specializations of some functors
(GCC generates better code without those "optimizations")
2008-07-31 21:03:11 +00:00
Gael Guennebaud
842c4f8bfa Several compilation fixes for MSVC and NVCC, basically:
- added explicit enum to int conversion where needed
- if a function is not defined as declared and the return type is "tricky"
  then the type must be typedefined somewhere. A "tricky return type" can be:
  * a template class with a default parameter which depends on another template parameter
  * a nested template class, or type of a nested template class
2008-07-29 16:33:07 +00:00
Gael Guennebaud
44d95e0540 fix some internal asserts in CacheFrinedlyProduct 2008-07-27 22:14:08 +00:00
Gael Guennebaud
02a7efa910 forgot to include this file in previous commit 2008-07-27 14:24:32 +00:00
Gael Guennebaud
e9e5261664 Fix a couple issues introduced in the previous commit:
* removed DirectAccessBit from Part
* use a template specialization in inverseProduct() to transform a Part xpr to a Flagged xpr
2008-07-26 23:05:44 +00:00
Gael Guennebaud
e77ccf2928 * Rewrite the triangular solver so that we can take advantage of our efficient matrix-vector products:
=> up to 6 times faster !
* Added DirectAccessBit to Part
* Added an exemple of a cwise operator
* Renamed perpendicular() => someOrthogonal() (geometry module)
* Fix a weired bug in ei_constant_functor: the default copy constructor did not copy
  the imaginary part when the single member of the class is a complex...
2008-07-26 20:40:29 +00:00
Gael Guennebaud
2940617e6f bugfix in some internal asserts of CacheFriendlyProduct 2008-07-26 12:26:27 +00:00
Benoit Jacob
f997a3e902 update the inverse test a little
make use of static asserts in Map
fix 2 warnings in CacheFriendlyProduct: unused var 'Vectorized'
2008-07-26 12:08:28 +00:00
Gael Guennebaud
b466c266a0 * Fix some complex alignment issues in the cache friendly matrix-vector products.
* Minor update of the cores of the Cholesky algorithms to make them more friendly
  wrt to matrix-vector products => speedup x5 !
2008-07-23 17:30:00 +00:00
Gael Guennebaud
172000aaeb Add .perpendicular() function in Geometry module (adapted from Eigen1)
Documentation:
 * add an overview for each module.
 * add an example for .all() and Cwise::operator<
2008-07-22 10:54:42 +00:00
Gael Guennebaud
516db2c3b9 Fix compilation issues with icc and g++ < 4.1. Those include:
- conflicts with operator * overloads
 - discard the use of ei_pdiv for interger
   (g++ handles operators on __m128* types, this is why it worked)
 - weird behavior of icc in fixed size Block() constructor complaining
   the initializer of m_blockRows and m_blockCols were missing while
   we are in fixed size (maybe this hide deeper problem since this is a
   recent one, but icc gives only little feedback)
2008-07-21 12:40:56 +00:00
Gael Guennebaud
c10f069b6b * Merge Extract and Part to the Part expression.
Renamed "MatrixBase::extract() const" to "MatrixBase::part() const"
* Renamed static functions identity, zero, ones, random with an upper case
  first letter: Identity, Zero, Ones and Random.
2008-07-21 00:34:46 +00:00
Gael Guennebaud
ce425d92f1 Various documentation improvements, in particualr in Cholesky and Geometry module.
Added doxygen groups for Matrix typedefs and the Geometry module
2008-07-20 15:18:54 +00:00
Gael Guennebaud
269f683902 Add cholesky's members to MatrixBase
Various documentation improvements including new snippets (AngleAxis and Cholesky)
2008-07-19 22:59:05 +00:00
Gael Guennebaud
6e2c53e056 Added an automatically generated list of selected examples in the documentation.
Added the custom gemetry_module tag, and use it.
2008-07-19 20:36:41 +00:00
Gael Guennebaud
05ad083467 Added MatrixBase::Unit*() static function to easily create unit/basis vectors.
Removed EulerAngles, addes typdefs for Quaternion and AngleAxis,
and added automatic conversions from Quaternion/AngleAxis to Matrix3 such that:
 Matrix3f m = AngleAxisf(0.2,Vector3f::UnitX) * AngleAxisf(0.2,Vector3f::UnitY);
just works.
2008-07-19 13:03:23 +00:00
Gael Guennebaud
7245c63067 Complete rewrite of partial reduction according to mailing list discussions. 2008-07-19 11:36:32 +00:00
Benoit Jacob
8b4945a5a2 add some static asserts, use them, fix gcc 4.3 warning in Product.h. 2008-07-19 00:25:41 +00:00
Gael Guennebaud
22a816ade8 * Fix a couple of issues related to the recent cache friendly products
* Improve the efficiency of matrix*vector in unaligned cases
* Trivial fixes in the destructors of MatrixStorage
* Removed the matrixNorm in test/product.cpp (twice faster and
  that assumed the matrix product was ok while checking that !!)
2008-07-19 00:09:01 +00:00
Benoit Jacob
62ec1dd616 * big rework of Inverse.h:
- remove all invertibility checking, will be redundant with LU
  - general case: adapt to matrix storage order for better perf
  - size 4 case: handle corner cases without falling back to gen case.
  - rationalize with selectors instead of compile time if
  - add C-style computeInverse()
* update inverse test.
* in snippets, default cout precision to 3 decimal places
* add some cmake module from kdelibs to support btl with cmake 2.4
2008-07-15 23:56:17 +00:00
Gael Guennebaud
b970a9c8aa trivial fix in EulerAngles constructor 2008-07-15 22:42:55 +00:00
Gael Guennebaud
99a625243f Optimization: added super efficient rowmajor * vector product (and vector * colmajor).
It basically performs 4 dot products at once reducing loads of the vector and improving
instructions scheduling. With 3 cache friendly algorithms, we now handle all product
configurations with outstanding perf for large matrices.
2008-07-13 01:22:54 +00:00
Gael Guennebaud
861d18d553 * Optimization: added a specialization of Block for xpr with DirectAccessBit
* some simplifications and fixes in cache friendly products
2008-07-12 22:59:34 +00:00
Gael Guennebaud
b7bd1b3446 Add a *very efficient* evaluation path for both col-major matrix * vector
and vector * row-major products. Currently, it is enabled only is the matrix
has DirectAccessBit flag and the product is "large enough".
Added the respective unit tests in test/product/cpp.
2008-07-12 12:12:02 +00:00
Benoit Jacob
2b53fd4d53 some performance fixes in Assign.h reported by Gael. Some doc update in
Cwise.
2008-07-10 16:15:55 +00:00
Gael Guennebaud
c9b046d5d5 * added optimized paths for matrix-vector and vector-matrix products
(using either a cache friendly strategy or re-using dot-product
  vectorized implementation)
* add LinearAccessBit to Transpose
2008-07-09 22:30:18 +00:00
Gael Guennebaud
5f55ab524c * added a lazyAssign overload skipping .lazy() such that c = (<xpr>).lazy() such that
lazyAssign overloads of <xpr> are automatically called (this also reduces assign instansiations)
2008-07-09 13:54:21 +00:00
Gael Guennebaud
783eb6da9b I forgot that the previous commit needed minor changes outside the bench folder 2008-07-08 17:25:58 +00:00
Benoit Jacob
6f09d3a67d - many updates after Cwise change
- fix compilation in product.cpp with std::complex
- fix bug in MatrixBase::operator!=
2008-07-08 07:56:01 +00:00
Benoit Jacob
f5791eeb70 the big Array/Cwise rework as discussed on the mailing list. The new API
can be seen in Eigen/src/Core/Cwise.h.
2008-07-08 00:49:10 +00:00
Benoit Jacob
a9d319d44f * do the ActualPacketAccesBit change as discussed on list
* add comment in Product.h about CanVectorizeInner
* fix typo in test/product.cpp
2008-07-04 12:43:55 +00:00
Gael Guennebaud
8463b7d3f4 * fix compilation issue in Product
* added some tests for product and swap
* overload .swap() for dynamic-sized matrix of same size
2008-07-02 16:05:33 +00:00
Gael Guennebaud
9433df83a7 * resurected Flagged::_expression used to optimize m+=(a*b).lazy()
(equivalent to the GEMM blas routine)
* added a GEMM benchmark
2008-07-01 16:20:06 +00:00
Gael Guennebaud
37a50fa526 * added an in-place version of inverseProduct which
might be twice faster fot small fixed size matrix
* added a sparse triangular solver (sparse version
  of inverseProduct)
* various other improvements in the Sparse module
2008-06-29 21:29:12 +00:00
Gael Guennebaud
027818d739 * added innerSize / outerSize functions to MatrixBase
* added complete implementation of sparse matrix product
  (with a little glue in Eigen/Core)
* added an exhaustive bench of sparse products including GMM++ and MTL4
  => Eigen outperforms in all transposed/density configurations !
2008-06-28 23:07:14 +00:00
Benoit Jacob
55e08f7102 fix breakage from my last commit 2008-06-28 17:15:16 +00:00
Benoit Jacob
844f69e4a9 * update CMakeLists, only build instantiations if TEST_LIB is defined
* allow default Matrix constructor in dynamic size, defaulting to (1,
1), this is convenient in mandelbrot example.
2008-06-27 10:53:30 +00:00
Benoit Jacob
6de4871c8c fix a couple of issues in the new Map.h 2008-06-27 01:42:44 +00:00
Benoit Jacob
e27b2b95cf * rework Map, allow vectorization
* rework PacketMath and DummyPacketMath, make these actual template
specializations instead of just overriding by non-template inline
functions
* introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix
* remove Matrix::map() methods, use Map constructors instead.
2008-06-27 01:22:35 +00:00
Gael Guennebaud
e5d301dc96 various work on the Sparse module:
* added some glue to Eigen/Core (SparseBit, ei_eval, Matrix)
* add two new sparse matrix types:
   HashMatrix: based on std::map (for random writes)
   LinkedVectorMatrix: array of linked vectors
   (for outer coherent writes, e.g. to transpose a matrix)
* add a SparseSetter class to easily set/update any kind of matrices, e.g.:
   { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix);
     for (...) wrapper->coeffRef(rand(),rand()) = rand(); }
* automatic shallow copy for RValue
* and a lot of mess !
plus:
* remove the remaining ArrayBit related stuff
* don't use alloca in product for very large memory allocation
2008-06-26 23:22:26 +00:00
Benoit Jacob
c5bd1703cb change derived classes methods from "private:_method()"
to "public:method()" i.e. reimplementing the generic method()
from MatrixBase.
improves compilation speed by 7%, reduces almost by half the call depth
of trivial functions, making gcc errors and application backtraces
nicer...
2008-06-26 20:08:16 +00:00
Benoit Jacob
25ba9f377c * add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned)
* introduce packet(int), make use of it in linear vectorized paths
  --> completely fixes the slowdown noticed in benchVecAdd.
* generalize coeff(int) to linear-access xprs
* clarify the access flag bits
* rework api dox in Coeffs.h and util/Constants.h
* improve certain expressions's flags, allowing more vectorization
* fix bug in Block: start(int) and end(int) returned dyn*dyn size
* fix bug in Block: just because the Eval type has packet access
  doesn't imply the block xpr should have it too.
2008-06-26 16:06:41 +00:00
Benoit Jacob
5b0da4b778 make use of ei_pmadd in dot-product: will further improve performance
on architectures having a packed-mul-add assembly instruction.
2008-06-24 18:08:35 +00:00
Benoit Jacob
3b94436d2f * vectorize dot product, copying code from sum.
* make the conj functor vectorizable: it is just identity in real case,
  and complex doesn't use the vectorized path anyway.
* fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size)
  should not be vectorizable, since in fixed-size we are assuming
  the size to be a multiple of packet size. (Or would you prefer
  Vector3d to be flagged "packetaccess" even though no packet access
  is possible on vectors of that type?)
* rename:
  isOrtho for vectors ---> isOrthogonal
  isOrtho for matrices ---> isUnitary
* add normalize()
* reimplement normalized with quotient1 functor
2008-06-24 15:13:00 +00:00
Benoit Jacob
c9560df4a0 * add ei_pdiv intrinsic, make quotient functor vectorizable
* add vdw benchmark from Tim's real-world use case
2008-06-23 22:00:18 +00:00
Gael Guennebaud
ac9aa47bbc optimize linear vectorization both in Assign and Sum (optimal amortized perf) 2008-06-23 15:50:28 +00:00
Gael Guennebaud
ea1990ef3d add experimental code for sparse matrix:
- uses the common "Compressed Column Storage" scheme
 - supports every unary and binary operators with xpr template
   assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse
   matrix doesnot make sense)
 - this is the first commit, so of course, there are still several shorcommings !
2008-06-23 13:25:22 +00:00
Benoit Jacob
03d19f3bae quick temporary fix for a perf issue we just identified with
vectorization....
now the sum benchmark runs 3x faster with vectorization than without.
2008-06-23 11:23:05 +00:00