Some naming questions:
- for "extend" we could also think of: "expand", "union", "add"
- same for "clamp": "crop", "intersect"
- same for "contains": "isInside", "intersect"
=> ah "intersect" is conflicting, so that eliminates this one !
* remove the automatic resizing feature of operator =
* add function Matrix::set() to be used when the previous
behavior is wanted
* the default constructor of dynamic-size matrices now
creates a "null" matrix (data=0, rows = cols = 0)
instead of a 1x1 matrix
* fix UnixX typos ;)
as described on the wiki (one map per N column)
Here's some bench results for the 4 currently supported map impl:
std::map => 18.3385 (581 MB)
gnu::hash_map => 6.52574 (555 MB)
google::dense => 2.87982 (315 MB)
google::sparse => 15.7441 (165 MB)
This is the time is second (and memory consumption) to insert/lookup
10 million of coeffs with random coords inside a 10000^2 matrix,
with one map per packet of 64 columns => google::dense really rocks !
Note I use for the key value the index of the column in the packet (between 0 and 63)
times the number of rows and I used the default hash function.... so maybe there is
room for improvement here....
solver from suitesparse (as cholmod). It seems to be even faster
than SuperLU and it was much simpler to interface ! Well,
the factorization is faster, but for the solve part, SuperLU is
quite faster. On the other hand the solve part represents only a
fraction of the whole procedure. Moreover, I bench random matrices
that does not represents real cases, and I'm not sure at all
I use both libraries with their best settings !
* rename Cholesky to LLT
* rename CholeskyWithoutSquareRoot to LDLT
* rename MatrixBase::cholesky() to llt()
* rename MatrixBase::choleskyNoSqrt() to ldlt()
* make {LLT,LDLT}::solve() API consistent with other modules
Note that we are going to keep a source compatibility untill the next beta release.
E.g., the "old" Cholesky* classes, etc are still available for some time.
To be clear, Eigen beta2 should be (hopefully) source compatible with beta1,
and so beta2 will contain all the deprecated API of beta1. Those features marked
as deprecated will be removed in beta3 (or in the final 2.0 if there is no beta 3 !).
Also includes various updated in sparse Cholesky.
* several fixes (transpose, matrix product, etc...)
* Added a basic cholesky factorization
* Added a low level hybrid dense/sparse vector class
to help writing code involving intensive read/write
in a fixed vector. It is currently used to implement
the matrix product itself as well as in the Cholesky
factorization.
However, for matrices larger than 5, it seems there is constantly a quite large error for a very
few coefficients. I don't what's going on, but that's certainely not due to numerical issues only.
(also note that the test with the pseudo eigenvectors fails the same way)
* eigenvectors => pseudoEigenvectors
* added pseudoEigenvalueMatrix
* clear the documentation
* added respective unit test
Still missing: a proper eigenvectors() function.
based on the former.
* opengl_demo: makes IcoSphere better (vertices are instanciated only once) and
removed the generation of a big geometry for the fancy spheres...
* add a WithAlignedOperatorNew class with overloaded operator new
* make Matrix (and Quaternion, Transform, Hyperplane, etc.) use it
if needed such that "*(new Vector4) = xpr" does not failed anymore.
* Please: make sure your classes having fixed size Eigen's vector
or matrice attributes inherit WithAlignedOperatorNew
* add a ei_new_allocator STL memory allocator to use with STL containers.
This allocator really calls operator new on your types (unlike GCC's
new_allocator). Example:
std::vector<Vector4f> data(10);
will segfault if the vectorization is enabled, instead use:
std::vector<Vector4f,ei_new_allocator<Vector4f> > data(10);
NOTE: you only have to worry if you deal with fixed-size matrix types
with "sizeof(matrix_type)%16==0"...
few bits left of the comma and for floating-point types will never return zero.
This replaces the custom functions in test/main.h, so one does not anymore need
to think about that when writing tests.
This allow code factorization and generic template specialization
of functions
* added any_rotation * {Translation,Scaling,Transform} products methods
* rewrite of the actually broken ToRoationMatrix helper class to
a global ei_toRotationMatrix function.
NonAffine, Affine (default), contains NoShear, contains NoScaling
that allows significant speed improvements. If you like it, this concept could be applied to
Transform::extractRotation (or to a more advanced decomposition function) and to Hyperplane::transformed()
and maybe to some other places... e.g., I think a Transform::normalMatrix() function would not harm and
warn user that the transformation of normals is not that trivial (I saw this mistake much too often)
* handling Quaternion, AngleAxis and Rotation2D, 2 options here:
1- make all of them inheriting a common base class Rotation such that we can
have a single version of operator* for all the rotation type (they all get converted to a matrix)
2- write a version for all type (so 3 rotations types * 3 for Transform,Translation and Scaling)
* real documentation
- the coefficients are stored in a single vector
- added transformation methods
- removed Line* typedef since in 2D this is really an hyperplane
and not really a line...
- HyperPlane => Hyperplane
and various cleaning in Altivec code. Altivec vectorization have been re-enabled
in CoreDeclaration
* added copy constructors in non empty functors because I observed weird behavior with
std::complex<>
AngleAxis*Vector products were wrong because they returned the product
_expression_
toRotationMatrix()*other;
and toRotationMatrix() died before that expression would be later
evaluated. Here it would not have been practical to NestByValue as this
is a whole matrix. So, let them simply evaluate and return the result by
value.
The geometry.cpp unit-test only checked for compatibility between
various rotations, it didn't check the correctness of the rotations
themselves. That's why this bug escaped us. So, this commit checks that
the rotations produced by AngleAxis have all the expected properties.
Since the compatibility with the other rotations is already checked,
this should validate them as well.
* added a meta.cpp unit test
* EIGEN_TUNE_FOR_L2_CACHE_SIZE now represents L2 block size in Bytes (whence the ei_meta_sqrt...)
* added a CustomizeEigen.dox page
* added a TOC to QuickStartGuide.dox
* replaced the Flags template parameter of Matrix by StorageOrder
and move it back to the 4th position such that we don't have to
worry about the two Max* template parameters
* extended EIGEN_USING_MATRIX_TYPEDEFS with the ei_* math functions
* bugfix in Dot unroller
* added special random generator for the unit tests and reduced the tolerance threshold by an order of magnitude
this fixes issues with sum.cpp but other tests still failed sometimes, this have to be carefully checked...
asm("...") from the code while fixing MSVC compat (so your changes crossed
one another).
- move the pragma warning to CoreDeclarations, it's the right place to do early
platform checks.
CCMAIL:ps_ml@gmx.de
IoFormat OctaveFmt(4, AlignCols, ", ", ";\n", "", "", "[", "]");
cout << mat.format(OctaveFmt);
The first "4" is the precision.
Documentation missing.
* Some compilation fixes
* add some explanations in the typedefs page
* expand a bit the new QuickStartGuide. Some placeholders (not a pb since
it's not even yet linked to from other pages). The point I want to make is
that it's super important to have fully compilable short programs (even
with compile instructions for the first one) not just small snippets, at
least at the beginning. Let's start with examples of compilable programs.
- the decompostion code has been adfapted from JAMA
- handles non square matrices of size MxN with M>=N
- does not work for complex matrices
- includes a solver where the parts corresponding to zero singular values are set to zero
- 33 new snippets
- unfuck doxygen output in Cwise (issues with function macros)
- more see-also links from outside, making Cwise more discoverable
* rename matrixNorm() to operatorNorm(). There are many matrix norms
(the L2 is another one) but only one is called the operator norm.
Risk of confusion with keyword operator is not too scary after all.
* remove the cast operators in the Geometry module: they are replaced by constructors
and new operator= in Matrix
* extended the operations supported by Rotation2D
* rewrite in solveTriangular:
- merge the Upper and Lower specializations
- big optimization of the path for row-major triangular matrices
I don't see any reason not to allow it, it doesn't add much code, and
it makes porting from eigen1 easier.
*expand tests/basicstuff to first test coefficient access methods
* fix .normalized() so that Random().normalized() works; since the return
type became complicated to write down i just let it return an actual
vector, perhaps not optimal.
* add Sparse/CMakeLists.txt. I suppose that it was intentional that it
didn't have CMakeLists, but in <=2.0 releases I'll just manually remove
Sparse.
- added a MapBase base xpr on top of which Map and the specialization
of Block are implemented
- MapBase forces both aligned loads (and aligned stores, see below) in expressions
such as "x.block(...) += other_expr"
* Significant vectorization improvement:
- added a AlignedBit flag meaning the first coeff/packet is aligned,
this allows to not generate extra code to deal with the first unaligned part
- removed all unaligned stores when no unrolling
- removed unaligned loads in Sum when the input as the DirectAccessBit flag
* Some code simplification in CacheFriendly product
* Some minor documentation improvements
- removes much code
- 2.5x faster (even though LU uses complete pivoting contrary to what
inverse used to do!)
- there _were_ numeric stability problems with the partial pivoting
approach of inverse(), with 200x200 matrices inversion failed almost
half of the times (overflow). Now these problems are solved thanks to
complete pivoting.
* remove some useless stuff in LU
*in test/CMakeLists : modify EI_ADD_TEST so that 2nd argument is
additional compiler flags. used to add -O2 to test_product_large so it
doesn't take forever.
not allow to easily get the rank), fix a bug (which could have been
triggered by matrices having coefficients of very different
magnitudes).
Part: add an assert to prevent hard to find bugs
Swap: update comments
Note: in fact, inverse() always uses partial pivoting because the algo
currently used doesn't make sense with complete pivoting. No num
stability issue so far even with size 200x200. If there is any problem
we can of course reimplement inverse on top of LU.
pivoting for better numerical stability. For now the only application is
determinant.
* New determinant unit-test.
* Disable most of Swap.h for now as it makes LU fail (mysterious).
Anyway Swap needs a big overhaul as proposed on IRC.
* Remnants of old class Inverse removed.
* Some warnings fixed.
* faster matrix-matrix and matrix-vector products (especially for not aligned cases)
* faster tridiagonalization (make it using our matrix-vector impl.)
Others:
* fix Flags of Map
* split the test_product to two smaller ones
- added explicit enum to int conversion where needed
- if a function is not defined as declared and the return type is "tricky"
then the type must be typedefined somewhere. A "tricky return type" can be:
* a template class with a default parameter which depends on another template parameter
* a nested template class, or type of a nested template class
=> up to 6 times faster !
* Added DirectAccessBit to Part
* Added an exemple of a cwise operator
* Renamed perpendicular() => someOrthogonal() (geometry module)
* Fix a weired bug in ei_constant_functor: the default copy constructor did not copy
the imaginary part when the single member of the class is a complex...
- conflicts with operator * overloads
- discard the use of ei_pdiv for interger
(g++ handles operators on __m128* types, this is why it worked)
- weird behavior of icc in fixed size Block() constructor complaining
the initializer of m_blockRows and m_blockCols were missing while
we are in fixed size (maybe this hide deeper problem since this is a
recent one, but icc gives only little feedback)
Renamed "MatrixBase::extract() const" to "MatrixBase::part() const"
* Renamed static functions identity, zero, ones, random with an upper case
first letter: Identity, Zero, Ones and Random.
Removed EulerAngles, addes typdefs for Quaternion and AngleAxis,
and added automatic conversions from Quaternion/AngleAxis to Matrix3 such that:
Matrix3f m = AngleAxisf(0.2,Vector3f::UnitX) * AngleAxisf(0.2,Vector3f::UnitY);
just works.
* Improve the efficiency of matrix*vector in unaligned cases
* Trivial fixes in the destructors of MatrixStorage
* Removed the matrixNorm in test/product.cpp (twice faster and
that assumed the matrix product was ok while checking that !!)
- remove all invertibility checking, will be redundant with LU
- general case: adapt to matrix storage order for better perf
- size 4 case: handle corner cases without falling back to gen case.
- rationalize with selectors instead of compile time if
- add C-style computeInverse()
* update inverse test.
* in snippets, default cout precision to 3 decimal places
* add some cmake module from kdelibs to support btl with cmake 2.4
It basically performs 4 dot products at once reducing loads of the vector and improving
instructions scheduling. With 3 cache friendly algorithms, we now handle all product
configurations with outstanding perf for large matrices.
and vector * row-major products. Currently, it is enabled only is the matrix
has DirectAccessBit flag and the product is "large enough".
Added the respective unit tests in test/product/cpp.
might be twice faster fot small fixed size matrix
* added a sparse triangular solver (sparse version
of inverseProduct)
* various other improvements in the Sparse module
* added complete implementation of sparse matrix product
(with a little glue in Eigen/Core)
* added an exhaustive bench of sparse products including GMM++ and MTL4
=> Eigen outperforms in all transposed/density configurations !
* rework PacketMath and DummyPacketMath, make these actual template
specializations instead of just overriding by non-template inline
functions
* introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix
* remove Matrix::map() methods, use Map constructors instead.
* added some glue to Eigen/Core (SparseBit, ei_eval, Matrix)
* add two new sparse matrix types:
HashMatrix: based on std::map (for random writes)
LinkedVectorMatrix: array of linked vectors
(for outer coherent writes, e.g. to transpose a matrix)
* add a SparseSetter class to easily set/update any kind of matrices, e.g.:
{ SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix);
for (...) wrapper->coeffRef(rand(),rand()) = rand(); }
* automatic shallow copy for RValue
* and a lot of mess !
plus:
* remove the remaining ArrayBit related stuff
* don't use alloca in product for very large memory allocation
to "public:method()" i.e. reimplementing the generic method()
from MatrixBase.
improves compilation speed by 7%, reduces almost by half the call depth
of trivial functions, making gcc errors and application backtraces
nicer...
* introduce packet(int), make use of it in linear vectorized paths
--> completely fixes the slowdown noticed in benchVecAdd.
* generalize coeff(int) to linear-access xprs
* clarify the access flag bits
* rework api dox in Coeffs.h and util/Constants.h
* improve certain expressions's flags, allowing more vectorization
* fix bug in Block: start(int) and end(int) returned dyn*dyn size
* fix bug in Block: just because the Eval type has packet access
doesn't imply the block xpr should have it too.
* make the conj functor vectorizable: it is just identity in real case,
and complex doesn't use the vectorized path anyway.
* fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size)
should not be vectorizable, since in fixed-size we are assuming
the size to be a multiple of packet size. (Or would you prefer
Vector3d to be flagged "packetaccess" even though no packet access
is possible on vectors of that type?)
* rename:
isOrtho for vectors ---> isOrthogonal
isOrtho for matrices ---> isUnitary
* add normalize()
* reimplement normalized with quotient1 functor
- uses the common "Compressed Column Storage" scheme
- supports every unary and binary operators with xpr template
assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse
matrix doesnot make sense)
- this is the first commit, so of course, there are still several shorcommings !
(could come back to redux after it has been vectorized,
and could serve as a starting point for that)
also make the abs2 functor vectorizable (for real types).
packet access, it is not certain that it will bring a performance
improvement: benchmarking needed.
* improve logic choosing slice vectorization.
* fix typo in SSE packet math, causing crash in unaligned case.
* fix bug in Product, causing crash in unaligned case.
* add TEST_SSE3 CMake option.
* make Matrix2f (and similar) vectorized using linear path
* fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4
(cannot get Transform compiles with gcc 3.3/3.4, see the FIXME)
* use ProductReturnType<>::Type to get the correct Product xpr type
* Product is no longer instanciated for xpr types which are evaluated
* vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix)
* some cleanning
* removed ArrayBase
** Much better organization
** Fix a few bugs
** Add the ability to unroll only the inner loop
** Add an unrolled path to the Like1D vectorization. Not well tested.
** Add placeholder for sliced vectorization. Unimplemented.
* Rework of corrected_flags:
** improve rules determining vectorizability
** for vectors, the storage-order is indifferent, so we tweak it
to allow vectorization of row-vectors.
* fix compilation in benchmark, and a warning in Transpose.
to optimize matrix-diag and diag-matrix products without
making Product over complicated.
* compilation fixes in Tridiagonalization and HessenbergDecomposition
in the case of 2x2 matrices.
* added an Orientation2D small class with similar interface than Quaternion
(used by Transform to handle 2D and 3D orientations seamlessly)
* added a couple of features in Transform.
them in the ei_traits, so that they're guaranteed even if the user
specified his own non-default flags (like before).
Measured to not make compilation any slower.
flags. This ensures that unless explicitly messed up otherwise,
a Matrix type is equal to its own Eval type. This seriously reduces
the number of types instantiated. Measured +13% compile speed, -7%
binary size.
* Improve doc of Matrix template parameters.
This is the first step towards a non-selfadjoint eigen solver.
Notes:
- We might consider merging Tridiagonalization and Hessenberg toghether ?
- Or we could factorize some code into a Householder class (could also be shared with QR)
which is even better optimized by the compiler.
* Quaternion no longer inherits MatrixBase. Instead it stores the coefficients
using a Matrix<> and provides only relevant methods.
tends to show L2 norm works very well here.
(the legacy implementation is still available via a preprocessor token
to allow further experiments if needed...)
as an argument of a function. Other possibilities for the name could be "end" or "matrix" ??
* various update in Quaternion, in particular I added a lot of FIXME about the API options,
these have to be discussed and fixed.
- get the doc of the flags in Constants right
- finally give up with SEPARATE_MEMBER_PAGES: it triggers too big
Doxygen bugs, and produces too many small pages. So we have one
huge page for MatrixBase at currently 300kb and going up, so the
solution especially for users with low bandwidth will be to provide
an archive of the html documentation.