eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2025-01-12 14:25:16 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	99a625243f	Optimization: added super efficient rowmajor * vector product (and vector * colmajor). It basically performs 4 dot products at once, reducing loads of the vector and improving instruction scheduling. With 3 cache friendly algorithms, we now handle all product configurations with outstanding perf for large matrices.	2008-07-13 01:22:54 +00:00
Gael Guennebaud	861d18d553	* Optimization: added a specialization of Block for xpr with DirectAccessBit * some simplifications and fixes in cache friendly products	2008-07-12 22:59:34 +00:00
Gael Guennebaud	b7bd1b3446	Add a very efficient evaluation path for both col-major matrix * vector and vector * row-major products. Currently, it is enabled only if the matrix has the DirectAccessBit flag and the product is "large enough". Added the respective unit tests in test/product.cpp.	2008-07-12 12:12:02 +00:00
Benoit Jacob	2b53fd4d53	some performance fixes in Assign.h reported by Gael. Some doc update in Cwise.	2008-07-10 16:15:55 +00:00
Gael Guennebaud	c9b046d5d5	* added optimized paths for matrix-vector and vector-matrix products (using either a cache friendly strategy or re-using dot-product vectorized implementation) * add LinearAccessBit to Transpose	2008-07-09 22:30:18 +00:00
Gael Guennebaud	5f55ab524c	* added a lazyAssign overload skipping .lazy(), so that for c = (<xpr>).lazy() the lazyAssign overloads of <xpr> are automatically called (this also reduces assign instantiations)	2008-07-09 13:54:21 +00:00
Gael Guennebaud	783eb6da9b	I forgot that the previous commit needed minor changes outside the bench folder	2008-07-08 17:25:58 +00:00
Benoit Jacob	6f09d3a67d	- many updates after Cwise change - fix compilation in product.cpp with std::complex - fix bug in MatrixBase::operator!=	2008-07-08 07:56:01 +00:00
Benoit Jacob	f5791eeb70	the big Array/Cwise rework as discussed on the mailing list. The new API can be seen in Eigen/src/Core/Cwise.h.	2008-07-08 00:49:10 +00:00
Benoit Jacob	a9d319d44f	* do the ActualPacketAccessBit change as discussed on list * add comment in Product.h about CanVectorizeInner * fix typo in test/product.cpp	2008-07-04 12:43:55 +00:00
Gael Guennebaud	8463b7d3f4	* fix compilation issue in Product * added some tests for product and swap * overload .swap() for dynamic-sized matrix of same size	2008-07-02 16:05:33 +00:00
Gael Guennebaud	9433df83a7	* resurrected Flagged::_expression used to optimize m+=(ab).lazy() (equivalent to the GEMM BLAS routine) * added a GEMM benchmark	2008-07-01 16:20:06 +00:00
Gael Guennebaud	37a50fa526	* added an in-place version of inverseProduct which might be twice as fast for small fixed-size matrices * added a sparse triangular solver (sparse version of inverseProduct) * various other improvements in the Sparse module	2008-06-29 21:29:12 +00:00
Gael Guennebaud	027818d739	* added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations !	2008-06-28 23:07:14 +00:00
Benoit Jacob	55e08f7102	fix breakage from my last commit	2008-06-28 17:15:16 +00:00
Benoit Jacob	844f69e4a9	* update CMakeLists, only build instantiations if TEST_LIB is defined * allow default Matrix constructor in dynamic size, defaulting to (1, 1), this is convenient in mandelbrot example.	2008-06-27 10:53:30 +00:00
Benoit Jacob	6de4871c8c	fix a couple of issues in the new Map.h	2008-06-27 01:42:44 +00:00
Benoit Jacob	e27b2b95cf	* rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead.	2008-06-27 01:22:35 +00:00
Gael Guennebaud	e5d301dc96	various work on the Sparse module: * added some glue to Eigen/Core (SparseBit, ei_eval, Matrix) * add two new sparse matrix types: HashMatrix: based on std::map (for random writes) LinkedVectorMatrix: array of linked vectors (for outer coherent writes, e.g. to transpose a matrix) * add a SparseSetter class to easily set/update any kind of matrices, e.g.: { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix); for (...) wrapper->coeffRef(rand(),rand()) = rand(); } * automatic shallow copy for RValue * and a lot of mess ! plus: * remove the remaining ArrayBit related stuff * don't use alloca in product for very large memory allocation	2008-06-26 23:22:26 +00:00
Benoit Jacob	c5bd1703cb	change derived classes methods from "private:_method()" to "public:method()" i.e. reimplementing the generic method() from MatrixBase. improves compilation speed by 7%, reduces almost by half the call depth of trivial functions, making gcc errors and application backtraces nicer...	2008-06-26 20:08:16 +00:00
Benoit Jacob	25ba9f377c	* add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions's flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyndyn size fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too.	2008-06-26 16:06:41 +00:00
Benoit Jacob	5b0da4b778	make use of ei_pmadd in dot-product: will further improve performance on architectures having a packed-mul-add assembly instruction.	2008-06-24 18:08:35 +00:00
Benoit Jacob	3b94436d2f	* vectorize dot product, copying code from sum. * make the conj functor vectorizable: it is just identity in real case, and complex doesn't use the vectorized path anyway. * fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size) should not be vectorizable, since in fixed-size we are assuming the size to be a multiple of packet size. (Or would you prefer Vector3d to be flagged "packetaccess" even though no packet access is possible on vectors of that type?) * rename: isOrtho for vectors ---> isOrthogonal isOrtho for matrices ---> isUnitary * add normalize() * reimplement normalized with quotient1 functor	2008-06-24 15:13:00 +00:00
Benoit Jacob	c9560df4a0	* add ei_pdiv intrinsic, make quotient functor vectorizable * add vdw benchmark from Tim's real-world use case	2008-06-23 22:00:18 +00:00
Gael Guennebaud	ac9aa47bbc	optimize linear vectorization both in Assign and Sum (optimal amortized perf)	2008-06-23 15:50:28 +00:00
Gael Guennebaud	ea1990ef3d	add experimental code for sparse matrix: - uses the common "Compressed Column Storage" scheme - supports every unary and binary operators with xpr template assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse matrix does not make sense) - this is the first commit, so of course, there are still several shortcomings !	2008-06-23 13:25:22 +00:00
Benoit Jacob	03d19f3bae	quick temporary fix for a perf issue we just identified with vectorization.... now the sum benchmark runs 3x faster with vectorization than without.	2008-06-23 11:23:05 +00:00
Benoit Jacob	dc9206cec5	split sum away from redux and vectorize it. (could come back to redux after it has been vectorized, and could serve as a starting point for that) also make the abs2 functor vectorizable (for real types).	2008-06-23 10:32:48 +00:00
Benoit Jacob	8a967fb17c	* implement slice vectorization. Because it uses unaligned packet access, it is not certain that it will bring a performance improvement: benchmarking needed. * improve logic choosing slice vectorization. * fix typo in SSE packet math, causing crash in unaligned case. * fix bug in Product, causing crash in unaligned case. * add TEST_SSE3 CMake option.	2008-06-22 15:02:05 +00:00
Gael Guennebaud	32c5ea388e	work on rotations in the Geometry module: - conversions are done through constructors and operator= - added an EulerAngles class	2008-06-21 15:01:49 +00:00
Benoit Jacob	574416b842	Override MatrixBase::eval() since matrices don't need to be evaluated, it is enough to just read them.	2008-06-20 15:26:39 +00:00
Gael Guennebaud	54238961d6	* added a pseudo expression Array giving access to: - matrix-scalar addition/subtraction operators, e.g.: m.array() += 0.5; - matrix/matrix comparison operators, e.g.: if (m1.array() < m2.array()) {} * fix compilation issues with Transform and gcc < 4.1	2008-06-20 12:38:03 +00:00
Gael Guennebaud	e735692e37	move "enum" back to "const int" in ei_assign_impl: in fact, casting enums to int is enough to get compile time constants with ICC.	2008-06-20 07:10:50 +00:00
Gael Guennebaud	fb4a151982	* more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform to compile with gcc 3.3/3.4, see the FIXME)	2008-06-19 23:00:51 +00:00
Gael Guennebaud	82c3cea1d5	* refactoring of Product: * use ProductReturnType<>::Type to get the correct Product xpr type * Product is no longer instantiated for xpr types which are evaluated * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix) * some cleaning * removed ArrayBase	2008-06-19 17:33:57 +00:00
Gael Guennebaud	5dbfed1902	fix two bugs discovered by the previous commit.	2008-06-16 16:39:58 +00:00
Benoit Jacob	bb1f4e44f1	* Block: row and column expressions in the inner direction now have the Like1D flag. * Big renaming: packetCoeff ---> packet VectorizableBit ---> PacketAccessBit Like1DArrayBit ---> LinearAccessBit	2008-06-16 14:54:31 +00:00
Benoit Jacob	9857764ae7	aaargh.	2008-06-16 11:20:29 +00:00
Benoit Jacob	478bfaf228	fix bug in computation of unrolling limit: div instead of mul	2008-06-16 11:18:59 +00:00
Benoit Jacob	c905b31b42	* Big rework of Assign.h: Much better organization Fix a few bugs Add the ability to unroll only the inner loop Add an unrolled path to the Like1D vectorization. Not well tested. ** Add placeholder for sliced vectorization. Unimplemented. * Rework of corrected_flags: improve rules determining vectorizability for vectors, the storage-order is indifferent, so we tweak it to allow vectorization of row-vectors. * fix compilation in benchmark, and a warning in Transpose.	2008-06-16 10:49:44 +00:00
Gael Guennebaud	bc0c7c57ed	Added an extensible mechanism to support any kind of rotation representation in Transform via the template static class ToRotationMatrix. Added a lightweight AngleAxis class (similar to Rotation2D).	2008-06-15 17:22:41 +00:00
Gael Guennebaud	0ee6b08128	* split Product to a DiagonalProduct template specialization to optimize matrix-diag and diag-matrix products without making Product overcomplicated. * compilation fixes in Tridiagonalization and HessenbergDecomposition in the case of 2x2 matrices. * added a small Orientation2D class with an interface similar to Quaternion's (used by Transform to handle 2D and 3D orientations seamlessly) * added a couple of features in Transform.	2008-06-15 11:54:18 +00:00
Gael Guennebaud	fbbd8afe30	Started a Transform class in the Geometry module to represent homography. Fix indentation in Quaternion.h	2008-06-15 08:33:44 +00:00
Gael Guennebaud	4af7089ab8	* Added a generalized eigen solver for the selfadjoint case. (as new members to SelfAdjointEigenSolver) The QR module now depends on Cholesky. * Fix Transpose to correctly preserve the *TriangularBit.	2008-06-14 19:42:12 +00:00
Gael Guennebaud	f07f907810	Add QR and Cholesky module instantiations in the lib. To try it with the unit tests set the cmake variable TEST_LIB to ON.	2008-06-14 13:02:41 +00:00
Benoit Jacob	53289a8b64	* even though the _Flags default to the corrected value, still correct them in the ei_traits, so that they're guaranteed even if the user specified his own non-default flags (like before). Measured to not make compilation any slower.	2008-06-13 08:09:48 +00:00
Benoit Jacob	c90c77051f	* make the _Flags template parameter of Matrix default to the corrected flags. This ensures that unless explicitly messed up otherwise, a Matrix type is equal to its own Eval type. This seriously reduces the number of types instantiated. Measured +13% compile speed, -7% binary size. * Improve doc of Matrix template parameters.	2008-06-13 07:53:45 +00:00
Gael Guennebaud	e3fac69f19	Added a Hessenberg decomposition class for both real and complex matrices. This is the first step towards a non-selfadjoint eigen solver. Notes: - We might consider merging Tridiagonalization and Hessenberg together? - Or we could factorize some code into a Householder class (could also be shared with QR)	2008-06-08 15:03:23 +00:00
Gael Guennebaud	4dd57b585d	* rewrite of the QR decomposition: - works for complex - allows direct access to the matrix R * removed the scale by the matrix dimensions in MatrixBase::isMuchSmallerThan(scalar)	2008-06-07 22:47:11 +00:00
Gael Guennebaud	eb7b7b2cfc	* remove Cross product expression: MatrixBase::cross() now returns a temporary which is even better optimized by the compiler. * Quaternion no longer inherits MatrixBase. Instead it stores the coefficients using a Matrix<> and provides only relevant methods.	2008-06-07 13:18:29 +00:00
Gael Guennebaud	6998037930	* move some compile time "if" to their respective unroller (assign and dot) * fix a couple of compilation issues when unrolling is disabled * reduce default unrolling limit to a more reasonable value	2008-06-07 01:07:48 +00:00
Gael Guennebaud	a172385720	Updated fuzzy comparisons to use L2 norm as all my experiments tend to show L2 norm works very well here. (the legacy implementation is still available via a preprocessor token to allow further experiments if needed...)	2008-06-06 18:37:53 +00:00
Gael Guennebaud	8769bfd9aa	fix a compilation issue in non debug mode	2008-06-06 14:11:26 +00:00
Benoit Jacob	869394ee8b	fix some compile errors with gcc 4.3, some warnings, some documentation	2008-06-06 13:10:00 +00:00
Gael Guennebaud	2126baf9dc	add an optimized path for the tridiagonalization of a 3x3 matrix. (useful for plane fitting, and covariance analysis of 3D data)	2008-06-04 13:41:32 +00:00
Gael Guennebaud	48262b9734	added a static assertion mechanism (see notes in Core/util/StaticAssert.h for details)	2008-06-04 11:16:11 +00:00
Gael Guennebaud	a0cff1a295	fix eigenvectors computations :)	2008-06-03 18:03:55 +00:00
Gael Guennebaud	915587d03d	* add CommaInitializer::finished to allow the use of (Matrix3() << v0, v1, v2).finished() as an argument of a function. Other possibilities for the name could be "end" or "matrix"? * various updates in Quaternion, in particular I added a lot of FIXME about the API options, these have to be discussed and fixed.	2008-06-03 15:50:09 +00:00
Gael Guennebaud	196f38f5db	improved Quaternion class: - Euler angles and angle axis conversions, - stable spherical interpolation - documentation - update the respective unit test	2008-06-03 13:43:29 +00:00
Gael Guennebaud	a9cf229e15	add a geometry unit test and fix a couple of typos in Quaternion.h	2008-06-03 07:32:12 +00:00
Benoit Jacob	8de4d92b70	- get the doc of the enums in MatrixBase right - get the doc of the flags in Constants right - finally give up with SEPARATE_MEMBER_PAGES: it triggers too big Doxygen bugs, and produces too many small pages. So we have one huge page for MatrixBase at currently 300kb and going up, so the solution especially for users with low bandwidth will be to provide an archive of the html documentation.	2008-06-03 02:06:18 +00:00
Gael Guennebaud	366971bea4	* start of the Geometry module with a cross product and quaternion expressions (haven't tried them yet) * applied the meta selector rule to MatrixBase::swap()	2008-06-02 22:58:36 +00:00
Benoit Jacob	75de41a00b	big changes in Doxygen configuration; work around bug with doxygen parsing of initialized enum values showing the last word of the initializer instead of the actual enum value's name; add some more docs.	2008-06-02 20:08:37 +00:00
Benoit Jacob	ac88feebb7	work around Doxygen bug triggered by r814874, which caused many classes to disappear from the docs.	2008-06-02 19:29:23 +00:00
Gael Guennebaud	54ae2ac7e8	Added the computation of eigen vectors in the symmetric eigen solver. However the eigen vectors are not correct yet, but I really cannot find the problem.	2008-06-02 12:52:08 +00:00
Benoit Jacob	3b0523041a	since m*m.adjoint() is positive, so are its eigenvalues, so no need for cwiseAbs()	2008-06-02 04:45:02 +00:00
Benoit Jacob	0444e3601a	- add MatrixBase::eigenvalues() convenience method - add MatrixBase::matrixNorm(); in the non-selfadjoint case, we reduce to the selfadjoint case by using the "C-identity" a.k.a. norm of x = sqrt(norm of x x.adjoint())	2008-06-02 04:42:45 +00:00
Benoit Jacob	92b7e2d6a1	fix a couple of issues making the eigensolver test compile and run without aborting on an assert. Had to fix a stupid bug in Block -- very strange we hadn't hit it before. However the test still fails.	2008-06-02 02:06:33 +00:00
Gael Guennebaud	001b01a290	Rewrite from scratch of the eigen solver for symmetric matrices which now supports selfadjoint matrix. The implementation follows Golub's famous book.	2008-06-02 00:30:26 +00:00
Gael Guennebaud	06752b2b77	* added a Tridiagonalization class for selfadjoint matrices * added MatrixBase::real() * added the ability to extract a selfadjoint matrix from the lower or upper part of a matrix, e.g.: m.extract<Upper|SelfAdjoint>() will ignore the strict lower part and return a selfadjoint. This is compatible with ZeroDiag and UnitDiag.	2008-06-01 17:20:18 +00:00
Benoit Jacob	dc5fd8dfff	meagre outcome for so much time spent! * fix inverse() bug discovered by Gael's test * fix warnings introduced by the new Diagonal stuff * update Doxyfile to v1.5.6	2008-06-01 03:36:49 +00:00
Gael Guennebaud	64169389ed	added an optional Eigen2 dynamic library. it allows the possibility to save some compilation time by linking to it and defining the token EIGEN_EXTERN_INSTANCIATIONS	2008-05-31 23:21:49 +00:00
Gael Guennebaud	fcf4457b78	added optimized matrix times diagonal matrix product via Diagonal flag shortcut.	2008-05-31 21:35:11 +00:00
Gael Guennebaud	310f7aa096	moved purely "array" related stuff to a new module Array. This includes: - cwise Pow,Sin,Cos,Exp... - cwise Greater and other comparison operators - .any(), .all() and partial reduction - random	2008-05-31 18:11:48 +00:00
Gael Guennebaud	a2f71f9d7e	updated EigenSolver to use .coeff / .coeffRef	2008-05-31 16:31:10 +00:00
Gael Guennebaud	c9fb248c36	simplify the basic product a bit by moving dynamic loops to the corresponding special case of the unrollers. the latter ones are therefore re-named *product_impl.	2008-05-31 15:06:26 +00:00
Gael Guennebaud	f5e599e489	* replace compile-time-if by meta-selector in Assign.h as it speeds up compilation. * fix minor typo introduced in the previous commit	2008-05-31 14:42:07 +00:00
Gael Guennebaud	e2ac5d244e	Added ArrayBit to get the ability to manipulate a Matrix like a simple scalar. In particular this flag changes the behavior of operator* to a coeff wise product.	2008-05-29 22:33:07 +00:00
Benoit Jacob	b501e08d81	now the unit-tests (hence all of Eigen) don't depend on Qt at all anymore.	2008-05-29 03:37:16 +00:00
Benoit Jacob	486fdb26a1	many small fixes and documentation improvements, this should be alpha5.	2008-05-29 03:12:30 +00:00
Gael Guennebaud	c1559d3079	* updated the assignment operator macro so that overloads in MatrixBase work * removed product_selector and cleaned Product.h a bit * cleaned Assign.h a bit	2008-05-28 22:56:19 +00:00
Gael Guennebaud	8711e26c8a	* change Flagged to take into account NestByValue only * bugfix in Assign and cache friendly product (weird that worked before) * improved argument evaluation in Product	2008-05-28 22:11:47 +00:00
Gael Guennebaud	73084dc754	* added _coeffRef members in NestedByValue added ConjugateReturnType and AdjointReturnType that are type-defined to Derived& and Transpose<Derived> if the scalar type is not complex: this avoids abusive copies in the cache friendly Product	2008-05-28 09:09:18 +00:00
Benoit Jacob	f54760c889	hehe, the complicated nesting scheme in Flagged in the previous commit was a sign that we were doing something wrong. In fact, having NestByValue as a special case of Flagged was wrong, and the previous commit, while not buggy, was inefficient because then when the resulting NestByValue xpr was nested -- hence copied -- the original xpr which was already nested by value was copied again; hence instead of 1 copy we got 3 copies. The solution was to resurrect the old Temporary.h (renamed NestByValue.h) as it was the right approach.	2008-05-28 05:14:16 +00:00
Benoit Jacob	aebecae510	* find the proper way of nesting the expression in Flagged: finally that's more subtle than just using ei_nested, because when flagging with NestByValueBit we want to store the expression by value already, regardless of whether it already had the NestByValueBit set. * rename temporary() ----> nestByValue() * move the old Product.h to disabled/, replace by what was ProductWIP.h * tweak -O and -g flags for tests and examples * reorder the tests -- basic things go first * simplifications, e.g. in many methods return derived() and count on implicit casting to the actual return type. * strip some not-really-useful stuff from the heaviest tests	2008-05-28 04:38:16 +00:00
Gael Guennebaud	559233c73e	* fix the QR module to use extract/part instead of the previous triangular stuff * added qr and eigensolver tests * fix a compilation warning in Matrix copy constructor	2008-05-27 09:16:27 +00:00
Benoit Jacob	5aa00f6870	part 2 of big change: rename Triangular.h -> Extract.h (svn required to commit that separately)	2008-05-27 05:50:36 +00:00
Benoit Jacob	953efdbfe7	- introduce Part and Extract classes, splitting and extending the former Triangular class - full meta-unrolling in Part - move inverseProduct() to MatrixBase - compilation fix in ProductWIP: introduce a meta-selector to only do direct access on types that support it. - phase out the old Product, remove the WIP_DIRTY stuff. - misc renaming and fixes	2008-05-27 05:47:30 +00:00
Gael Guennebaud	8f1fc80a77	some documentation fixes (Cwise* and Cholesky)	2008-05-22 16:31:00 +00:00
Gael Guennebaud	94e1629a1b	* improved product performance: - fallback to normal product for small dynamic matrices - overloaded "c += (a * b).lazy()" to avoid the expensive and useless temporary and setZero() in such very common cases. * fix a couple of issues with the flags	2008-05-22 14:51:25 +00:00
Gael Guennebaud	9ab6e186eb	remove Like1DArrayBit in Transpose	2008-05-22 12:25:11 +00:00
Gael Guennebaud	c6789a279c	Fix compilation issues with MSVC and NVCC. Added a few typedef of complex return types in MatrixBase (Needed by MSVC)	2008-05-15 09:40:11 +00:00
Benoit Jacob	5da60897ab	Introduce generic Flagged xpr, already remove Lazy.h and Temporary.h. Rename DefaultLostFlagMask --> HereditaryBits	2008-05-14 08:20:15 +00:00
Gael Guennebaud	fd2e9e5c3c	* Clean a bit the eigenvalue solver: if the matrix is known to be selfadjoint at compile time, then it returns real eigenvalues. * Fix a couple of bugs with the new product.	2008-05-13 07:40:25 +00:00
Benoit Jacob	3eccfd1a78	-fix certain #includes -fix CMakeLists, public headers weren't getting installed	2008-05-12 21:15:17 +00:00
Gael Guennebaud	4317fad869	* Added several casts to int of the enums (needed for some compilers) * Fix a mistake in CwiseNullary. * Added a CoreDeclarations header that contains only the forward declarations and related basic stuff.	2008-05-12 18:09:30 +00:00
Benoit Jacob	678f18fce4	put inline keywords everywhere appropriate. So we don't need anymore to pass -finline-limit=1000 to gcc to get good performance. By the way some cleanup.	2008-05-12 17:34:46 +00:00
Gael Guennebaud	45cda6704a	* Draft of an eigenvalue solver (does not support complex and does not re-use the QR decomposition) * Rewrite the cache friendly product to have only one instance per scalar type ! This significantly speeds up compilation time and reduces executable size. The current drawback is that some trivial expressions might be evaluated like conjugate or negate. * Renamed "cache optimal" to "cache friendly" * Added the ability to directly access matrix data of some expressions via: - the stride()/_stride() methods - DirectAccessBit flag (replace ReferencableBit)	2008-05-12 10:23:09 +00:00
Benoit Jacob	dca416cace	move arch-specific code to arch/SSE and arch/AltiVec subdirs. rename the noarch PacketMath.h to DummyPacketMath.h	2008-05-12 08:30:42 +00:00
Benoit Jacob	3562b01105	* Give Konstantinos a copyright line * Fix compilation of Inverse.h with vectorisation * Introduce EIGEN_GNUC_AT_LEAST(x,y) macro doing future-proof (e.g. gcc v5.0) check * Only use ProductWIP if vectorisation is enabled * rename EIGEN_ALWAYS_INLINE -> EIGEN_INLINE with fall-back to inline keyword * some cleanup/indentation	2008-05-12 08:12:40 +00:00
Benoit Jacob	4f6d7abc87	only include SSE3 headers if compiling with SSE3 support	2008-05-08 09:15:16 +00:00
Gael Guennebaud	bf5326c3ca	* Added ReferencableBit flag to know whether coeffRef is available. (needed by the new product implementation) * Make the packet* members template to support aligned and unaligned access. This makes Block vectorizable. Combined with ReferencableBit, we should be able to determine at runtime (in some specific cases) if an aligned vectorization is possible or not. * Improved the new product implementation to robustly handle all cases, it now passes all the tests. * Renamed the packet version ei_predux to ei_preduxp to avoid name collision.	2008-05-08 08:12:52 +00:00
Gael Guennebaud	64c49de7ba	* split PacketMath.h to SSE and Altivec specific files * improved the flexibility of the new product implementation, now all sizes seems to be properly handled.	2008-05-05 17:19:47 +00:00
Gael Guennebaud	46fa4c713f	* Started support for unaligned vectorization. * Introduce a new highly optimized matrix-matrix product for large matrices. The code is still highly experimental and it is activated only if you define EIGEN_WIP_PRODUCT at compile time. Currently the third dimension of the product must be a multiple of the packet size (x4 for floats) and the right-hand side matrix must be column major. Moreover, currently c = ab; actually computes c += ab !! Therefore, the code is provided for experimentation purposes only ! These limitations will be fixed sooner or later so that it can become the default product implementation.	2008-05-05 10:23:29 +00:00
Benoit Jacob	8c6007f80e	* Patch by Konstantinos Margaritis: AltiVec vectorization. * Fix several warnings, temporarily disable determinant test.	2008-05-03 12:21:23 +00:00
Gael Guennebaud	0545df2149	slightly improved the cache friendly product to use mul-add only	2008-05-03 10:01:30 +00:00
Gael Guennebaud	a6655dd91a	added packet mul-add function (ei_pmadd) and updated Product to use it. this changes nothing for current SSE architecture but might be helpful for altivec/cell and upcoming AMD processors.	2008-05-03 00:45:08 +00:00
Gael Guennebaud	102e029dad	Removed ei_pload1, use posix_memalign to allocate aligned memory, and make Product ok when only one side is vectorizable (and the product is still vectorized)	2008-05-02 13:30:12 +00:00
Benoit Jacob	890a8de962	Make products always eval into expressions. Improves performance in benchmark. Still not as fast as explicit eval(), strangely.	2008-05-02 08:53:23 +00:00
Gael Guennebaud	ef5b20bc50	fix flag and cost computations for nested expressions	2008-05-01 18:58:30 +00:00
Gael Guennebaud	5588def0cf	nullary xpr are now vectorized	2008-05-01 14:28:53 +00:00
Gael Guennebaud	02f1615d2a	Enable vectorization of product with dynamic matrices, extended cache optimal product to work in any row/column major situations, and a few bugfixes (forgot to add the Cholesky header, vectorization of CwiseBinary)	2008-05-01 13:53:05 +00:00
Gael Guennebaud	6486991ac3	some cleaning in Cholesky and removed evil ei_sqrt of complex	2008-04-27 18:57:28 +00:00
Gael Guennebaud	64bacf1c3f	* added ei_sqrt for complex * updated Cholesky to support complex * correct result_type for abs and abs2 functors	2008-04-27 14:05:40 +00:00
Gael Guennebaud	4ffffa670e	added Cholesky module	2008-04-27 10:57:32 +00:00
Gael Guennebaud	1ec2d21ca5	Fixed a couple of issues introduced in previous commits. Added a test for Triangular.	2008-04-26 20:28:27 +00:00
Gael Guennebaud	b4c974d059	Added triangular assignment, e.g.: m.upper() = a+b; only updates the upper triangular part of m. Note that: m = (a+b).upper(); updates all coefficients of m (but half of the additions will be skipped) Updated back/forward substitution to better use Eigen's capability.	2008-04-26 19:20:26 +00:00
Gael Guennebaud	4c92150676	Added Triangular expression to extract upper or lower (strictly or not) part of a matrix. Triangular also provides an optimised method for forward and backward substitution. Further optimizations regarding assignments and products might come later. Updated determinant() to take into account triangular matrices. Started the QR module with a QR decomposition algorithm. Help needed to build a QR algorithm (eigen solver) based on it.	2008-04-26 18:26:05 +00:00
Gael Guennebaud	62bf0bbd59	fix a bug in determinant of 4x4 matrices and a small type issue in Inverse	2008-04-26 08:56:52 +00:00
Gael Guennebaud	6f2c72fb53	Various fixes in: - vector to vector assign - PartialRedux - Vectorization criteria of Product - returned type of normalized - SSE integer mul	2008-04-25 23:10:37 +00:00
Gael Guennebaud	a451835bce	Make the explicit vectorization much more flexible: - support dynamic sizes - support arbitrary matrix size when the matrix can be seen as a 1D array (except for fixed size matrices where the size in bytes must be a multiple of 16, this is to allow compact storage of a vector of matrices) Note that the explicit vectorization is still experimental and far from being completely tested.	2008-04-25 15:46:18 +00:00
Gael Guennebaud	30d47b5250	forgot to add a file in the previous commit	2008-04-24 20:25:55 +00:00
Gael Guennebaud	9385793f71	Fix a couple of issues with the vectorization. In particular, default ei_p* functions are provided to handle unsupported types seamlessly. Added a generic null-ary expression with null-ary functors. They replace Zero, Ones, Identity and Random.	2008-04-24 18:35:39 +00:00
Benoit Jacob	6ae037dfb5	give up on OpenMP... for now	2008-04-18 07:57:46 +00:00
Benoit Jacob	acfd6f3bda	- add _packetCoeff() to Inverse, allowing vectorization. - let Inverse take template parameter MatrixType instead of ExpressionType, in order to reduce executable code size when taking inverses of xpr's. - introduce ei_corrected_matrix_flags : the flags template parameter to the Matrix class is only a suggestion. This is also useful in ei_eval.	2008-04-16 07:18:27 +00:00
Benoit Jacob	43e2bc14fe	+5% optimization in 4x4 inverse: -only evaluate block expressions for which that is beneficial -don't check for invertibility unless requested	2008-04-15 20:39:27 +00:00
Benoit Jacob	6747b45ae7	for 4x4 matrices implement the special algorithm that Markos proposed, falling back to the general algorithm in the bad case.	2008-04-15 20:15:36 +00:00
Benoit Jacob	2a86f052a5	- optimized determinant calculations for small matrices (size <= 4) (only 30 muls for size 4) - rework the matrix inversion: now using cofactor technique for size<=3, so the ugly unrolling is only used for size 4 anymore, and even there I'm looking to get rid of it.	2008-04-14 17:07:12 +00:00
Benoit Jacob	9789c04467	when evaluating an xpr, the result can now be vectorizable even if the xpr itself wasn't vectorizable.	2008-04-14 08:55:12 +00:00
Benoit Jacob	ea3ccb1e8c	* Start of the LU module, with matrix inversion already there and fully optimized. * Even if LargeBit is set, only parallelize for large enough objects (controlled by EIGEN_PARALLELIZATION_TRESHOLD).	2008-04-14 08:20:24 +00:00
Benoit Jacob	ab4046970b	* Add fixed-size template versions of corner(), start(), end(). * Use them to write an unrolled path in echelon.cpp, as an experiment before I do this LU module. * For floating-point types, make ei_random() use an amplitude of 1.	2008-04-12 17:37:27 +00:00
Benoit Jacob	dcebc46cdc	- cleaner use of OpenMP (no code duplication anymore) using a macro and _Pragma. - use OpenMP also in cacheOptimalProduct and in the vectorized paths as well - kill the vector assignment unroller. implement in operator= the logic for assigning a row-vector in a col-vector. - CMakeLists support for building tests/examples with -fopenmp and/or -msse2 - updates in bench/, especially replace identity() by ones() which prevents underflows from perturbing bench results.	2008-04-11 14:28:42 +00:00
Benoit Jacob	7bee90a62a	Merge Gael's experimental OpenMP parallelization support into Assign.h.	2008-04-11 08:18:47 +00:00
Gael Guennebaud	187b1543ce	added a vectorized version of Product::_cacheOptimalProduct, added the possibility to disable the vectorization using EIGEN_DONT_VECTORIZE (some architectures have SSE support by default)	2008-04-10 12:34:22 +00:00
Benoit Jacob	613c49b475	* add typedefs for matrices/vectors with LargeBit * add -pedantic to CXXFLAGS * cleanup intricate expressions with && and \|\| which gave warnings because of "missing" parentheses * fix compile error in NumTraits, apparently discovered by -pedantic	2008-04-10 10:33:50 +00:00
Benoit Jacob	ca448d2537	split those files in util/ some more renaming	2008-04-10 09:41:13 +00:00
Benoit Jacob	9d8876ce82	* rename XprCopy -> Nested * rename OperatorEquals -> Assign * move Util.h and FwDecl.h to a util/ subdir	2008-04-10 09:01:28 +00:00
Gael Guennebaud	212da8ffe0	fix priority operator bugs in the computation of the VectorizableBit flag, now benchmark.cpp is properly vectorized	2008-04-09 18:24:13 +00:00
Gael Guennebaud	8f957564ec	a better bugfix in ei_matrix_operator_equals_packet_unroller	2008-04-09 18:04:26 +00:00
Gael Guennebaud	d95d952e92	bugfix in ei_matrix_operator_equals_packet_unroller	2008-04-09 17:44:59 +00:00
Gael Guennebaud	1985fb0551	Added initial experimental support for explicit vectorization. Currently only the following platform/operations are supported: - SSE2 compatible architecture - compiler compatible with Intel's SSE2 intrinsics - float, double and int data types - fixed size matrices with a storage major dimension multiple of 4 (or 2 for double) - scalar-matrix product, component wise: +,-,*,min,max - matrix-matrix product only if the left matrix is vectorizable and column major or the right matrix is vectorizable and row major, e.g.: a.transpose() * b is not vectorized with the default column major storage. To use it you must define EIGEN_VECTORIZE and EIGEN_INTEL_PLATFORM.	2008-04-09 12:31:55 +00:00
Benoit Jacob	4920f2011e	finish making use of CoeffReadCost and the new XprCopy everywhere that seems appropriate to me.	2008-04-08 14:15:01 +00:00
Benoit Jacob	371d302efb	- merge ei_xpr_copy and ei_eval_if_needed_before_nesting - make use of CoeffReadCost to determine when to unroll the loops, for now only in Product.h and in OperatorEquals.h. Performance remains the same: generally still not as good as before the big changes.	2008-04-06 18:01:03 +00:00
Benoit Jacob	30ec34de36	fix compilation (finish removal of EIGEN_UNROLLED_LOOPS)	2008-04-05 14:20:30 +00:00
Benoit Jacob	61e58cf602	fixes as discussed with Gael on IRC. Mainly, in Fuzzy.h, and Dot.h, use ei_xpr_copy to evaluate args when needed. Had to introduce an ugly trick with ei_unref as when the XprCopy type is a reference one can't directly access member typedefs such as Scalar.	2008-04-05 14:15:02 +00:00
Gael Guennebaud	b4a156671f	* make use of the EvalBeforeNestingBit and EvalBeforeAssigningBit in ei_xpr_copy and operator=, respectively. * added Matrix::lazyAssign() when EvalBeforeAssigningBit must be skipped (mainly internal use only) * all expressions are now stored by const reference * added Temporary xpr: .temporary() must be called on any temporary expression not directly returned by a function (mainly internal use only) * moved all functors in the Functors.h header * added some preliminary stuff for the explicit vectorization	2008-04-05 11:10:54 +00:00
Gael Guennebaud	048910caae	* added cwise comparisons * added "all" and "any" special redux operators * added support for bool matrices * added support for cost model of STL functors via ei_functor_traits (by default ei_functor_traits queries the functor member Cost)	2008-04-03 18:13:27 +00:00
Benoit Jacob	249dc4f482	current state of the mess. One line fails in the tests, and useless copies are made when evaluating nested expressions. Changes: - kill LazyBit, introduce EvalBeforeNestingBit and EvalBeforeAssigningBit - product and random don't evaluate immediately anymore - eval() always evaluates - change the value of Dynamic to some large positive value, in preparation of future simplifications	2008-04-03 16:54:19 +00:00
Benoit Jacob	b8900d0b80	More clever evaluation of arguments: now it occurs earlier, in operator*, before the Product<> type is constructed. This resets template depth on each intermediate evaluation, and gives simpler code. Introducing ei_eval_if_expensive<Derived, n> which evaluates Derived if it's worth it given that each of its coeffs will be accessed n times. Operator* uses this with adequate values of n to evaluate args exactly when needed.	2008-04-03 14:17:56 +00:00
Gael Guennebaud	4448f2620d	fix a compilation issue with gcc-3.3 and ei_result_of	2008-04-03 12:39:39 +00:00


318 Commits