eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-21 07:19:46 +08:00

Author	SHA1	Message	Date
Gael Guennebaud	8233de8b69	various minor updates in the benchmark suite like non inlining of some functions as well as the experimental C code used to design efficient eigen's matrix vector products.	2008-07-12 12:14:08 +00:00
Gael Guennebaud	b7bd1b3446	Add a very efficient evaluation path for both col-major matrix * vector and vector * row-major products. Currently, it is enabled only is the matrix has DirectAccessBit flag and the product is "large enough". Added the respective unit tests in test/product/cpp.	2008-07-12 12:12:02 +00:00
Gael Guennebaud	6f71ef8277	resurrected tvmet, added mt4, intel's MKL and handcoded vectorized backends in the benchmark suite	2008-07-10 18:28:50 +00:00
Benoit Jacob	2b53fd4d53	some performance fixes in Assign.h reported by Gael. Some doc update in Cwise.	2008-07-10 16:15:55 +00:00
Gael Guennebaud	7b4c6b8862	in BTL: a specific bench/action can be selected at runtime, e.g.: BTL_CONFIG="-a ata" ctest -V -R eigen run the all benchmarks having "ata" in their name for all libraries matching the regexp "eigen"	2008-07-09 22:35:11 +00:00
Gael Guennebaud	c9b046d5d5	* added optimized paths for matrix-vector and vector-matrix products (using either a cache friendly strategy or re-using dot-product vectorized implementation) * add LinearAccessBit to Transpose	2008-07-09 22:30:18 +00:00
Benoit Jacob	25904802bc	raah, results were corrupted by overflow. Now slice vectorization is about a +25% speedup which is still nice as i expected zero or even negative benefit.	2008-07-09 16:46:26 +00:00
Benoit Jacob	8f21a5e862	add benchmark for slice vectorization... expected it to be little or zero benefit... turns out to be 20x speedup. Something is wrong.	2008-07-09 16:43:11 +00:00
Gael Guennebaud	28539e7597	imported a reworked version of BTL (Benchmark for Templated Libraries). the modifications to initial code follow: * changed build system from plain makefiles to cmake * added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces * added "transposed matrix * vector" product action * updated blitz interface to use condensed products instead of hand coded loops * removed some deprecated interfaces * changed default storage order to column major for all libraries * new generic bench timer strategy which is supposed to be more accurate * various code clean-up	2008-07-09 14:04:48 +00:00
Gael Guennebaud	5f55ab524c	* added a lazyAssign overload skipping .lazy() such that c = (<xpr>).lazy() such that lazyAssign overloads of <xpr> are automatically called (this also reduces assign instansiations)	2008-07-09 13:54:21 +00:00
Gael Guennebaud	783eb6da9b	I forgot that the previous commit needed minor changes outside the bench folder	2008-07-08 17:25:58 +00:00
Gael Guennebaud	77a622f2bb	add Cholesky and eigensolver benchmark	2008-07-08 17:20:17 +00:00
Benoit Jacob	6f09d3a67d	- many updates after Cwise change - fix compilation in product.cpp with std::complex - fix bug in MatrixBase::operator!=	2008-07-08 07:56:01 +00:00
Benoit Jacob	f5791eeb70	the big Array/Cwise rework as discussed on the mailing list. The new API can be seen in Eigen/src/Core/Cwise.h.	2008-07-08 00:49:10 +00:00
Gael Guennebaud	c910c517b3	fix issues in previously added additionnal product tests	2008-07-06 19:02:03 +00:00
Benoit Jacob	a9d319d44f	* do the ActualPacketAccesBit change as discussed on list * add comment in Product.h about CanVectorizeInner * fix typo in test/product.cpp	2008-07-04 12:43:55 +00:00
Gael Guennebaud	8463b7d3f4	* fix compilation issue in Product * added some tests for product and swap * overload .swap() for dynamic-sized matrix of same size	2008-07-02 16:05:33 +00:00
Gael Guennebaud	9433df83a7	* resurected Flagged::_expression used to optimize m+=(ab).lazy() (equivalent to the GEMM blas routine) added a GEMM benchmark	2008-07-01 16:20:06 +00:00
Benoit Jacob	95549007b3	* fix error in divergence test, now it is even faster * add comments in render() in case anyone ever reads that :P	2008-07-01 14:23:01 +00:00
Benoit Jacob	a356ebd47d	interleaved rendering balances the load better	2008-07-01 14:12:32 +00:00
Benoit Jacob	56d03f181e	* multi-threaded rendering * increased number of iterations, with more iterations done before testing divergence. results in x2 speedup from vectorization.	2008-07-01 12:01:58 +00:00
Benoit Jacob	cacf986a7f	- use double precision to store the position / zoom / other stuff - some temporary fix to get a +50% improvement from vectorization until we have vectorisation for comparisons and redux	2008-06-30 07:33:08 +00:00
Gael Guennebaud	37a50fa526	* added an in-place version of inverseProduct which might be twice faster fot small fixed size matrix * added a sparse triangular solver (sparse version of inverseProduct) * various other improvements in the Sparse module	2008-06-29 21:29:12 +00:00
Benoit Jacob	fbdecf09e1	fix little bug in computation of max_iter	2008-06-29 12:20:07 +00:00
Benoit Jacob	97a1038653	improve greatly mandelbrot demo: - much better coloring - determine max number of iterations and choice between float and double at runtime based on zoom level - do draft renderings with increasing resolution before final rendering	2008-06-29 12:04:00 +00:00
Gael Guennebaud	027818d739	* added innerSize / outerSize functions to MatrixBase * added complete implementation of sparse matrix product (with a little glue in Eigen/Core) * added an exhaustive bench of sparse products including GMM++ and MTL4 => Eigen outperforms in all transposed/density configurations !	2008-06-28 23:07:14 +00:00
Benoit Jacob	6917be9113	add mandelbrot demo	2008-06-28 20:33:47 +00:00
Benoit Jacob	55e08f7102	fix breakage from my last commit	2008-06-28 17:15:16 +00:00
Benoit Jacob	844f69e4a9	* update CMakeLists, only build instantiations if TEST_LIB is defined * allow default Matrix constructor in dynamic size, defaulting to (1, 1), this is convenient in mandelbrot example.	2008-06-27 10:53:30 +00:00
Benoit Jacob	6de4871c8c	fix a couple of issues in the new Map.h	2008-06-27 01:42:44 +00:00
Benoit Jacob	e27b2b95cf	* rework Map, allow vectorization * rework PacketMath and DummyPacketMath, make these actual template specializations instead of just overriding by non-template inline functions * introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix * remove Matrix::map() methods, use Map constructors instead.	2008-06-27 01:22:35 +00:00
Gael Guennebaud	e5d301dc96	various work on the Sparse module: * added some glue to Eigen/Core (SparseBit, ei_eval, Matrix) * add two new sparse matrix types: HashMatrix: based on std::map (for random writes) LinkedVectorMatrix: array of linked vectors (for outer coherent writes, e.g. to transpose a matrix) * add a SparseSetter class to easily set/update any kind of matrices, e.g.: { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix); for (...) wrapper->coeffRef(rand(),rand()) = rand(); } * automatic shallow copy for RValue * and a lot of mess ! plus: * remove the remaining ArrayBit related stuff * don't use alloca in product for very large memory allocation	2008-06-26 23:22:26 +00:00
Benoit Jacob	c5bd1703cb	change derived classes methods from "private:_method()" to "public:method()" i.e. reimplementing the generic method() from MatrixBase. improves compilation speed by 7%, reduces almost by half the call depth of trivial functions, making gcc errors and application backtraces nicer...	2008-06-26 20:08:16 +00:00
Benoit Jacob	25ba9f377c	* add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) * introduce packet(int), make use of it in linear vectorized paths --> completely fixes the slowdown noticed in benchVecAdd. * generalize coeff(int) to linear-access xprs * clarify the access flag bits * rework api dox in Coeffs.h and util/Constants.h * improve certain expressions's flags, allowing more vectorization * fix bug in Block: start(int) and end(int) returned dyndyn size fix bug in Block: just because the Eval type has packet access doesn't imply the block xpr should have it too.	2008-06-26 16:06:41 +00:00
Benoit Jacob	5b0da4b778	make use of ei_pmadd in dot-product: will further improve performance on architectures having a packed-mul-add assembly instruction.	2008-06-24 18:08:35 +00:00
Benoit Jacob	3b94436d2f	* vectorize dot product, copying code from sum. * make the conj functor vectorizable: it is just identity in real case, and complex doesn't use the vectorized path anyway. * fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size) should not be vectorizable, since in fixed-size we are assuming the size to be a multiple of packet size. (Or would you prefer Vector3d to be flagged "packetaccess" even though no packet access is possible on vectors of that type?) * rename: isOrtho for vectors ---> isOrthogonal isOrtho for matrices ---> isUnitary * add normalize() * reimplement normalized with quotient1 functor	2008-06-24 15:13:00 +00:00
Benoit Jacob	c9560df4a0	* add ei_pdiv intrinsic, make quotient functor vectorizable * add vdw benchmark from Tim's real-world use case	2008-06-23 22:00:18 +00:00
Gael Guennebaud	ac9aa47bbc	optimize linear vectorization both in Assign and Sum (optimal amortized perf)	2008-06-23 15:50:28 +00:00
Gael Guennebaud	ea1990ef3d	add experimental code for sparse matrix: - uses the common "Compressed Column Storage" scheme - supports every unary and binary operators with xpr template assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse matrix doesnot make sense) - this is the first commit, so of course, there are still several shorcommings !	2008-06-23 13:25:22 +00:00
Benoit Jacob	03d19f3bae	quick temporary fix for a perf issue we just identified with vectorization.... now the sum benchmark runs 3x faster with vectorization than without.	2008-06-23 11:23:05 +00:00
Benoit Jacob	32596c5e9e	add benchmark for sum	2008-06-23 11:03:27 +00:00
Benoit Jacob	dc9206cec5	split sum away from redux and vectorize it. (could come back to redux after it has been vectorized, and could serve as a starting point for that) also make the abs2 functor vectorizable (for real types).	2008-06-23 10:32:48 +00:00
Benoit Jacob	8a967fb17c	* implement slice vectorization. Because it uses unaligned packet access, it is not certain that it will bring a performance improvement: benchmarking needed. * improve logic choosing slice vectorization. * fix typo in SSE packet math, causing crash in unaligned case. * fix bug in Product, causing crash in unaligned case. * add TEST_SSE3 CMake option.	2008-06-22 15:02:05 +00:00
Gael Guennebaud	8cef541b5a	forgot to add the unit test array.cpp	2008-06-21 17:28:07 +00:00
Gael Guennebaud	32c5ea388e	work on rotations in the Geometry module: - convertions are done trough constructors and operator= - added a EulerAngles class	2008-06-21 15:01:49 +00:00
Benoit Jacob	574416b842	Override MatrixBase::eval() since matrices don't need to be evaluated, it is enough to just read them.	2008-06-20 15:26:39 +00:00
Gael Guennebaud	54238961d6	* added a pseudo expression Array giving access to: - matrix-scalar addition/subtraction operators, e.g.: m.array() += 0.5; - matrix/matrix comparison operators, e.g.: if (m1.array() < m2.array()) {} * fix compilation issues with Transform and gcc < 4.1	2008-06-20 12:38:03 +00:00
Gael Guennebaud	e735692e37	move "enum" back to "const int" int ei_assign_impl: in fact, casting enums to int is enough to get compile time constants with ICC.	2008-06-20 07:10:50 +00:00
Gael Guennebaud	fb4a151982	* more cleaning in Product * make Matrix2f (and similar) vectorized using linear path * fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4 (cannot get Transform compiles with gcc 3.3/3.4, see the FIXME)	2008-06-19 23:00:51 +00:00
Gael Guennebaud	82c3cea1d5	* refactoring of Product: * use ProductReturnType<>::Type to get the correct Product xpr type * Product is no longer instanciated for xpr types which are evaluated * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix) * some cleanning * removed ArrayBase	2008-06-19 17:33:57 +00:00

1 2 3 4 5 ...

394 Commits