Commit Graph

325 Commits

Author SHA1 Message Date
Gael Guennebaud
4161b8be67 sync 2010-07-10 22:58:51 +02:00
Thomas Capricelli
551cb9b7b4 bench: use of Eigen/Array is deprecated + fix includes for iostream 2010-07-09 03:59:36 +02:00
Gael Guennebaud
02fd3acd81 update to support mixin types 2010-07-07 19:49:48 +02:00
Gael Guennebaud
861962c55f sync 2010-07-07 16:44:05 +02:00
Gael Guennebaud
0f2d480af0 add support for complex 2010-07-07 16:41:29 +02:00
Gael Guennebaud
e1eccfad3f add initial support for the vectorization of complex<float> 2010-07-05 16:18:09 +02:00
Konstantinos Margaritis
1daf9b11ba check for !x86 platforms, otherwise the BTL benchmark doesn't compile on arm/powerpc 2010-07-05 16:42:11 +03:00
Gael Guennebaud
f096452dfd Fix cache computation on old Intel CPUs which do not
support the cpuid function 0x4
2010-06-27 00:17:38 +02:00
Gael Guennebaud
5e7bd967cc add the manual Intel's way to query cache info 2010-06-26 23:37:42 +02:00
Gael Guennebaud
78d3c54631 add a small bench demoing the possibilities of a direct 3x3 eigen decomposition 2010-07-18 17:26:06 +02:00
Gael Guennebaud
2a820d41df finish/fix level1 blas, all test pass 2010-07-17 13:49:43 +02:00
Gael Guennebaud
f59226e901 fix compilation of blas lib 2010-07-16 22:27:24 +02:00
Gael Guennebaud
6a370f50c7 MPRealSupport was missing 2010-07-15 20:45:45 +02:00
Gael Guennebaud
e4f3759c4d add a bench for quaternion multiplication 2010-07-13 13:29:35 +02:00
Gael Guennebaud
931027f31b add a utility to debug cpuid, and make sure we get 0 if we query an unsupported cpuid function 2010-06-26 23:15:06 +02:00
Gael Guennebaud
28e64b0da3 email change 2010-06-24 23:21:58 +02:00
Gael Guennebaud
002f7114d1 add support for oski 2010-06-24 23:21:45 +02:00
Gael Guennebaud
98fec45d3c btl: add a trmm action and update eigen interface 2010-06-23 22:10:49 +02:00
Gael Guennebaud
b284bb8bba add a spmv mini benchmark for Eigen, GMM++, ublas, mtl4, and oski 2010-06-22 21:39:55 +02:00
Gael Guennebaud
fd9a9fa0ae slightly optimize computeProductBlockingSizes by explicitly precomputing what is known at compile time 2010-06-22 11:10:38 +02:00
Gael Guennebaud
98686ab86c fix in case we don't know how to query the L1/L2 cache sizes 2010-06-21 23:44:20 +02:00
Gael Guennebaud
0212eec23f simplify and optimize block sizes computation for matrix products. They
are now automatically computed from the L1 and L2 cache sizes which are
themselves automatically determined at runtime.
2010-06-21 23:28:50 +02:00
Gael Guennebaud
4cd38b333c make bench_gemm print out the queried cache sizes 2010-06-21 12:07:05 +02:00
Gael Guennebaud
6db6e358f5 add the possibility to set the cache size at runtime 2010-06-18 23:25:57 +02:00
Gael Guennebaud
5b192930b6 add runtime API to control multithreading 2010-06-10 23:30:15 +02:00
Gael Guennebaud
0116261407 make BenchTimer compatible with 2.0 branch 2010-06-01 13:57:38 +02:00
Benoit Jacob
abbe260905 remove USING_PART_OF_NAMESPACE_EIGEN, leaving it in Eigen2Support.
improve porting-Eigen2-to-3 docs
2010-04-22 18:27:13 -04:00
Hauke Heibel
51b0159c96 Fixed line endings. 2010-03-05 18:11:54 +01:00
Gael Guennebaud
f2a246c225 add a small program to bench all combinations of small products 2010-03-05 17:16:19 +01:00
Gael Guennebaud
c442208358 clean a bit the bench_gemm files 2010-03-05 11:35:43 +01:00
Gael Guennebaud
24ef5fedcd minor cleaning 2010-03-05 09:57:04 +01:00
Gael Guennebaud
cefd9b8888 merge with default branch 2010-03-04 18:47:52 +01:00
Gael Guennebaud
b0ffd9bf04 clean #defined tokens, and use clock_gettime for the real time 2010-03-03 09:41:29 +01:00
Eamon Nerbonne
ff6b94d6d0 BenchTimer: avoid warning about symbol redefinition on win32, and include <Eigen/Core> (required to compile) 2010-03-02 08:46:11 +01:00
Gael Guennebaud
1710c07f63 remove Qt's atomic dependency, I don't know what I was doing wrong... 2010-03-01 13:09:47 +01:00
Gael Guennebaud
aeff3ff391 make Aron's idea work using Qt's atomic implementation for the synchronisation 2010-03-01 10:57:32 +01:00
Gael Guennebaud
ac425090f3 BTL: allow to bench real time 2010-02-26 14:57:49 +01:00
Gael Guennebaud
c05047d28e fix some BTL issues 2010-02-26 12:51:20 +01:00
Gael Guennebaud
3ac2b96a2f implement a smarter parallelization strategy for gemm avoiding multiple
packing of the same data
2010-02-26 12:32:00 +01:00
Gael Guennebaud
68eaefa5d4 update BTL (better timer, eigen2 => eigen3, etc) 2010-02-23 18:23:12 +01:00
Gael Guennebaud
3beedba244 merge 2010-02-22 21:32:29 +01:00
Thomas Capricelli
d3b314569b provide default values for CXX, remove duplicate define 2010-02-22 15:39:17 +01:00
Hauke Heibel
3e6ab8f93b ups 2010-02-22 11:34:25 +01:00
Hauke Heibel
d5af5ab92b Added getRealTime() for windows. 2010-02-22 11:23:27 +01:00
Gael Guennebaud
f797ba0abe extend the bench timer to allow benchmarking of parallel code,
improvements are welcome
2010-02-22 11:04:35 +01:00
Gael Guennebaud
801440c519 fix BTL's eigen interface
(transplanted from 437f40acc1)
2010-02-22 09:32:16 +01:00
Gael Guennebaud
eb905500b6 significant speedup in the matrix-matrix products 2010-02-23 13:06:49 +01:00
Gael Guennebaud
d579d4cc37 oops 2010-02-22 17:57:15 +01:00
Hauke Heibel
6730fd9f3f Port BenchTimer fix. 2010-02-22 11:42:58 +01:00
Gael Guennebaud
4ba25a8d5c merge 2010-02-22 11:30:36 +01:00
Gael Guennebaud
aaaf855a88 add a small benchmark to quickly bench/compare SMP support 2010-02-22 11:09:57 +01:00
Gael Guennebaud
437f40acc1 fix BTL's eigen interface 2010-02-22 09:32:16 +01:00
Mark Borgerding
f200c84d9f merge 2010-02-16 21:41:04 -05:00
Mark Borgerding
7a6cb2a39c added benchmark for unscaled and half-spectrum FFTs 2010-01-21 21:09:26 -05:00
Gael Guennebaud
905050b239 extend sparse product benchmark with ublas 2010-02-09 15:55:36 +01:00
Gael Guennebaud
c3823dce72 extend benchmark for sparse products 2010-01-05 16:03:35 +01:00
Benoit Jacob
39ac57fa6d Big renaming:
start ---> head
  end   ---> tail
Much frustration with sed syntax. Need to learn perl some day.
2010-01-04 21:24:43 -05:00
Benoit Jacob
25f8adfa6c * Fix bug #79: ei_alignmentOffset was assuming that ptr is multiple of
sizeof(Scalar), and that assumption breaks with double on linux x86-32.
* Rename ei_alignmentOffset to ei_first_aligned
* Rewrite its documentation and part of its body
* The variant taking a MatrixBase doesn't need a separate size argument.
2010-01-02 12:38:16 -05:00
Gael Guennebaud
36969cc2a5 add a slerp benchmark (for accuracy and speed) 2009-12-04 15:02:38 +01:00
Hauke Heibel
1fc5fdea25 Added missing typedef (will I ever learn it!?)
Removed unsupported directories that do not provide CMakeLists.txt (CMake 2.8 warning).
The BenchTimer is now also working on Cygwin.
2009-12-01 09:20:05 +01:00
Benoit Jacob
92749eed11 * merge
* remove a ctor in QuaternionBase as it gives a strange error with GCC 4.4.2.
2009-11-09 09:08:03 -05:00
Gael Guennebaud
6647a58847 update product bench 2009-11-06 11:33:18 +01:00
Mark Borgerding
0fa68b9e50 switched to BenchUtil.h 2009-10-30 19:46:45 -04:00
Benoit Jacob
a2268ca6b3 properly implement BenchTimer on POSIX
(may require a platform check for the clock name on non-linux platforms)
2009-10-29 15:47:56 -04:00
Benoit Jacob
e8dd552257 sync with mainline 2009-10-28 19:06:45 -04:00
Benoit Jacob
2840ac7e94 big huge changes, so i dont remember everything.
* renaming, e.g. LU ---> FullPivLU
* split tests framework: more robust, e.g. dont generate empty tests if a number is skipped
* make all remaining tests use that splitting, as needed.
* Fix 4x4 inversion (see stable branch)
* Transform::inverse() and geo_transform test : adapt to new inverse() API, it was also trying to instantiate inverse() for 3x4 matrices.
* CMakeLists: more robust regexp to parse the version number
* misc fixes in unit tests
2009-10-28 18:19:29 -04:00
Mark Borgerding
0167f5ef43 added inline to many functions 2009-10-22 23:06:19 -04:00
Mark Borgerding
902b6dcd6c added Eigen::FFT and
Eigen::Complex
2009-10-20 21:33:48 -04:00
Hauke Heibel
5e3e6ff71a Added Windows support to the BenchTimer. 2009-10-20 22:08:13 +02:00
Mark Borgerding
d9b418bf12 merged eigen2_for_fft into eigen2 mainline 2009-10-20 15:18:01 -04:00
Gael Guennebaud
8f3e33581e extend the sparse matrix assembly benchmark 2009-10-07 14:25:53 +02:00
Gael Guennebaud
0b60027f3c implement __gnuc_forget_about_setZero_its_over_now 2009-09-18 15:36:05 +02:00
Gael Guennebaud
d2becb9612 add a "rot" benchmark in BTL 2009-08-15 10:19:16 +02:00
Gael Guennebaud
c2a92e92a6 add ger and lu with partial pivoting in BTL 2009-08-04 11:30:33 +02:00
Gael Guennebaud
3cf5bb31f6 * Bye bye MultiplierBase, extend a bit AnyMatrixBase to allow =, +=, and -=
* This probably makes ReturnByValue needless
2009-08-03 16:05:15 +02:00
Gael Guennebaud
54804eb626 synch with main branch 2009-07-28 17:35:07 +02:00
Gael Guennebaud
de8b795895 compilation fixes in BTL 2009-07-28 17:10:34 +02:00
Gael Guennebaud
32b08ac971 re-implement stableNorm using a homemade blocky and
vectorization friendly algorithm (slow if no vectorization)
2009-07-17 16:22:39 +02:00
Gael Guennebaud
15ed32dd6e add other stable norm impl. in the benchmark 2009-07-16 16:21:26 +02:00
Gael Guennebaud
525da6a464 bugfix in blueNorm 2009-07-16 14:20:36 +02:00
Gael Guennebaud
65fc70b750 add a benchmark for the different norms 2009-07-16 11:33:56 +02:00
Gael Guennebaud
c49d1fd2b5 add a partial LU bench in BTL 2009-06-04 18:16:54 +02:00
Mark Borgerding
1c54340174 more work on ei_fftw_impl 2009-05-31 15:44:57 -04:00
Mark Borgerding
09b4733255 added real-optimized inverse FFT (NFFT must be multiple of 4) 2009-05-25 23:52:21 -04:00
Mark Borgerding
210092d16c changed name from simple_fft_traits to ei_kissfft_impl 2009-05-25 20:35:24 -04:00
Mark Borgerding
326ea77390 added FFT inverse complex-to-scalar interface (not yet optimized) 2009-05-23 22:50:07 -04:00
Mark Borgerding
9c0fcd0f62 started real optimization, added benchmark for FFT 2009-05-23 10:09:48 -04:00
Benoit Jacob
6347b1db5b remove sentence "Eigen itself is part of the KDE project."
it never made very precise sense. but now does it still make any?
2009-05-22 20:25:33 +02:00
Gael Guennebaud
caa1ef7515 various BTL updates (disable Cholesky for MTL, add new plot settings,
etc)
2009-03-09 23:28:46 +00:00
Gael Guennebaud
bd8107c90c forgot to include a file in previous commit 2009-03-09 14:18:29 +00:00
Gael Guennebaud
8a424efb11 add an option to bench eigen without GCC's auto vec (might conflict with
Eigen's auto vec)
2009-03-09 14:16:05 +00:00
Gael Guennebaud
d710ccd41e BTL: add syr2 action 2009-03-05 08:15:32 +00:00
Gael Guennebaud
a72ff5abc1 BTL: - patch from Victor (add ACML support)
- fix overflow issues
2009-03-05 08:11:47 +00:00
Gael Guennebaud
45136ac3b6 various updates of BTL 2009-03-04 07:21:17 +00:00
Gael Guennebaud
7485aa6d57 add symv bench 2009-02-20 21:05:19 +00:00
Gael Guennebaud
19b035ee11 s/cholesky/llt in precompiled lib and BTL 2009-02-06 14:01:01 +00:00
Gael Guennebaud
cc90495e30 add bench_reverse, draft of a reverse vectorization for AltiVec, make
global Scaling function static
2009-02-06 13:28:55 +00:00
Gael Guennebaud
b0dd22cc72 update cholesky benchmark 2009-02-03 19:05:10 +00:00
Gael Guennebaud
dde729379a various updates in the (still messy) sparse benchmarks 2009-01-28 20:32:28 +00:00
Gael Guennebaud
f645d1f911 * complete the support of QVector via a QtAlignedMalloc header
* add a unit test for QVector which shows the issue with QVector::fill
2009-01-20 16:50:47 +00:00
Gael Guennebaud
0c7974dd4d bugfix in Map by Keir Mierle 2009-01-18 09:53:06 +00:00
Gael Guennebaud
22792c696f add ublas vector of vector in sparse setter bench 2009-01-17 16:24:49 +00:00
Gael Guennebaud
cc6c4d807b add a sparse setter bench 2009-01-17 14:05:01 +00:00
Gael Guennebaud
f5741d4277 add a sparse * dense_vector bench 2009-01-14 18:27:17 +00:00
James Richard Tyrer
28e15574df updating FindEigen2.cmake for proper search order 2009-01-11 16:18:59 +00:00
Benoit Jacob
1d52bd4cad the big memory changes. the most important changes are:
ei_aligned_malloc now really behaves like a malloc
 (untyped, doesn't call ctor)
ei_aligned_new is the typed variant calling ctor
EIGEN_MAKE_ALIGNED_OPERATOR_NEW now takes the class name as parameter
2009-01-08 15:20:21 +00:00
Benoit Jacob
be64619ab6 * require CMake 2.6.2 everywhere, Alexander Neundorf says it'd make it
easier to have a uniform requirement in kdesupport for when he makes
fixes.
* add eigen versioning macros
2009-01-04 16:19:12 +00:00
Benoit Jacob
15ca6659ac * the 4th template param of Matrix is now Options. One bit for storage
order, one bit for enabling/disabling auto-alignment. If you want to
disable, do:
Matrix<float,4,1,Matrix_DontAlign>
The Matrix_ prefix is the only way I can see to avoid
ambiguity/pollution. The old RowMajor, ColMajor constants are
deprecated, remain for now.
* this prompted several improvements in matrix_storage. ei_aligned_array
renamed to ei_matrix_array and moved there. The %16==0 tests are now
much more centralized in 1 place there.
* unalignedassert test: updated
* update FindEigen2.cmake from KDElibs
* determinant test: use VERIFY_IS_APPROX to fix false positives; add
testing of 1 big matrix
2009-01-04 15:26:32 +00:00
Benoit Jacob
164f410bb5 * make WithAlignedOperatorNew always align even when vectorization is disabled
* make ei_aligned_malloc and ei_aligned_free honor custom operator new and delete
2008-12-30 14:11:35 +00:00
Benoit Jacob
9e00d94543 * the Upper->UpperTriangular change
* finally get ei_add_test right
2008-12-20 13:36:12 +00:00
Gael Guennebaud
8679d895d3 various MSVC fixes in BTL 2008-12-19 15:31:47 +00:00
Gael Guennebaud
93f8d56789 improved MSVC support in cmake files (SSE) 2008-12-18 09:07:36 +00:00
Benoit Jacob
00f89a8f37 Update e-mail address 2008-11-24 13:40:43 +00:00
Gael Guennebaud
86ccd99d8d Several improvements in sparse module:
* add a LDL^T factorization with solver using code from T. Davis's LDL
  library (LGPL2.1+)
* various bug fixes in triangular solver, matrix product, etc.
* improve cmake files for the supported libraries
* split the sparse unit test
* etc.
2008-11-05 13:47:55 +00:00
Gael Guennebaud
0c5a09d93f some cleaning and doc in ParametrizedLine and HyperPlane
Just a thought: what about ParamLine instead of the verbose ParametrizedLine ?
2008-10-25 00:08:52 +00:00
Gael Guennebaud
cf0f82ecbe sparse module:
- remove some useless stuff => let's focus on a single sparse matrix format
 - finalize the new RandomSetter
2008-10-21 13:35:04 +00:00
Gael Guennebaud
9e02e42ff6 add the bench file for the RandomSetter 2008-10-21 00:05:45 +00:00
Gael Guennebaud
3a231c2349 sparse module: add support for umfpack, the sparse direct LU
solver from suitesparse (as cholmod). It seems to be even faster
than SuperLU and it was much simpler to interface ! Well,
the factorization is faster, but for the solve part, SuperLU is
quite faster. On the other hand the solve part represents only a
fraction of the whole procedure. Moreover, I bench random matrices
that do not represent real cases, and I'm not sure at all
I use both libraries with their best settings !
2008-10-19 22:44:21 +00:00
Gael Guennebaud
76fe2e1b34 add/update some benchmark files used to test/compare sparse module features 2008-10-19 17:06:11 +00:00
Gael Guennebaud
765219aa51 Big API change in Cholesky module:
* rename Cholesky to LLT
 * rename CholeskyWithoutSquareRoot to LDLT
 * rename MatrixBase::cholesky() to llt()
 * rename MatrixBase::choleskyNoSqrt() to ldlt()
 * make {LLT,LDLT}::solve() API consistent with other modules

Note that we are going to keep source compatibility until the next beta release.
E.g., the "old" Cholesky* classes, etc are still available for some time.
To be clear, Eigen beta2 should be (hopefully) source compatible with beta1,
and so beta2 will contain all the deprecated API of beta1. Those features marked
as deprecated will be removed in beta3 (or in the final 2.0 if there is no beta 3 !).

Also includes various updates in sparse Cholesky.
2008-10-13 15:53:27 +00:00
Gael Guennebaud
68fbd6f531 typos in bench/ 2008-08-29 16:10:08 +00:00
Gael Guennebaud
3e526dcdbd BTL:added trisolve action file 2008-08-26 10:46:58 +00:00
Gael Guennebaud
7ce70e1437 various updates in BTL 2008-08-25 14:23:08 +00:00
Gael Guennebaud
b13148c358 renamed inverseProduct => solveTriangular 2008-08-09 20:06:25 +00:00
Gael Guennebaud
a7a05382d1 Add a LU decomposition action in BTL and various cleaning in BTL. For instance
all per plot settings have been moved to a single file, go_mean now takes an
optional second argument "tiny" to generate plots for tiny matrices, and
output of comparison information wrt to previous benchs (if any).
2008-08-04 23:12:48 +00:00
Gael Guennebaud
e0215ee510 BTL: - added tridiagonalization and hessenberg decomposition bench
- added GOTO library
2008-07-28 20:48:21 +00:00
Gael Guennebaud
93115619c2 * updated benchmark files according to recent renamings
* various improvements in BTL including trisolver and cholesky bench
2008-07-27 11:39:47 +00:00
Gael Guennebaud
b466c266a0 * Fix some complex alignment issues in the cache friendly matrix-vector products.
* Minor update of the cores of the Cholesky algorithms to make them more friendly
  wrt to matrix-vector products => speedup x5 !
2008-07-23 17:30:00 +00:00
Benoit Jacob
62ec1dd616 * big rework of Inverse.h:
- remove all invertibility checking, will be redundant with LU
  - general case: adapt to matrix storage order for better perf
  - size 4 case: handle corner cases without falling back to gen case.
  - rationalize with selectors instead of compile time if
  - add C-style computeInverse()
* update inverse test.
* in snippets, default cout precision to 3 decimal places
* add some cmake module from kdelibs to support btl with cmake 2.4
2008-07-15 23:56:17 +00:00
Gael Guennebaud
c8cbc1665e enhancements of the plot generator:
- removed the ugly X11 and PNG gnuplots terminals
- use enhanced postscript terminal
- use imagemagick to generate the png files (with compression)
- disable the fortran impl by default since it is as meaningless as a "C impl"
- update line settings
2008-07-13 11:46:36 +00:00
Gael Guennebaud
99a625243f Optimization: added super efficient rowmajor * vector product (and vector * colmajor).
It basically performs 4 dot products at once reducing loads of the vector and improving
instructions scheduling. With 3 cache friendly algorithms, we now handle all product
configurations with outstanding perf for large matrices.
2008-07-13 01:22:54 +00:00
Benoit Jacob
51e6ee39f0 SVN_SILENT trivial fix 2008-07-12 23:42:19 +00:00
Gael Guennebaud
bd0183f850 fix a cmake issue in FindTvmet and FindMKL 2008-07-12 23:34:42 +00:00
Benoit Jacob
e979e6485f another occurrence of that little cmake fix 2008-07-12 23:27:41 +00:00
Gael Guennebaud
861d18d553 * Optimization: added a specialization of Block for xpr with DirectAccessBit
* some simplifications and fixes in cache friendly products
2008-07-12 22:59:34 +00:00
Benoit Jacob
1bbaea9885 little cmake fix 2008-07-12 22:13:03 +00:00
Gael Guennebaud
10c4e36b39 disable MKL check and fortran for cmake <2.6 2008-07-12 21:54:02 +00:00
Gael Guennebaud
ed6e07b2f6 various improvements of the plot generator in BTL 2008-07-12 21:41:32 +00:00
Gael Guennebaud
8233de8b69 various minor updates in the benchmark suite like non inlining
of some functions as well as the experimental C code used to design
efficient eigen's matrix vector products.
2008-07-12 12:14:08 +00:00
Gael Guennebaud
6f71ef8277 resurrected tvmet, added mtl4, intel's MKL and handcoded vectorized backends
in the benchmark suite
2008-07-10 18:28:50 +00:00
Gael Guennebaud
7b4c6b8862 in BTL: a specific bench/action can be selected at runtime, e.g.:
BTL_CONFIG="-a ata" ctest -V -R eigen
  run all the benchmarks having "ata" in their name for all
  libraries matching the regexp "eigen"
2008-07-09 22:35:11 +00:00
Benoit Jacob
25904802bc raah, results were corrupted by overflow. Now slice vectorization is
about a +25% speedup which is still nice as i expected zero or even
negative benefit.
2008-07-09 16:46:26 +00:00
Benoit Jacob
8f21a5e862 add benchmark for slice vectorization... expected it to be little or
zero benefit... turns out to be 20x speedup. Something is wrong.
2008-07-09 16:43:11 +00:00
Gael Guennebaud
28539e7597 imported a reworked version of BTL (Benchmark for Templated Libraries).
the modifications to initial code follow:
* changed build system from plain makefiles to cmake
* added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces
* added "transposed matrix * vector" product action
* updated blitz interface to use condensed products instead of hand coded loops
* removed some deprecated interfaces
* changed default storage order to column major for all libraries
* new generic bench timer strategy which is supposed to be more accurate
* various code clean-up
2008-07-09 14:04:48 +00:00
Gael Guennebaud
77a622f2bb add Cholesky and eigensolver benchmark 2008-07-08 17:20:17 +00:00
Benoit Jacob
6f09d3a67d - many updates after Cwise change
- fix compilation in product.cpp with std::complex
- fix bug in MatrixBase::operator!=
2008-07-08 07:56:01 +00:00
Benoit Jacob
f5791eeb70 the big Array/Cwise rework as discussed on the mailing list. The new API
can be seen in Eigen/src/Core/Cwise.h.
2008-07-08 00:49:10 +00:00
Gael Guennebaud
8463b7d3f4 * fix compilation issue in Product
* added some tests for product and swap
* overload .swap() for dynamic-sized matrix of same size
2008-07-02 16:05:33 +00:00
Gael Guennebaud
9433df83a7 * resurrected Flagged::_expression used to optimize m+=(a*b).lazy()
(equivalent to the GEMM blas routine)
* added a GEMM benchmark
2008-07-01 16:20:06 +00:00
Gael Guennebaud
37a50fa526 * added an in-place version of inverseProduct which
might be twice as fast for small fixed size matrices
* added a sparse triangular solver (sparse version
  of inverseProduct)
* various other improvements in the Sparse module
2008-06-29 21:29:12 +00:00
Gael Guennebaud
027818d739 * added innerSize / outerSize functions to MatrixBase
* added complete implementation of sparse matrix product
  (with a little glue in Eigen/Core)
* added an exhaustive bench of sparse products including GMM++ and MTL4
  => Eigen outperforms in all transposed/density configurations !
2008-06-28 23:07:14 +00:00
Gael Guennebaud
e5d301dc96 various work on the Sparse module:
* added some glue to Eigen/Core (SparseBit, ei_eval, Matrix)
* add two new sparse matrix types:
   HashMatrix: based on std::map (for random writes)
   LinkedVectorMatrix: array of linked vectors
   (for outer coherent writes, e.g. to transpose a matrix)
* add a SparseSetter class to easily set/update any kind of matrices, e.g.:
   { SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix);
     for (...) wrapper->coeffRef(rand(),rand()) = rand(); }
* automatic shallow copy for RValue
* and a lot of mess !
plus:
* remove the remaining ArrayBit related stuff
* don't use alloca in product for very large memory allocation
2008-06-26 23:22:26 +00:00
Benoit Jacob
25ba9f377c * add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned)
* introduce packet(int), make use of it in linear vectorized paths
  --> completely fixes the slowdown noticed in benchVecAdd.
* generalize coeff(int) to linear-access xprs
* clarify the access flag bits
* rework api dox in Coeffs.h and util/Constants.h
* improve certain expressions's flags, allowing more vectorization
* fix bug in Block: start(int) and end(int) returned dyn*dyn size
* fix bug in Block: just because the Eval type has packet access
  doesn't imply the block xpr should have it too.
2008-06-26 16:06:41 +00:00
Benoit Jacob
c9560df4a0 * add ei_pdiv intrinsic, make quotient functor vectorizable
* add vdw benchmark from Tim's real-world use case
2008-06-23 22:00:18 +00:00
Gael Guennebaud
ea1990ef3d add experimental code for sparse matrix:
- uses the common "Compressed Column Storage" scheme
 - supports every unary and binary operators with xpr template
   assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse
   matrix does not make sense)
 - this is the first commit, so of course, there are still several shortcomings !
2008-06-23 13:25:22 +00:00
Benoit Jacob
32596c5e9e add benchmark for sum 2008-06-23 11:03:27 +00:00
Gael Guennebaud
82c3cea1d5 * refactoring of Product:
* use ProductReturnType<>::Type to get the correct Product xpr type
  * Product is no longer instantiated for xpr types which are evaluated
  * vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix)
  * some cleaning
* removed ArrayBase
2008-06-19 17:33:57 +00:00
Benoit Jacob
c905b31b42 * Big rework of Assign.h:
** Much better organization
** Fix a few bugs
** Add the ability to unroll only the inner loop
** Add an unrolled path to the Like1D vectorization. Not well tested.
** Add placeholder for sliced vectorization. Unimplemented.

* Rework of corrected_flags:
** improve rules determining vectorizability
** for vectors, the storage-order is indifferent, so we tweak it
   to allow vectorization of row-vectors.

* fix compilation in benchmark, and a warning in Transpose.
2008-06-16 10:49:44 +00:00
Gael Guennebaud
8f1fc80a77 some documentation fixes (Cwise* and Cholesky) 2008-05-22 16:31:00 +00:00
Benoit Jacob
8c6007f80e * Patch by Konstantinos Margaritis: AltiVec vectorization.
* Fix several warnings, temporarily disable determinant test.
2008-05-03 12:21:23 +00:00
Benoit Jacob
890a8de962 Make products always eval into expressions. Improves performance
in benchmark. Still not as fast as explicit eval(), strangely.
2008-05-02 08:53:23 +00:00
Gael Guennebaud
1ec2d21ca5 Fixed a couple of issues introduced in previous commits.
Added a test for Triangular.
2008-04-26 20:28:27 +00:00
Benoit Jacob
6ae037dfb5 give up on OpenMP... for now 2008-04-18 07:57:46 +00:00
Benoit Jacob
2a86f052a5 - optimized determinant calculations for small matrices (size <= 4)
(only 30 muls for size 4)
- rework the matrix inversion: now using cofactor technique for size<=3,
  so the ugly unrolling is only used for size 4 anymore, and even there
  I'm looking to get rid of it.
2008-04-14 17:07:12 +00:00
Benoit Jacob
ab4046970b * Add fixed-size template versions of corner(), start(), end().
* Use them to write an unrolled path in echelon.cpp, as an
  experiment before I do this LU module.
* For floating-point types, make ei_random() use an amplitude
  of 1.
2008-04-12 17:37:27 +00:00
Benoit Jacob
dcebc46cdc - cleaner use of OpenMP (no code duplication anymore)
using a macro and _Pragma.
- use OpenMP also in cacheOptimalProduct and in the
  vectorized paths as well
- kill the vector assignment unroller. implement in
  operator= the logic for assigning a row-vector in
  a col-vector.
- CMakeLists support for building tests/examples
  with -fopenmp and/or -msse2
- updates in bench/, especially replace identity()
  by ones() which prevents underflows from perturbing
  bench results.
2008-04-11 14:28:42 +00:00
Benoit Jacob
7bee90a62a Merge Gael's experimental OpenMP parallelization support into Assign.h. 2008-04-11 08:18:47 +00:00
Benoit Jacob
9d8876ce82 * rename XprCopy -> Nested
* rename OperatorEquals -> Assign
* move Util.h and FwDecl.h to a util/ subdir
2008-04-10 09:01:28 +00:00
Benoit Jacob
371d302efb - merge ei_xpr_copy and ei_eval_if_needed_before_nesting
- make use of CoeffReadCost to determine when to unroll the loops,
  for now only in Product.h and in OperatorEquals.h
performance remains the same: generally still not as good as before the
big changes.
2008-04-06 18:01:03 +00:00
Benoit Jacob
cff5e3ce9c Make use of the LazyBit, introduce .lazy(), remove lazyProduct. 2008-03-31 16:20:06 +00:00
Benoit Jacob
fe569b060c get rid of MatrixRef, simplifications. 2008-03-13 20:36:01 +00:00
Benoit Jacob
afc64f3332 a lot of renaming
internal classes: AaBb -> ei_aa_bb
IntAtRunTimeIfDynamic -> ei_int_if_dynamic
unify UNROLLING_LIMIT (there was no reason to have operator= use
a higher limit)
etc...
2008-03-13 09:33:26 +00:00
Gael Guennebaud
35bce20954 Removed Column and Row in favor of Block 2008-03-12 18:10:52 +00:00
Benoit Jacob
2ee68a074e generalized ei_traits<>.
Finally the importing macro is named EIGEN_BASIC_PUBLIC_INTERFACE
because it does not only import the ei_traits, it also makes the base class
a friend, etc.
2008-03-12 17:17:36 +00:00
Gael Guennebaud
9d9d81ad71 * basic support for multicore CPU via a .evalOMP() which
internally uses OpenMP if enabled at compile time.
 * added a bench/ folder with a couple benchmarks and benchmark tools.
2008-03-09 16:13:47 +00:00