eigen/bench
Benoit Jacob 8f21a5e862 add benchmark for slice vectorization... expected it to be little or
zero benefit... turns out to be 20x speedup. Something is wrong.
2008-07-09 16:43:11 +00:00
btl imported a reworked version of BTL (Benchmark for Templated Libraries). 2008-07-09 14:04:48 +00:00
basicbench.cxxlist Removed Column and Row in favor of Block 2008-03-12 18:10:52 +00:00
basicbenchmark.cpp add Cholesky and eigensolver benchmark 2008-07-08 17:20:17 +00:00
basicbenchmark.h Make use of the LazyBit, introduce .lazy(), remove lazyProduct. 2008-03-31 16:20:06 +00:00
bench_multi_compilers.sh Removed Column and Row in favor of Block 2008-03-12 18:10:52 +00:00
bench_sum.cpp add benchmark for sum 2008-06-23 11:03:27 +00:00
bench_unrolling add Cholesky and eigensolver benchmark 2008-07-08 17:20:17 +00:00
benchBlasGemm.cpp * fix compilation issue in Product 2008-07-02 16:05:33 +00:00
benchCholesky.cpp add Cholesky and eigensolver benchmark 2008-07-08 17:20:17 +00:00
benchEigenSolver.cpp add Cholesky and eigensolver benchmark 2008-07-08 17:20:17 +00:00
benchmark_suite Make use of the LazyBit, introduce .lazy(), remove lazyProduct. 2008-03-31 16:20:06 +00:00
benchmark.cpp * fix compilation issue in Product 2008-07-02 16:05:33 +00:00
benchmarkSlice.cpp add benchmark for slice vectorization... expected it to be little or 2008-07-09 16:43:11 +00:00
benchmarkX.cpp - cleaner use of OpenMP (no code duplication anymore) 2008-04-11 14:28:42 +00:00
benchmarkXcwise.cpp - many updates after Cwise change 2008-07-08 07:56:01 +00:00
BenchSparseUtil.h add Cholesky and eigensolver benchmark 2008-07-08 17:20:17 +00:00
BenchTimer.h some documentation fixes (Cwise* and Cholesky) 2008-05-22 16:31:00 +00:00
BenchUtil.h add Cholesky and eigensolver benchmark 2008-07-08 17:20:17 +00:00
benchVecAdd.cpp * add bench/benchVecAdd.cpp by Gael, fix crash (ei_pload on non-aligned) 2008-06-26 16:06:41 +00:00
README.txt * basic support for multicore CPU via a .evalOMP() which 2008-03-09 16:13:47 +00:00
sparse_product.cpp * added innerSize / outerSize functions to MatrixBase 2008-06-28 23:07:14 +00:00
sparse_trisolver.cpp * added an in-place version of inverseProduct which 2008-06-29 21:29:12 +00:00
vdw_new.cpp the big Array/Cwise rework as discussed on the mailing list. The new API 2008-07-08 00:49:10 +00:00

This folder contains a couple of benchmark utilities and Eigen benchmarks.

****************************
* bench_multi_compilers.sh *
****************************

This script runs a benchmark with a set of different compilers and compiler options.
It takes two arguments:
 - a file defining the list of the compilers with their options
 - the .cpp file of the benchmark
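
The loop such a script performs can be sketched as follows. This is a minimal
sketch and not the actual bench_multi_compilers.sh: it assumes a list file with
one compiler command line per line (the real .cxxlist format may differ), the
function name run_with_compilers is hypothetical, and the -I.. include path is
an assumption about where the Eigen headers live.

```shell
#!/bin/bash
# Hypothetical sketch, NOT the actual bench_multi_compilers.sh.
# Assumption: the list file holds one compiler command line per line;
# blank lines and lines starting with '#' are skipped.
run_with_compilers() {
  local list_file=$1 bench_file=$2 line compiler
  # Read the list on fd 3 so the benchmark cannot consume it via stdin.
  while IFS= read -r line <&3 || [ -n "$line" ]; do
    case $line in ''|'#'*) continue ;; esac
    compiler=${line%% *}            # first word is the compiler binary
    if ! command -v "$compiler" >/dev/null 2>&1; then
      echo "compiler not found: $compiler"
      continue
    fi
    echo "$line"                    # header: compiler and options
    # Word splitting of $line is intentional: it carries the options.
    # -I.. assumes the Eigen headers are one directory up.
    $line "$bench_file" -I.. -o ./.bench && ./.bench
    echo ""
  done 3< "$list_file"
}
```

With a list file containing, say, the line "g++-4.2 -O3 -DNDEBUG", calling
run_with_compilers mylist.txt basicbenchmark.cpp would print that line, then
the benchmark's own output, then an empty line. Compilers that are not
installed are reported and skipped.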

Examples:

$ ./bench_multi_compilers.sh basicbench.cxxlist basicbenchmark.cpp

    g++-4.1 -O3 -DNDEBUG -finline-limit=10000
    3d-3x3   /   4d-4x4   /   Xd-4x4   /   Xd-20x20   /
    0.271102   0.131416   0.422322   0.198633
    0.201658   0.102436   0.397566   0.207282

    g++-4.2 -O3 -DNDEBUG -finline-limit=10000
    3d-3x3   /   4d-4x4   /   Xd-4x4   /   Xd-20x20   /
    0.107805   0.0890579   0.30265   0.161843
    0.127157   0.0712581   0.278341   0.191029

    g++-4.3 -O3 -DNDEBUG -finline-limit=10000
    3d-3x3   /   4d-4x4   /   Xd-4x4   /   Xd-20x20   /
    0.134318   0.105291   0.3704   0.180966
    0.137703   0.0732472   0.31225   0.202204

    icpc -fast -DNDEBUG -fno-exceptions -no-inline-max-size
    3d-3x3   /   4d-4x4   /   Xd-4x4   /   Xd-20x20   /
    0.226145   0.0941319   0.371873   0.159433
    0.109302   0.0837538   0.328102   0.173891


$ ./bench_multi_compilers.sh ompbench.cxxlist ompbenchmark.cpp

    g++-4.2 -O3 -DNDEBUG -finline-limit=10000 -fopenmp
    double, fixed-size 4x4: 0.00165105s  0.0778739s
    double, 32x32: 0.0654769s 0.075289s  => x0.869674 (2)
    double, 128x128: 0.054148s 0.0419669s  => x1.29025 (2)
    double, 512x512: 0.913799s 0.428533s  => x2.13239 (2)
    double, 1024x1024: 14.5972s 9.3542s  => x1.5605 (2)

    icpc -fast -DNDEBUG -fno-exceptions -no-inline-max-size -openmp
    double, fixed-size 4x4: 0.000589848s  0.019949s
    double, 32x32: 0.0682781s 0.0449722s  => x1.51823 (2)
    double, 128x128: 0.0547509s 0.0435519s  => x1.25714 (2)
    double, 512x512: 0.829436s 0.424438s  => x1.9542 (2)
    double, 1024x1024: 14.5243s 10.7735s  => x1.34815 (2)
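
The output above does not label its columns. One possible reading, stated here
as an assumption rather than documented behavior: the first time is the
single-threaded run, the second is the OpenMP run, and "(2)" is the number of
threads. What can be checked directly is that the factor after "=>" is the
first time divided by the second. The helper name speedup below is
hypothetical and not part of the benchmarks.

```shell
# speedup is a hypothetical helper: it divides the first time by the second.
speedup() {
  awk -v first="$1" -v second="$2" 'BEGIN { printf "%.5f\n", first / second }'
}

speedup 0.913799 0.428533    # g++-4.2, 512x512 row; prints 2.13239
speedup 0.0654769 0.075289   # g++-4.2, 32x32 row; below 1, second run slower
```

A factor below 1, as in the 32x32 rows for g++-4.2, means the second run took
longer than the first.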