eigen

mirror of https://gitlab.com/libeigen/eigen.git synced 2024-12-15 07:10:37 +08:00

Bench Template Library

****************************************
Introduction :

The aim of this project is to compare the performance
of available numerical libraries. The code is designed
to be as generic and modular as possible, so that adding
new numerical libraries or new numerical tests
requires minimal effort.


*****************************************

Installation :

BTL uses cmake / ctest:

1 - create a build directory:

  $ mkdir build
  $ cd build

2 - configure:

  $ ccmake ..

3 - run the bench using ctest:

  $ ctest -V

You can also run a single bench, e.g.: ctest -V -R eigen

4 - analyze the results:

  Data files (.dat) are produced in each libs directory.
  If gnuplot is available, choose a directory name in the data directory to store the results and type

  $ cd data
  $ mkdir my_directory
  $ cp ../libs/*/*.dat my_directory

  Build the data utilities in this (data) directory:

  $ make

  Then you can look at the raw data:

  $ go_mean my_directory

  or smooth the data first:

  $ smooth_all.sh my_directory
  $ go_mean my_directory_smooth


*************************************************

Files and directories :

 generic_bench : all the bench sources common to all libraries

 actions : sources for different action wrappers (axpy, matrix-matrix product) to be tested.

 libs/* : bench sources specific to each tested library.

 machine_dep : directory used to store machine specific Makefile.in

 data : directory used to store gnuplot scripts and data analysis utilities

**************************************************

Principles : the code modularity is achieved by defining two concepts :

 ****** Action concept : This is a class defining which kind
  of test must be performed (e.g. a matrix_vector_product).
  An Action should define the following methods :

        *** Ctor using the size of the problem (matrix or vector size) as an argument
            Action action(size);
        *** initialize : this method initializes the calculation (e.g. initializes the matrix and vector arguments)
            action.initialize();
        *** calculate : this method actually launches the calculation to be benchmarked
            action.calculate();
        *** nb_op_base() : this method returns the complexity of the calculate method (allowing the mflops evaluation)
        *** name() : this method returns the name of the action (std::string)
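
 As a rough illustration of the Action concept, the sketch below implements the five
 methods for an axpy-like operation (Y += coef * X). The class name, the std::vector
 storage and the result() accessor are assumptions made for this example only. BTL's
 own actions are templated with an interface, which this sketch leaves out for brevity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical action, not taken from the BTL sources.
class Axpy_Action_Sketch {
public:
  // Ctor using the size of the problem (here, the vector length).
  explicit Axpy_Action_Sketch(int size)
      : size_(size), coef_(2.0), x_(size), y_(size) {
    assert(size > 0);
  }

  // Fill the operands so that every run starts from the same data.
  void initialize() {
    for (int i = 0; i < size_; ++i) {
      x_[i] = 1.0;
      y_[i] = static_cast<double>(i);
    }
  }

  // The calculation to be benchmarked.
  void calculate() {
    for (int i = 0; i < size_; ++i) y_[i] += coef_ * x_[i];
  }

  // One multiplication and one addition per element.
  double nb_op_base() const { return 2.0 * size_; }

  std::string name() const { return "axpy_sketch"; }

  // Extra accessor, only here so that the result can be inspected.
  const std::vector<double>& result() const { return y_; }

private:
  int size_;
  double coef_;
  std::vector<double> x_, y_;
};
```

 A caller would construct the action with a size, call initialize() once, then time
 calculate() and divide nb_op_base() by the measured time.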

 ****** Interface concept : This is a class or namespace defining how to use a given library and
  its specific containers (matrix and vector). Up to now, an interface should define the following types

	*** real_type : kind of float to be used (float or double)
	*** stl_vector : must correspond to std::vector<real_type>
	*** stl_matrix : must correspond to std::vector<stl_vector>
	*** gene_vector : the vector type for this interface        --> e.g. (real_type *) for the C_interface
	*** gene_matrix : the matrix type for this interface        --> e.g. (gene_vector *) for the C_interface

	+ the following common methods

        *** free_matrix(gene_matrix & A, int N)  deallocation of an N sized gene_matrix A
        *** free_vector(gene_vector & B)  deallocation of an N sized gene_vector B
        *** matrix_from_stl(gene_matrix & A, stl_matrix & A_stl) copy the content of an stl_matrix A_stl into a gene_matrix A.
             The allocation of A is done in this function.
        *** vector_from_stl(gene_vector & B, stl_vector & B_stl)  copy the content of an stl_vector B_stl into a gene_vector B.
             The allocation of B is done in this function.
        *** matrix_to_stl(gene_matrix & A, stl_matrix & A_stl) copy the content of a gene_matrix A into an stl_matrix A_stl.
             The size of A_stl must correspond to the size of A.
        *** vector_to_stl(gene_vector & A, stl_vector & A_stl) copy the content of a gene_vector A into an stl_vector A_stl.
             The size of A_stl must correspond to the size of A.
        *** copy_matrix(gene_matrix & source, gene_matrix & cible, int N) : copy the content of source into cible (the destination).
             Both source and cible must be sized NxN.
        *** copy_vector(gene_vector & source, gene_vector & cible, int N) : copy the content of source into cible (the destination).
             Both source and cible must be sized N.

        and the following methods, one for each action to be benchmarked :

        ***  matrix_vector_product(const gene_matrix & A, const gene_vector & B, gene_vector & X, int N)
        ***  matrix_matrix_product(const gene_matrix & A, const gene_matrix & B, gene_matrix & X, int N)
        ***  ata_product(const gene_matrix & A, gene_matrix & X, int N)
        ***  aat_product(const gene_matrix & A, gene_matrix & X, int N)
        ***  axpy(real coef, const gene_vector & X, gene_vector & Y, int N)
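
 To make the Interface concept more concrete, here is a minimal sketch of an interface
 whose containers are plain std::vector objects. The struct name, the use of static
 member functions and the row-major layout (A[i] is row i) are assumptions of this
 example, and only two of the benchmarked methods are shown. The function copying from
 an stl_vector is named vector_from_stl here, by analogy with matrix_from_stl.

```cpp
#include <cassert>
#include <vector>

// Hypothetical interface, not one of the interfaces shipped in libs/.
template <class real>
struct stl_interface_sketch {
  typedef real real_type;
  typedef std::vector<real> stl_vector;
  typedef std::vector<stl_vector> stl_matrix;
  typedef stl_vector gene_vector;  // the library vector is the STL one
  typedef stl_matrix gene_matrix;  // A[i] is row i

  // std::vector owns its storage, so "deallocation" is just clear().
  static void free_matrix(gene_matrix& A, int /*N*/) { A.clear(); }
  static void free_vector(gene_vector& B) { B.clear(); }

  // Assignment allocates the destination as needed.
  static void matrix_from_stl(gene_matrix& A, stl_matrix& A_stl) { A = A_stl; }
  static void vector_from_stl(gene_vector& B, stl_vector& B_stl) { B = B_stl; }
  static void matrix_to_stl(gene_matrix& A, stl_matrix& A_stl) { A_stl = A; }
  static void vector_to_stl(gene_vector& A, stl_vector& A_stl) { A_stl = A; }

  static void copy_matrix(gene_matrix& source, gene_matrix& cible, int N) {
    for (int i = 0; i < N; ++i)
      for (int j = 0; j < N; ++j) cible[i][j] = source[i][j];
  }
  static void copy_vector(gene_vector& source, gene_vector& cible, int N) {
    for (int i = 0; i < N; ++i) cible[i] = source[i];
  }

  // X = A * B for an NxN matrix A.
  static void matrix_vector_product(const gene_matrix& A, const gene_vector& B,
                                    gene_vector& X, int N) {
    assert(static_cast<int>(X.size()) >= N);
    for (int i = 0; i < N; ++i) {
      real sum = 0;
      for (int j = 0; j < N; ++j) sum += A[i][j] * B[j];
      X[i] = sum;
    }
  }

  // Y += coef * X.
  static void axpy(real coef, const gene_vector& X, gene_vector& Y, int N) {
    assert(static_cast<int>(Y.size()) >= N);
    for (int i = 0; i < N; ++i) Y[i] += coef * X[i];
  }
};
```

 An action templated with such an interface only calls these functions and never
 touches the containers directly, which is what lets the same action run against
 every library.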

 The bench algorithm (generic_bench/bench.hh) is templated with an action itself templated with
 an interface. A typical main.cpp source stored in a given library directory libs/A_LIB
 looks like :

 bench< AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ;

 This function produces an XY data file containing the measured mflops as a function of the size, for 50
 sizes between 10 and 1000.
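
 The sketch below illustrates what such a sweep could look like: it builds the list
 of sizes, times an action for each size and converts the result to mflops. All
 function names are hypothetical, the timing relies on std::chrono (C++11), and the
 even spacing of the sizes is an assumption, since this README does not say how the
 sizes are distributed between the two bounds.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// nb_point sizes from size_min to size_max inclusive.
// Even spacing is an assumption made for this sketch.
inline std::vector<int> sweep_sizes(int size_min, int size_max, int nb_point) {
  assert(nb_point >= 2 && size_min <= size_max);
  std::vector<int> sizes;
  for (int i = 0; i < nb_point; ++i)
    sizes.push_back(size_min + i * (size_max - size_min) / (nb_point - 1));
  return sizes;
}

// Millions of floating point operations per second.
inline double mflops(double nb_op, double seconds) {
  return nb_op / (seconds * 1e6);
}

// Time nb_calc calls of action.calculate() and return seconds per call.
template <class Action>
double seconds_per_call(Action& action, int nb_calc) {
  typedef std::chrono::steady_clock clock;
  action.initialize();
  clock::time_point start = clock::now();
  for (int i = 0; i < nb_calc; ++i) action.calculate();
  std::chrono::duration<double> elapsed = clock::now() - start;
  return elapsed.count() / nb_calc;
}

// One (size, mflops) pair per size: the XY data described above.
template <class Action>
std::vector<std::pair<int, double> > sweep(int size_min, int size_max,
                                           int nb_point, int nb_calc) {
  std::vector<std::pair<int, double> > xy;
  const std::vector<int> sizes = sweep_sizes(size_min, size_max, nb_point);
  for (std::size_t k = 0; k < sizes.size(); ++k) {
    Action action(sizes[k]);
    const double t = seconds_per_call(action, nb_calc);
    xy.push_back(std::make_pair(sizes[k], mflops(action.nb_op_base(), t)));
  }
  return xy;
}
```

 Any class following the Action concept can be passed as the template argument of
 sweep. Very short runs may be below the clock resolution, so nb_calc should be
 large enough for the measured time to be meaningful.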

 This algorithm can be adapted by providing a Perf_Analyzer object which determines how the time
 measurements are done. For example, the X86_Perf_Analyzer uses the rdtsc assembly instruction and provides
 a very fast and accurate (but less portable) timing method. The default is the Portable_Perf_Analyzer,
 so

 bench< AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ;

 is equivalent to

 bench< Portable_Perf_Analyzer,AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ;

 If your system supports it, we suggest using a mixed implementation (X86_Perf_Analyzer+Portable_Perf_Analyzer):
 replace
     bench<Portable_Perf_Analyzer,Action>(size_min,size_max,nb_point);
 with
     bench<Mixed_Perf_Analyzer,Action>(size_min,size_max,nb_point);
 in generic_bench/bench.hh
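
 The way an analyzer is selected through a template parameter, with a default used
 when none is given, can be sketched as follows. Everything here is hypothetical:
 the analyzers, their eval_seconds() method, the tiny interface and action, and the
 bench_point function are stand-ins written for this example and do not reproduce
 the Perf_Analyzer classes of generic_bench.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Minimal stand-ins so that the sketch compiles on its own.
struct tiny_interface {
  typedef double real_type;
  typedef std::vector<double> gene_vector;
  static void axpy(real_type coef, const gene_vector& X, gene_vector& Y, int N) {
    for (int i = 0; i < N; ++i) Y[i] += coef * X[i];
  }
};

template <class Interface>
class tiny_axpy_action {
public:
  explicit tiny_axpy_action(int size) : n_(size), x_(size), y_(size) {}
  void initialize() {
    x_.assign(n_, 1.0);
    y_.assign(n_, 0.0);
  }
  void calculate() { Interface::axpy(2.0, x_, y_, n_); }
  double nb_op_base() const { return 2.0 * n_; }

private:
  int n_;
  typename Interface::gene_vector x_, y_;
};

// Analyzer based on std::chrono: portable wall-clock timing.
template <class Action>
struct Chrono_Analyzer_Sketch {
  // Seconds per call of action.calculate(), averaged over nb_calc calls.
  double eval_seconds(Action& action, int nb_calc) const {
    typedef std::chrono::steady_clock clock;
    clock::time_point start = clock::now();
    for (int i = 0; i < nb_calc; ++i) action.calculate();
    std::chrono::duration<double> elapsed = clock::now() - start;
    return elapsed.count() / nb_calc;
  }
};

// Analyzer that reports one second per call whatever happens; handy for tests.
template <class Action>
struct Fixed_Analyzer_Sketch {
  double eval_seconds(Action& action, int nb_calc) const {
    for (int i = 0; i < nb_calc; ++i) action.calculate();
    return 1.0;
  }
};

// Measure one size with the analyzer chosen by the caller; returns mflops.
template <template <class> class Perf_Analyzer, class Action>
double bench_point(int size, int nb_calc) {
  assert(nb_calc > 0);
  Action action(size);
  action.initialize();
  Perf_Analyzer<Action> analyzer;
  const double seconds = analyzer.eval_seconds(action, nb_calc);
  return action.nb_op_base() / (seconds * 1e6);
}

// Same call without an analyzer: forwards to the default one.
template <class Action>
double bench_point(int size, int nb_calc) {
  return bench_point<Chrono_Analyzer_Sketch, Action>(size, nb_calc);
}
```

 With these definitions, bench_point< tiny_axpy_action<tiny_interface> >(100, 10)
 and bench_point< Chrono_Analyzer_Sketch, tiny_axpy_action<tiny_interface> >(100, 10)
 take the same code path, which mirrors the equivalence between the two bench calls
 described above.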
