Bench Template Library

****************************************
Introduction :

The aim of this project is to compare the performance
of available numerical libraries. The code is designed
to be as generic and modular as possible, so adding new
numerical libraries or new numerical tests should
require minimal effort.


*****************************************

Installation :

BTL uses cmake / ctest:

1 - create a build directory:

  $ mkdir build
  $ cd build

2 - configure:

  $ ccmake ..

3 - run the bench using ctest:

  $ ctest -V

You can run the benchmarks only on libraries matching a given regular expression:
  ctest -V -R <regexp>
For instance:
  ctest -V -R eigen2

You can also select a given set of actions by defining the environment variable BTL_CONFIG this way:
  BTL_CONFIG="-a action1{:action2}*" ctest -V
An example:
  BTL_CONFIG="-a axpy:vector_matrix:trisolve:ata" ctest -V -R eigen2

Finally, if bench results already exist (the bench*.dat files), the new results are merged with them by keeping the best result for each matrix size. To overwrite the previous results instead, add the "--overwrite" option:
  BTL_CONFIG="-a axpy:vector_matrix:trisolve:ata --overwrite" ctest -V -R eigen2

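The merge rule for existing bench*.dat files can be pictured with the sketch below. It is only an illustration, not the BTL implementation: the BenchData type and the merge_results function are hypothetical names, "best" is assumed to mean the highest mflops value, and "--overwrite" is assumed to discard the previous data entirely.

```cpp
#include <algorithm>
#include <map>

// Hypothetical in-memory view of one bench*.dat file: problem size -> mflops.
typedef std::map<int, double> BenchData;

// Merge a new run into the previous results.
// Assumption: "best" means the highest mflops measured for a given size.
// Assumption: with overwrite set, the previous results are simply dropped.
BenchData merge_results(const BenchData& previous, const BenchData& current,
                        bool overwrite = false) {
  if (overwrite) return current;
  BenchData merged = previous;
  for (BenchData::const_iterator it = current.begin(); it != current.end(); ++it) {
    BenchData::iterator found = merged.find(it->first);
    if (found == merged.end()) {
      merged.insert(*it);  // size not measured before: keep the new point
    } else {
      found->second = std::max(found->second, it->second);  // keep the best
    }
  }
  return merged;
}
```

With this rule, running the bench several times can only improve the stored curve, which is why a fresh measurement needs "--overwrite".
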
4 - analyze the results. Data files (.dat) are produced in each library's directory under libs/.
 If gnuplot is available, choose a directory name in the data directory to store the results and type:
        $ cd data
        $ mkdir my_directory
        $ cp ../libs/*/*.dat my_directory
 Build the data utilities in this (data) directory:
        $ make
 Then you can look at the raw data:
        $ go_mean my_directory
 or smooth the data first:
        $ smooth_all.sh my_directory
        $ go_mean my_directory_smooth


*************************************************

Files and directories :

 generic_bench : all the bench sources common to all libraries

 actions : sources for the different action wrappers (axpy, matrix-matrix product) to be tested

 libs/* : bench sources specific to each tested library

 machine_dep : directory used to store the machine-specific Makefile.in

 data : directory used to store gnuplot scripts and data analysis utilities

**************************************************

Principles : the code modularity is achieved by defining two concepts :

 ****** Action concept : This is a class defining which kind
  of test must be performed (e.g. a matrix_vector_product).
  An Action should define the following methods :

        *** Ctor using the size of the problem (matrix or vector size) as an argument
            Action action(size);
        *** initialize : this method initializes the calculation (e.g. initializes the matrix and vector arguments)
            action.initialize();
        *** calculate : this method actually launches the calculation to be benchmarked
            action.calculate();
        *** nb_op_base() : this method returns the complexity of the calculate method (allowing the mflops evaluation)
        *** name() : this method returns the name of the action (std::string)

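 As an illustration of the Action concept, here is a minimal sketch of an axpy-like action. It is not one of the sources in the actions directory: Toy_Interface, Toy_Axpy_Action, the result() accessor and the operation count of 2 per element are all assumptions made for this example.

```cpp
#include <string>
#include <vector>

// Toy interface, only here to make the sketch self-contained (not a BTL interface).
struct Toy_Interface {
  typedef double real_type;
  typedef std::vector<real_type> gene_vector;
  static void axpy(real_type coef, const gene_vector& X, gene_vector& Y, int N) {
    for (int i = 0; i < N; ++i) Y[i] += coef * X[i];
  }
};

// Hypothetical action computing Y += coef * X through the interface.
template <class Interface>
class Toy_Axpy_Action {
 public:
  // Ctor using the size of the problem as an argument.
  explicit Toy_Axpy_Action(int size) : size_(size), coef_(2.0) { initialize(); }

  // (Re)set the operands so that every measurement starts from the same state.
  void initialize() {
    x_.assign(size_, 1.0);
    y_.assign(size_, 0.0);
  }

  // The calculation to be benchmarked.
  void calculate() { Interface::axpy(coef_, x_, y_, size_); }

  // Assumed complexity: one multiplication and one addition per element.
  double nb_op_base() const { return 2.0 * size_; }

  static std::string name() { return "toy_axpy"; }

  // Accessor added for this example only, to inspect the result.
  const typename Interface::gene_vector& result() const { return y_; }

 private:
  int size_;
  typename Interface::real_type coef_;
  typename Interface::gene_vector x_;
  typename Interface::gene_vector y_;
};
```

 The action never touches a library directly; everything library specific goes through the Interface template parameter.
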
 ****** Interface concept : This is a class or namespace defining how to use a given library and
  its specific containers (matrix and vector). At present, an interface should define the following types :

        *** real_type : kind of float to be used (float or double)
        *** stl_vector : must correspond to std::vector<real_type>
        *** stl_matrix : must correspond to std::vector<stl_vector>
        *** gene_vector : the vector type for this interface        --> e.g. (real_type *) for the C_interface
        *** gene_matrix : the matrix type for this interface        --> e.g. (gene_vector *) for the C_interface

        + the following common methods :

        *** free_matrix(gene_matrix & A, int N) : deallocation of an N sized gene_matrix A
        *** free_vector(gene_vector & B) : deallocation of an N sized gene_vector B
        *** matrix_from_stl(gene_matrix & A, stl_matrix & A_stl) : copy the content of an stl_matrix A_stl into a gene_matrix A.
             The allocation of A is done in this function.
        *** vector_from_stl(gene_vector & B, stl_vector & B_stl) : copy the content of an stl_vector B_stl into a gene_vector B.
             The allocation of B is done in this function.
        *** matrix_to_stl(gene_matrix & A, stl_matrix & A_stl) : copy the content of a gene_matrix A into an stl_matrix A_stl.
             The size of A_stl must correspond to the size of A.
        *** vector_to_stl(gene_vector & B, stl_vector & B_stl) : copy the content of a gene_vector B into an stl_vector B_stl.
             The size of B_stl must correspond to the size of B.
        *** copy_matrix(gene_matrix & source, gene_matrix & cible, int N) : copy the content of source into cible (the target).
             Both source and cible must be sized NxN.
        *** copy_vector(gene_vector & source, gene_vector & cible, int N) : copy the content of source into cible (the target).
             Both source and cible must be sized N.

        and the following methods, each corresponding to an action to be benchmarked :

        ***  matrix_vector_product(const gene_matrix & A, const gene_vector & B, gene_vector & X, int N)
        ***  matrix_matrix_product(const gene_matrix & A, const gene_matrix & B, gene_matrix & X, int N)
        ***  ata_product(const gene_matrix & A, gene_matrix & X, int N)
        ***  aat_product(const gene_matrix & A, gene_matrix & X, int N)
        ***  axpy(real coef, const gene_vector & X, gene_vector & Y, int N)

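 To make the Interface concept more concrete, here is a minimal sketch of an interface that simply reuses the STL containers as its own vector and matrix types. It is not one of the interfaces shipped in libs/: the name toy_stl_interface and the A[i][j] (row, column) indexing are assumptions of this example, and only a subset of the methods is shown.

```cpp
#include <vector>

// Hypothetical interface sketch built on std::vector (not part of BTL).
template <class real>
struct toy_stl_interface {
  typedef real real_type;
  typedef std::vector<real> stl_vector;
  typedef std::vector<stl_vector> stl_matrix;
  typedef stl_vector gene_vector;  // this toy interface reuses the STL types
  typedef stl_matrix gene_matrix;

  // Deallocation: the containers own their memory, so clearing is enough here.
  static void free_matrix(gene_matrix& A, int /*N*/) { A.clear(); }
  static void free_vector(gene_vector& B) { B.clear(); }

  // Conversions from and to the STL reference containers.
  static void matrix_from_stl(gene_matrix& A, stl_matrix& A_stl) { A = A_stl; }
  static void vector_from_stl(gene_vector& B, stl_vector& B_stl) { B = B_stl; }
  static void matrix_to_stl(gene_matrix& A, stl_matrix& A_stl) { A_stl = A; }
  static void vector_to_stl(gene_vector& B, stl_vector& B_stl) { B_stl = B; }

  // X = A * B, assuming A[i][j] is row i, column j.
  static void matrix_vector_product(const gene_matrix& A, const gene_vector& B,
                                    gene_vector& X, int N) {
    for (int i = 0; i < N; ++i) {
      real sum = 0;
      for (int j = 0; j < N; ++j) sum += A[i][j] * B[j];
      X[i] = sum;
    }
  }

  // Y += coef * X
  static void axpy(real coef, const gene_vector& X, gene_vector& Y, int N) {
    for (int i = 0; i < N; ++i) Y[i] += coef * X[i];
  }
};
```

 A real interface would replace gene_vector and gene_matrix with the containers of the library under test and forward each method to that library's routines.
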
 The bench algorithm (generic_bench/bench.hh) is templated with an action, itself templated with
 an interface. A typical main.cpp source stored in a given library directory libs/A_LIB
 looks like :

 bench< AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ;

 This call produces an XY data file containing the measured mflops as a function of the size, for 50
 sizes between 10 and 1000.

 This algorithm can be adapted by providing a given Perf_Analyzer object, which determines how the time
 measurements must be done. For example, the X86_Perf_Analyzer uses the asm rdtsc instruction and provides
 a very fast and accurate (but less portable) timing method. The default is the Portable_Perf_Analyzer,
 so

 bench< AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ;

 is equivalent to

 bench< Portable_Perf_Analyzer,AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ;

 If your system supports it, we suggest using a mixed implementation (X86_Perf_Analyzer+Portable_Perf_Analyzer):
 replace
     bench<Portable_Perf_Analyzer,Action>(size_min,size_max,nb_point);
 with
     bench<Mixed_Perf_Analyzer,Action>(size_min,size_max,nb_point);
 in generic_bench/bench.hh
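
 The role of a Perf_Analyzer can be sketched as follows. This is not the code of Portable_Perf_Analyzer or X86_Perf_Analyzer: Chrono_Perf_Analyzer, eval_seconds, to_mflops and Toy_Sum_Action are hypothetical names, and the sketch assumes that timing is done with std::chrono and that mflops is the operation count divided by the time in microseconds.

```cpp
#include <chrono>
#include <vector>

// Toy action, only here to make the sketch self-contained.
struct Toy_Sum_Action {
  explicit Toy_Sum_Action(int size) : data(size, 1.0), sum(0.0) {}
  void initialize() { sum = 0.0; }
  void calculate() {
    for (std::size_t i = 0; i < data.size(); ++i) sum += data[i];
  }
  double nb_op_base() const { return static_cast<double>(data.size()); }
  std::vector<double> data;
  double sum;
};

// Hypothetical portable analyzer based on std::chrono::steady_clock.
template <class Action>
class Chrono_Perf_Analyzer {
 public:
  // Average time in seconds of one calculate() call over nb_calc repetitions.
  double eval_seconds(int size, int nb_calc) {
    Action action(size);
    action.initialize();
    std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int i = 0; i < nb_calc; ++i) action.calculate();
    std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    std::chrono::duration<double> elapsed = stop - start;
    return elapsed.count() / nb_calc;
  }
};

// Convert an operation count and a time in seconds into mflops.
inline double to_mflops(double nb_op, double seconds) {
  return nb_op / seconds / 1.0e6;
}
```

 An analyzer based on rdtsc would keep the same shape and only change how the start and stop times are read, which is what makes the analyzers interchangeable as a template parameter of bench.
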