Cascade Classifier Training {#tutorial_traincascade}
===========================

Introduction
------------

Working with a cascade classifier includes two major stages: training and detection. The detection
stage is described in the documentation of the objdetect module of the general OpenCV documentation,
which gives some basic information about the cascade classifier. The current guide describes how to
train a cascade classifier: preparing the training data and running the training application.

### Important notes

There are two applications in OpenCV to train a cascade classifier: opencv_haartraining and
opencv_traincascade. opencv_traincascade is a newer version, written in C++ in accordance with the
OpenCV 2.x API. The main difference between these two applications is that opencv_traincascade
supports both Haar @cite Viola01 and @cite Liao2007 (Local Binary Patterns) features. LBP features
are integer-valued in contrast to Haar features, so both training and detection with LBP are several
times faster than with Haar features. The detection quality of LBP and Haar classifiers depends on
the training: first of all on the quality of the training dataset, and also on the training
parameters. It is possible to train an LBP-based classifier that will provide almost the same
quality as a Haar-based one.

opencv_traincascade and opencv_haartraining store the trained classifier in different file
formats. Note that the newer cascade detection interface (see the CascadeClassifier class in the
objdetect module) supports both formats. opencv_traincascade can save (export) a trained cascade in
the older format, but opencv_traincascade and opencv_haartraining cannot load (import) a classifier
in the other format to resume training after an interruption.

Note that the opencv_traincascade application can use TBB for multi-threading. To use it in
multi-core mode, OpenCV must be built with TBB support.
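
A minimal sketch of enabling TBB when configuring the OpenCV build; the directory layout and the
exact CMake invocation below are illustrative assumptions rather than something prescribed by this
guide:
@code{.text}
cd opencv
mkdir build && cd build
cmake -D WITH_TBB=ON ..
make -j4
@endcode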

Also, there are some auxiliary utilities related to the training:

-   opencv_createsamples is used to prepare a training dataset of positive and test samples.
    opencv_createsamples produces a dataset of positive samples in a format that is supported by
    both the opencv_haartraining and opencv_traincascade applications. The output is a file
    with the \*.vec extension; it is a binary format which contains images.
-   opencv_performance may be used to evaluate the quality of classifiers, but only for those
    trained by opencv_haartraining. It takes a collection of marked up images, runs the classifier
    and reports the performance, i.e. the number of found objects, the number of missed objects,
    the number of false alarms and other information.

Since opencv_haartraining is an obsolete application, only opencv_traincascade will be described
further. The opencv_createsamples utility is needed to prepare the training data for
opencv_traincascade, so it will be described too.

Training data preparation
-------------------------

For training we need a set of samples. There are two types of samples: negative and positive.
Negative samples correspond to non-object images. Positive samples correspond to images with the
objects to be detected. The set of negative samples must be prepared manually, whereas the set of
positive samples is created using the opencv_createsamples utility.

### Negative Samples

Negative samples are taken from arbitrary images. These images must not contain the objects to be
detected. Negative samples are enumerated in a special file. It is a text file in which each line
contains an image filename (relative to the directory of the description file) of a negative sample
image. This file must be created manually. Note that negative samples and sample images are also
called background samples or background images, and these terms are used interchangeably in this
document. The described images may be of different sizes, but each image should be (though not
necessarily) larger than the training window size, because these images are used to subsample a
negative image down to the training size.

An example of a description file:

Directory structure:
@code{.text}
/img
  img1.jpg
  img2.jpg
bg.txt
@endcode
File bg.txt:
@code{.text}
img/img1.jpg
img/img2.jpg
@endcode
### Positive Samples

Positive samples are created by the opencv_createsamples utility. They may be created from a single
image with the object or from a collection of previously marked up images.

Please note that you need a large dataset of positive samples before you give it to the mentioned
utility, because it only applies perspective transformation. For example, you may need only one
positive sample for an absolutely rigid object like an OpenCV logo, but you definitely need hundreds
or even thousands of positive samples for faces. In the case of faces you should consider all the
race and age groups, emotions and perhaps beard styles.

So, a single object image may contain, for example, a company logo. Then a large set of positive
samples is created from the given object image by randomly rotating it, changing the logo intensity
as well as placing the logo on arbitrary backgrounds. The amount and range of randomness can be
controlled by the command line arguments of the opencv_createsamples utility.

Command line arguments:

-   -vec \<vec_file_name\>

    Name of the output file containing the positive samples for training.

-   -img \<image_file_name\>

    Source object image (e.g., a company logo).

-   -bg \<background_file_name\>

    Background description file; contains a list of images which are used as a background for
    randomly distorted versions of the object.

-   -num \<number_of_samples\>

    Number of positive samples to generate.

-   -bgcolor \<background_color\>

    Background color (currently grayscale images are assumed); the background color denotes the
    transparent color. Since there might be compression artifacts, the amount of color tolerance
    can be specified by -bgthresh. All pixels within the bgcolor-bgthresh and bgcolor+bgthresh range
    are interpreted as transparent.

-   -bgthresh \<background_color_threshold\>
-   -inv

    If specified, colors will be inverted.

-   -randinv

    If specified, colors will be inverted randomly.

-   -maxidev \<max_intensity_deviation\>

    Maximal intensity deviation of pixels in foreground samples.

-   -maxxangle \<max_x_rotation_angle\>
-   -maxyangle \<max_y_rotation_angle\>
-   -maxzangle \<max_z_rotation_angle\>

    Maximum rotation angles must be given in radians.

-   -show

    Useful debugging option. If specified, each sample will be shown. Pressing Esc will continue
    the sample creation process without showing each sample.

-   -w \<sample_width\>

    Width (in pixels) of the output samples.

-   -h \<sample_height\>

    Height (in pixels) of the output samples.

The following procedure is used to create a sample object instance: the source image is rotated
randomly around all three axes. The chosen angle is limited by -maxxangle, -maxyangle and
-maxzangle. Then pixels having an intensity in the
[bg_color-bg_color_threshold; bg_color+bg_color_threshold] range are interpreted as transparent.
White noise is added to the intensities of the foreground. If the -inv key is specified, the
foreground pixel intensities are inverted. If the -randinv key is specified, the algorithm randomly
selects whether inversion should be applied to this sample. Finally, the obtained image is placed
onto an arbitrary background from the background description file, resized to the desired size
specified by -w and -h, and stored to the vec-file specified by the -vec command line option.
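
As an illustration, a hedged example of generating distorted positive samples from a single object
image might look like this; the file names (logo.png, samples.vec) and the numeric values are
assumptions chosen for this sketch, not values prescribed by this guide:
@code{.text}
opencv_createsamples -img logo.png -bg bg.txt -vec samples.vec -num 1000 \
    -bgcolor 0 -bgthresh 8 -maxxangle 0.5 -maxyangle 0.5 -maxzangle 0.5 \
    -maxidev 40 -w 24 -h 24
@endcode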

Positive samples also may be obtained from a collection of previously marked up images. This
collection is described by a text file similar to the background description file. Each line of this
file corresponds to an image. The first element of the line is the filename. It is followed by the
number of object instances. The following numbers are the coordinates of the objects' bounding
rectangles (x, y, width, height).

An example of a description file:

Directory structure:
@code{.text}
/img
  img1.jpg
  img2.jpg
info.dat
@endcode
File info.dat:
@code{.text}
img/img1.jpg  1  140 100 45 45
img/img2.jpg  2  100 200 50 50   50 30 25 25
@endcode
Image img1.jpg contains a single object instance with the following coordinates of its bounding
rectangle: (140, 100, 45, 45). Image img2.jpg contains two object instances.

In order to create positive samples from such a collection, the -info argument should be specified
instead of \`-img\`:

-   -info \<collection_file_name\>

    Description file of the marked up images collection.

The scheme of sample creation in this case is as follows: the object instances are taken from the
images. Then they are resized to the target sample size and stored in the output vec-file. No
distortion is applied, so the only affecting arguments are -w, -h, -show and -num.
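
A hedged example invocation for this mode; the output file name positives.vec is an assumption used
only for illustration:
@code{.text}
opencv_createsamples -info info.dat -vec positives.vec -num 1000 -w 24 -h 24
@endcode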

The opencv_createsamples utility may also be used for examining samples stored in a positive samples
file. In order to do this, only the -vec, -w and -h parameters should be specified.

Note that for training, it does not matter how vec-files with positive samples are generated. But
the opencv_createsamples utility is the only way to collect/create a vector file of positive
samples provided by OpenCV.

An example vec-file is available here: opencv/data/vec_files/trainingfaces_24-24.vec. It can be
used to train a face detector with the following window size: -w 24 -h 24.
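
For instance, a hedged sketch of the viewing mode described above, applied to that example file; the
relative path assumes the command is run from the root of the OpenCV source tree:
@code{.text}
opencv_createsamples -vec data/vec_files/trainingfaces_24-24.vec -w 24 -h 24
@endcode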

Cascade Training
----------------

The next step is the training of the classifier. As mentioned above, opencv_traincascade or
opencv_haartraining may be used to train a cascade classifier, but only the newer
opencv_traincascade will be described further.

Command line arguments of the opencv_traincascade application, grouped by purpose:

-#  Common arguments:

    -   -data \<cascade_dir_name\>

        Where the trained classifier should be stored.

    -   -vec \<vec_file_name\>

        vec-file with positive samples (created by the opencv_createsamples utility).

    -   -bg \<background_file_name\>

        Background description file.

    -   -numPos \<number_of_positive_samples\>
    -   -numNeg \<number_of_negative_samples\>

        Number of positive/negative samples used in training for every classifier stage.

    -   -numStages \<number_of_stages\>

        Number of cascade stages to be trained.

    -   -precalcValBufSize \<precalculated_vals_buffer_size_in_Mb\>

        Size of the buffer for precalculated feature values (in Mb).

    -   -precalcIdxBufSize \<precalculated_idxs_buffer_size_in_Mb\>

        Size of the buffer for precalculated feature indices (in Mb). The more memory you have, the
        faster the training process.

    -   -baseFormatSave

        This argument is relevant in the case of Haar-like features. If it is specified, the cascade
        will be saved in the old format.

    -   -numThreads \<max_number_of_threads\>

        Maximum number of threads to use during training. Notice that the actual number of used
        threads may be lower, depending on your machine and compilation options.

    -   -acceptanceRatioBreakValue \<break_value\>

        This argument is used to determine how precisely your model should keep learning and when to
        stop. A good guideline is to train not further than 10e-5, to ensure the model does not
        overtrain on your training data. By default this value is set to -1 to disable this feature.

-#  Cascade parameters:

    -   -stageType \<BOOST(default)\>

        Type of stages. Only boosted classifiers are supported as a stage type at the moment.

    -   -featureType \<{HAAR(default), LBP}\>

        Type of features: HAAR - Haar-like features, LBP - local binary patterns.

    -   -w \<sampleWidth\>
    -   -h \<sampleHeight\>

        Size of training samples (in pixels). Must have exactly the same values as used during
        training sample creation (opencv_createsamples utility).

-#  Boosted classifier parameters:

    -   -bt \<{DAB, RAB, LB, GAB(default)}\>

        Type of boosted classifiers: DAB - Discrete AdaBoost, RAB - Real AdaBoost, LB - LogitBoost,
        GAB - Gentle AdaBoost.

    -   -minHitRate \<min_hit_rate\>

        Minimal desired hit rate for each stage of the classifier. The overall hit rate may be
        estimated as (min_hit_rate\^number_of_stages); for example, a minimal hit rate of 0.995 over
        20 stages gives an overall hit rate of about 0.9.

    -   -maxFalseAlarmRate \<max_false_alarm_rate\>

        Maximal desired false alarm rate for each stage of the classifier. The overall false alarm
        rate may be estimated as (max_false_alarm_rate\^number_of_stages).

    -   -weightTrimRate \<weight_trim_rate\>

        Specifies whether trimming should be used and its weight. A decent choice is 0.95.

    -   -maxDepth \<max_depth_of_weak_tree\>

        Maximal depth of a weak tree. A decent choice is 1, that is the case of stumps.

    -   -maxWeakCount \<max_weak_tree_count\>

        Maximal count of weak trees for every cascade stage. The boosted classifier (stage) will
        have as many weak trees (\<= maxWeakCount) as needed to achieve the
        given -maxFalseAlarmRate.

-#  Haar-like feature parameters:

    -   -mode \<BASIC (default) | CORE | ALL\>

        Selects the type of Haar feature set used in training. BASIC uses only upright features,
        while ALL uses the full set of upright and 45-degree rotated features. See @cite Lienhart02
        for more details.

-#  Local Binary Patterns parameters:

    Local Binary Patterns don't have parameters.
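
Putting the main arguments together, a hedged example invocation might look like the following; the
directory and file names as well as the sample counts and stage count are illustrative assumptions,
not recommended values:
@code{.text}
opencv_traincascade -data cascade_dir -vec samples.vec -bg bg.txt \
    -numPos 900 -numNeg 500 -numStages 20 -w 24 -h 24 -featureType LBP
@endcode
Note that -w and -h must match the values used when the vec-file was created with
opencv_createsamples.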

After the opencv_traincascade application has finished its work, the trained cascade will be saved
as cascade.xml in the folder that was passed as the -data parameter. The other files in this folder
are created for the case of interrupted training, so you may delete them after completion of the
training.

Training is finished and you can test your cascade classifier!
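
As a quick way to try the result, here is a minimal, hedged C++ sketch that loads the produced
cascade.xml and runs detection on a test image; the paths cascade_dir/cascade.xml and test.jpg are
placeholders, and detection itself is covered in detail by the objdetect module documentation
mentioned in the introduction:
@code{.cpp}
#include <opencv2/objdetect.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>
#include <iostream>
#include <vector>

int main()
{
    // Load the trained cascade produced in the -data folder.
    cv::CascadeClassifier cascade;
    if (!cascade.load("cascade_dir/cascade.xml"))
    {
        std::cerr << "Failed to load cascade" << std::endl;
        return 1;
    }

    // Detection is run on a grayscale image.
    cv::Mat image = cv::imread("test.jpg", cv::IMREAD_GRAYSCALE);
    if (image.empty())
    {
        std::cerr << "Failed to load test image" << std::endl;
        return 1;
    }
    cv::equalizeHist(image, image);

    // Each returned rectangle is a detected object instance.
    std::vector<cv::Rect> objects;
    cascade.detectMultiScale(image, objects);

    std::cout << "Detected " << objects.size() << " object(s)" << std::endl;
    return 0;
}
@endcode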