/*M///////////////////////////////////////////////////////////////////////////////////////
//
//  IMPORTANT: READ BEFORE DOWNLOADING, COPYING, INSTALLING OR USING.
//
//  By downloading, copying, installing or using the software you agree to this license.
//  If you do not agree to this license, do not download, install,
//  copy or use the software.
//
//
//                          License Agreement
//                For Open Source Computer Vision Library
//
// Copyright (C) 2000-2008, Intel Corporation, all rights reserved.
// Copyright (C) 2009, Willow Garage Inc., all rights reserved.
// Copyright (C) 2013, OpenCV Foundation, all rights reserved.
// Third party copyrights are property of their respective owners.
//
// Redistribution and use in source and binary forms, with or without modification,
// are permitted provided that the following conditions are met:
//
//   * Redistribution's of source code must retain the above copyright notice,
//     this list of conditions and the following disclaimer.
//
//   * Redistribution's in binary form must reproduce the above copyright notice,
//     this list of conditions and the following disclaimer in the documentation
//     and/or other materials provided with the distribution.
//
//   * The name of the copyright holders may not be used to endorse or promote products
//     derived from this software without specific prior written permission.
//
// This software is provided by the copyright holders and contributors "as is" and
// any express or implied warranties, including, but not limited to, the implied
// warranties of merchantability and fitness for a particular purpose are disclaimed.
// In no event shall the Intel Corporation or contributors be liable for any direct,
// indirect, incidental, special, exemplary, or consequential damages
// (including, but not limited to, procurement of substitute goods or services;
// loss of use, data, or profits; or business interruption) however caused
// and on any theory of liability, whether in contract, strict liability,
// or tort (including negligence or otherwise) arising in any way out of
// the use of this software, even if advised of the possibility of such damage.
//
//M*/

#ifndef __OPENCV_OBJDETECT_HPP__
#define __OPENCV_OBJDETECT_HPP__

#include "opencv2/core.hpp"

/**
@defgroup objdetect Object Detection

Haar Feature-based Cascade Classifier for Object Detection
----------------------------------------------------------

The object detector described below was initially proposed by Paul Viola @cite Viola01 and
improved by Rainer Lienhart @cite Lienhart02 .

First, a classifier (namely a *cascade of boosted classifiers working with Haar-like features*) is
trained with a few hundred sample views of a particular object (e.g., a face or a car), called
positive examples, that are scaled to the same size (say, 20x20), and negative examples, which are
arbitrary images of the same size.

After a classifier is trained, it can be applied to a region of interest (of the same size as used
during the training) in an input image. The classifier outputs a "1" if the region is likely to show
the object (e.g., a face or a car), and "0" otherwise. To search for the object in the whole image,
one can move the search window across the image and check every location using the classifier. The
classifier is designed so that it can be easily "resized" in order to find objects of interest at
different sizes, which is more efficient than resizing the image itself. So, to find an object of
unknown size in the image, the scan procedure should be repeated several times at different
scales.
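
As a rough illustration of the multi-scale scan (a sketch under stated assumptions, not the
library's implementation), the scales visited can be enumerated by multiplying the training window
size by a fixed step until the scaled window no longer fits in the image. The helper name
`scanScales` and its parameters are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: lists the window scales a multi-scale scan would visit.
// winW x winH is the training window size, imgW x imgH the image size,
// and step (> 1) the ratio between consecutive scales.
std::vector<double> scanScales(int winW, int winH, int imgW, int imgH, double step)
{
    assert(step > 1.0 && winW > 0 && winH > 0);
    std::vector<double> scales;
    // Stop as soon as the scaled window exceeds the image in either dimension.
    for (double s = 1.0; s * winW <= imgW && s * winH <= imgH; s *= step)
        scales.push_back(s);
    return scales;
}
```

With a 20x20 window, an 80x80 image and a step of 2, the sketch yields the scales 1, 2 and 4. A
smaller step gives a denser set of scales at a higher computational cost.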

The word "cascade" in the classifier name means that the resultant classifier consists of several
simpler classifiers (*stages*) that are applied sequentially to a region of interest until at some
stage the candidate is rejected or all the stages are passed. The word "boosted" means that the
classifiers at every stage of the cascade are complex themselves and they are built out of basic
classifiers using one of four different boosting techniques (weighted voting). Currently Discrete
Adaboost, Real Adaboost, Gentle Adaboost and Logitboost are supported. The basic classifiers are
decision-tree classifiers with at least 2 leaves. Haar-like features are the input to the basic
classifiers, and are calculated as described below. The current algorithm uses the following
Haar-like features:

![image](pics/haarfeatures.png)

The feature used in a particular classifier is specified by its shape (1a, 2b etc.), position within
the region of interest and the scale (this scale is not the same as the scale used at the detection
stage, though these two scales are multiplied). For example, in the case of the third line feature
(2c) the response is calculated as the difference between the sum of image pixels under the
rectangle covering the whole feature (including the two white stripes and the black stripe in the
middle) and the sum of the image pixels under the black stripe multiplied by 3 in order to
compensate for the differences in the size of areas. The sums of pixel values over rectangular
regions are calculated rapidly using integral images (see below and the integral description).
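
The computation described above can be sketched with the standard library only. The sketch assumes
a feature made of three equal-width stripes placed side by side. The names `integralImage`,
`rectSum` and `lineFeature` are hypothetical and are not part of the OpenCV API; the library's
own feature evaluation may differ in layout and normalization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Integral image with one extra row and column of zeros:
// ii[y][x] is the sum of img over rows [0, y) and columns [0, x).
std::vector<std::vector<long long>> integralImage(const std::vector<std::vector<int>>& img)
{
    const std::size_t h = img.size(), w = h ? img[0].size() : 0;
    std::vector<std::vector<long long>> ii(h + 1, std::vector<long long>(w + 1, 0));
    for (std::size_t y = 0; y < h; ++y)
        for (std::size_t x = 0; x < w; ++x)
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x];
    return ii;
}

// Sum of pixels in the rectangle [x, x + w) x [y, y + h) using four lookups.
long long rectSum(const std::vector<std::vector<long long>>& ii, int x, int y, int w, int h)
{
    assert(x >= 0 && y >= 0 && w >= 0 && h >= 0);
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x];
}

// Three-stripe line feature: sum over the whole rectangle minus 3 times the
// sum under the middle stripe. stripeW is the width of one stripe.
long long lineFeature(const std::vector<std::vector<long long>>& ii,
                      int x, int y, int stripeW, int h)
{
    return rectSum(ii, x, y, 3 * stripeW, h) - 3 * rectSum(ii, x + stripeW, y, stripeW, h);
}
```

On a uniform image the response is zero, because the middle stripe covers exactly one third of the
feature area. A bright middle stripe on a dark background gives a negative response.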

To see the object detector at work, have a look at the facedetect demo:
<https://github.com/Itseez/opencv/tree/master/samples/cpp/dbt_face_detection.cpp>

The following reference is for the detection part only. There is a separate application called
opencv_traincascade that can train a cascade of boosted classifiers from a set of samples.

@note In the new C++ interface it is also possible to use LBP (local binary pattern) features in
addition to Haar-like features.

[Viola01] Paul Viola and Michael J. Jones. Rapid Object Detection using a Boosted Cascade of Simple
Features. IEEE CVPR, 2001. The paper is available online at
<http://research.microsoft.com/en-us/um/people/viola/Pubs/Detect/violaJones_CVPR2001.pdf>

@{
    @defgroup objdetect_c C API
@}
 */

typedef struct CvHaarClassifierCascade CvHaarClassifierCascade;

namespace cv
{

//! @addtogroup objdetect
//! @{

///////////////////////////// Object Detection ////////////////////////////

//! class for grouping object candidates, detected by Cascade Classifier, HOG etc.
//! instance of the class is to be passed to cv::partition (see cxoperations.hpp)
class CV_EXPORTS SimilarRects
{
public:
    SimilarRects(double _eps) : eps(_eps) {}
    inline bool operator()(const Rect& r1, const Rect& r2) const
    {
        double delta = eps*(std::min(r1.width, r2.width) + std::min(r1.height, r2.height))*0.5;
        return std::abs(r1.x - r2.x) <= delta &&
            std::abs(r1.y - r2.y) <= delta &&
            std::abs(r1.x + r1.width - r2.x - r2.width) <= delta &&
            std::abs(r1.y + r1.height - r2.y - r2.height) <= delta;
    }
    double eps;
};

/** @brief Groups the object candidate rectangles.

@param rectList Input/output vector of rectangles. Output vector includes retained and grouped
rectangles. (The Python list is not modified in place.)
@param groupThreshold Minimum possible number of rectangles minus 1. A group of rectangles is
retained only if it contains more than groupThreshold rectangles.
@param eps Relative difference between sides of the rectangles to merge them into a group.

The function is a wrapper for the generic function partition . It clusters all the input rectangles
using a rectangle equivalence criterion that combines rectangles with similar sizes and similar
locations. The similarity is defined by eps. When eps=0 , no clustering is done at all. If
\f$\texttt{eps}\rightarrow +\infty\f$ , all the rectangles are put in one cluster. Then, the small
clusters containing less than or equal to groupThreshold rectangles are rejected. For each remaining
cluster, the average rectangle is computed and put into the output rectangle list.
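
The steps above can be sketched with the standard library only. This is an illustration under
stated assumptions, not the library's implementation: `Box` stands in for Rect, `similar` repeats
the test from SimilarRects, and `groupBoxes` is a hypothetical helper that uses a small union-find
in place of partition. Any further filtering the real function may apply is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdlib>
#include <numeric>
#include <vector>

struct Box { int x, y, width, height; };  // stand-in for cv::Rect

// Same comparison as SimilarRects::operator() in this header.
bool similar(const Box& a, const Box& b, double eps)
{
    double delta = eps * (std::min(a.width, b.width) + std::min(a.height, b.height)) * 0.5;
    return std::abs(a.x - b.x) <= delta &&
           std::abs(a.y - b.y) <= delta &&
           std::abs(a.x + a.width - b.x - b.width) <= delta &&
           std::abs(a.y + a.height - b.y - b.height) <= delta;
}

// Hypothetical helper: cluster the boxes, drop small clusters, average the rest.
std::vector<Box> groupBoxes(const std::vector<Box>& boxes, int groupThreshold, double eps)
{
    const int n = static_cast<int>(boxes.size());
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    // Union-find root lookup with path halving.
    auto find = [&parent](int i) -> int {
        while (parent[i] != i) i = parent[i] = parent[parent[i]];
        return i;
    };
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (similar(boxes[i], boxes[j], eps)) {
                int ri = find(i), rj = find(j);
                if (ri != rj) parent[ri] = rj;
            }

    // Accumulate coordinate sums and member counts per cluster root.
    std::vector<Box> sum(n, Box{0, 0, 0, 0});
    std::vector<int> count(n, 0);
    for (int i = 0; i < n; ++i) {
        int r = find(i);
        sum[r].x += boxes[i].x;          sum[r].y += boxes[i].y;
        sum[r].width += boxes[i].width;  sum[r].height += boxes[i].height;
        ++count[r];
    }

    std::vector<Box> out;
    for (int r = 0; r < n; ++r) {
        // Skip non-root entries and clusters with at most groupThreshold members.
        if (count[r] == 0 || count[r] <= groupThreshold) continue;
        double c = count[r];
        out.push_back(Box{static_cast<int>(std::lround(sum[r].x / c)),
                          static_cast<int>(std::lround(sum[r].y / c)),
                          static_cast<int>(std::lround(sum[r].width / c)),
                          static_cast<int>(std::lround(sum[r].height / c))});
    }
    return out;
}
```

In this sketch, three nearly identical boxes plus one isolated box give a single averaged box when
groupThreshold is 1, because the isolated box forms a cluster of size 1 and is rejected.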
 */
CV_EXPORTS   void groupRectangles(std::vector<Rect>& rectList, int groupThreshold, double eps = 0.2);
/** @overload */
CV_EXPORTS_W void groupRectangles(CV_IN_OUT std::vector<Rect>& rectList, CV_OUT std::vector<int>& weights,
                                  int groupThreshold, double eps = 0.2);
/** @overload */
CV_EXPORTS   void groupRectangles(std::vector<Rect>& rectList, int groupThreshold,
                                  double eps, std::vector<int>* weights, std::vector<double>* levelWeights );
/** @overload */
CV_EXPORTS   void groupRectangles(std::vector<Rect>& rectList, std::vector<int>& rejectLevels,
                                  std::vector<double>& levelWeights, int groupThreshold, double eps = 0.2);
/** @overload */
CV_EXPORTS   void groupRectangles_meanshift(std::vector<Rect>& rectList, std::vector<double>& foundWeights,
                                            std::vector<double>& foundScales,
                                            double detectThreshold = 0.0, Size winDetSize = Size(64, 128));

template<> CV_EXPORTS void DefaultDeleter<CvHaarClassifierCascade>::operator ()(CvHaarClassifierCascade* obj) const;

enum { CASCADE_DO_CANNY_PRUNING    = 1,
       CASCADE_SCALE_IMAGE         = 2,
       CASCADE_FIND_BIGGEST_OBJECT = 4,
       CASCADE_DO_ROUGH_SEARCH     = 8
     };

class CV_EXPORTS_W BaseCascadeClassifier : public Algorithm
{
public:
    virtual ~BaseCascadeClassifier();
    virtual bool empty() const = 0;
    virtual bool load( const String& filename ) = 0;
    virtual void detectMultiScale( InputArray image,
                           CV_OUT std::vector<Rect>& objects,
                           double scaleFactor,
                           int minNeighbors, int flags,
                           Size minSize, Size maxSize ) = 0;

    virtual void detectMultiScale( InputArray image,
                           CV_OUT std::vector<Rect>& objects,
                           CV_OUT std::vector<int>& numDetections,
                           double scaleFactor,
                           int minNeighbors, int flags,
                           Size minSize, Size maxSize ) = 0;

    virtual void detectMultiScale( InputArray image,
                                   CV_OUT std::vector<Rect>& objects,
                                   CV_OUT std::vector<int>& rejectLevels,
                                   CV_OUT std::vector<double>& levelWeights,
                                   double scaleFactor,
                                   int minNeighbors, int flags,
                                   Size minSize, Size maxSize,
                                   bool outputRejectLevels ) = 0;

    virtual bool isOldFormatCascade() const = 0;
    virtual Size getOriginalWindowSize() const = 0;
    virtual int getFeatureType() const = 0;
    virtual void* getOldCascade() = 0;

    class CV_EXPORTS MaskGenerator
    {
    public:
        virtual ~MaskGenerator() {}
        virtual Mat generateMask(const Mat& src)=0;
        virtual void initializeMask(const Mat& /*src*/) { }
    };
    virtual void setMaskGenerator(const Ptr<MaskGenerator>& maskGenerator) = 0;
    virtual Ptr<MaskGenerator> getMaskGenerator() = 0;
};

/** @brief Cascade classifier class for object detection.
 */
class CV_EXPORTS_W CascadeClassifier
{
public:
    CV_WRAP CascadeClassifier();
    /** @brief Loads a classifier from a file.

    @param filename Name of the file from which the classifier is loaded.
     */
    CV_WRAP CascadeClassifier(const String& filename);
    ~CascadeClassifier();
    /** @brief Checks whether the classifier has been loaded.
    */
    CV_WRAP bool empty() const;
    /** @brief Loads a classifier from a file.

    @param filename Name of the file from which the classifier is loaded. The file may contain an old
    HAAR classifier trained by the haartraining application or a new cascade classifier trained by the
    traincascade application.
     */
    CV_WRAP bool load( const String& filename );
    /** @brief Reads a classifier from a FileStorage node.

    @note The file may contain only a new cascade classifier (trained by the traincascade
    application).
     */
    CV_WRAP bool read( const FileNode& node );

    /** @brief Detects objects of different sizes in the input image. The detected objects are returned as a list
    of rectangles.

    @param image Matrix of the type CV_8U containing an image where objects are detected.
    @param objects Vector of rectangles where each rectangle contains the detected object. The
    rectangles may be partially outside the original image.
    @param scaleFactor Parameter specifying how much the image size is reduced at each image scale.
    @param minNeighbors Parameter specifying how many neighbors each candidate rectangle should have
    to retain it.
    @param flags Parameter with the same meaning for an old cascade as in the function
    cvHaarDetectObjects. It is not used for a new cascade.
    @param minSize Minimum possible object size. Objects smaller than that are ignored.
    @param maxSize Maximum possible object size. Objects larger than that are ignored.

    The function is parallelized with the TBB library.

    @note
       -   (Python) A face detection example using cascade classifiers can be found at
            opencv_source_code/samples/python2/facedetect.py
    */
    CV_WRAP void detectMultiScale( InputArray image,
                          CV_OUT std::vector<Rect>& objects,
                          double scaleFactor = 1.1,
                          int minNeighbors = 3, int flags = 0,
                          Size minSize = Size(),
                          Size maxSize = Size() );

    /** @overload
    @param image Matrix of the type CV_8U containing an image where objects are detected.
    @param objects Vector of rectangles where each rectangle contains the detected object. The
    rectangles may be partially outside the original image.
    @param numDetections Vector of detection numbers for the corresponding objects. An object's number
    of detections is the number of neighboring positively classified rectangles that were joined
    together to form the object.
    @param scaleFactor Parameter specifying how much the image size is reduced at each image scale.
    @param minNeighbors Parameter specifying how many neighbors each candidate rectangle should have
    to retain it.
    @param flags Parameter with the same meaning for an old cascade as in the function
    cvHaarDetectObjects. It is not used for a new cascade.
    @param minSize Minimum possible object size. Objects smaller than that are ignored.
    @param maxSize Maximum possible object size. Objects larger than that are ignored.
    */
    CV_WRAP_AS(detectMultiScale2) void detectMultiScale( InputArray image,
                          CV_OUT std::vector<Rect>& objects,
                          CV_OUT std::vector<int>& numDetections,
                          double scaleFactor=1.1,
                          int minNeighbors=3, int flags=0,
                          Size minSize=Size(),
                          Size maxSize=Size() );

    /** @overload
    If `outputRejectLevels` is `true`, returns `rejectLevels` and `levelWeights`.
    */
    CV_WRAP_AS(detectMultiScale3) void detectMultiScale( InputArray image,
                                  CV_OUT std::vector<Rect>& objects,
                                  CV_OUT std::vector<int>& rejectLevels,
                                  CV_OUT std::vector<double>& levelWeights,
                                  double scaleFactor = 1.1,
                                  int minNeighbors = 3, int flags = 0,
                                  Size minSize = Size(),
                                  Size maxSize = Size(),
                                  bool outputRejectLevels = false );

    CV_WRAP bool isOldFormatCascade() const;
    CV_WRAP Size getOriginalWindowSize() const;
    CV_WRAP int getFeatureType() const;
    void* getOldCascade();

    CV_WRAP static bool convert(const String& oldcascade, const String& newcascade);

    void setMaskGenerator(const Ptr<BaseCascadeClassifier::MaskGenerator>& maskGenerator);
    Ptr<BaseCascadeClassifier::MaskGenerator> getMaskGenerator();

    Ptr<BaseCascadeClassifier> cc;
};

CV_EXPORTS Ptr<BaseCascadeClassifier::MaskGenerator> createFaceDetectionMaskGenerator();

//////////////// HOG (Histogram-of-Oriented-Gradients) Descriptor and Object Detector //////////////

//! struct for detection region of interest (ROI)
struct DetectionROI
{
   //! scale(size) of the bounding box
   double scale;
   //! set of requested locations to be evaluated
   std::vector<cv::Point> locations;
   //! vector that will contain confidence values for each location
   std::vector<double> confidences;
};

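/** HOG (Histogram of Oriented Gradients) descriptor and object detector.

The sketch below illustrates how the descriptor length of one detection window follows from the
window, block, stride and cell sizes and the number of bins. It assumes the usual dense HOG layout,
in which every block position on the stride grid contributes one histogram per cell. `Size2` and
`hogDescriptorLength` are hypothetical names used only for this example; getDescriptorSize() is the
authoritative source for the actual value.

```cpp
#include <cassert>
#include <cstddef>

struct Size2 { int width, height; };  // stand-in for cv::Size

// Hypothetical helper: number of floats in the descriptor of one window,
// assuming blocks are laid out on a regular stride grid inside the window.
std::size_t hogDescriptorLength(Size2 win, Size2 block, Size2 stride, Size2 cell, int nbins)
{
    // The block grid must tile the window and the cells must tile each block.
    assert((win.width - block.width) % stride.width == 0);
    assert((win.height - block.height) % stride.height == 0);
    assert(block.width % cell.width == 0 && block.height % cell.height == 0);
    const std::size_t cellsPerBlock = (block.width / cell.width) * (block.height / cell.height);
    const std::size_t blocksX = (win.width - block.width) / stride.width + 1;
    const std::size_t blocksY = (win.height - block.height) / stride.height + 1;
    return cellsPerBlock * blocksX * blocksY * nbins;
}
```

With the default constructor values in this header (64x128 window, 16x16 blocks, 8x8 stride,
8x8 cells, 9 bins) the sketch gives 4 * 7 * 15 * 9 = 3780 values.
 */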
struct CV_EXPORTS_W HOGDescriptor
{
public:
    enum { L2Hys = 0
         };
    enum { DEFAULT_NLEVELS = 64
         };

    CV_WRAP HOGDescriptor() : winSize(64,128), blockSize(16,16), blockStride(8,8),
        cellSize(8,8), nbins(9), derivAperture(1), winSigma(-1),
        histogramNormType(HOGDescriptor::L2Hys), L2HysThreshold(0.2), gammaCorrection(true),
        free_coef(-1.f), nlevels(HOGDescriptor::DEFAULT_NLEVELS), signedGradient(false)
    {}

    CV_WRAP HOGDescriptor(Size _winSize, Size _blockSize, Size _blockStride,
                  Size _cellSize, int _nbins, int _derivAperture=1, double _winSigma=-1,
                  int _histogramNormType=HOGDescriptor::L2Hys,
                  double _L2HysThreshold=0.2, bool _gammaCorrection=false,
                  int _nlevels=HOGDescriptor::DEFAULT_NLEVELS, bool _signedGradient=false)
    : winSize(_winSize), blockSize(_blockSize), blockStride(_blockStride), cellSize(_cellSize),
    nbins(_nbins), derivAperture(_derivAperture), winSigma(_winSigma),
    histogramNormType(_histogramNormType), L2HysThreshold(_L2HysThreshold),
    gammaCorrection(_gammaCorrection), free_coef(-1.f), nlevels(_nlevels), signedGradient(_signedGradient)
    {}

    CV_WRAP HOGDescriptor(const String& filename)
    {
        load(filename);
    }

    HOGDescriptor(const HOGDescriptor& d)
    {
        d.copyTo(*this);
    }

    virtual ~HOGDescriptor() {}

    CV_WRAP size_t getDescriptorSize() const;
    CV_WRAP bool checkDetectorSize() const;
    CV_WRAP double getWinSigma() const;

    CV_WRAP virtual void setSVMDetector(InputArray _svmdetector);

    virtual bool read(FileNode& fn);
    virtual void write(FileStorage& fs, const String& objname) const;

    CV_WRAP virtual bool load(const String& filename, const String& objname = String());
    CV_WRAP virtual void save(const String& filename, const String& objname = String()) const;
    virtual void copyTo(HOGDescriptor& c) const;

    CV_WRAP virtual void compute(InputArray img,
                         CV_OUT std::vector<float>& descriptors,
                         Size winStride = Size(), Size padding = Size(),
                         const std::vector<Point>& locations = std::vector<Point>()) const;

    //! with found weights output
    CV_WRAP virtual void detect(const Mat& img, CV_OUT std::vector<Point>& foundLocations,
                        CV_OUT std::vector<double>& weights,
                        double hitThreshold = 0, Size winStride = Size(),
                        Size padding = Size(),
                        const std::vector<Point>& searchLocations = std::vector<Point>()) const;
    //! without found weights output
    virtual void detect(const Mat& img, CV_OUT std::vector<Point>& foundLocations,
                        double hitThreshold = 0, Size winStride = Size(),
                        Size padding = Size(),
                        const std::vector<Point>& searchLocations=std::vector<Point>()) const;

    //! with result weights output
    CV_WRAP virtual void detectMultiScale(InputArray img, CV_OUT std::vector<Rect>& foundLocations,
                                  CV_OUT std::vector<double>& foundWeights, double hitThreshold = 0,
                                  Size winStride = Size(), Size padding = Size(), double scale = 1.05,
                                  double finalThreshold = 2.0,bool useMeanshiftGrouping = false) const;
    //! without found weights output
    virtual void detectMultiScale(InputArray img, CV_OUT std::vector<Rect>& foundLocations,
                                  double hitThreshold = 0, Size winStride = Size(),
                                  Size padding = Size(), double scale = 1.05,
                                  double finalThreshold = 2.0, bool useMeanshiftGrouping = false) const;

    CV_WRAP virtual void computeGradient(const Mat& img, CV_OUT Mat& grad, CV_OUT Mat& angleOfs,
                                 Size paddingTL = Size(), Size paddingBR = Size()) const;

    CV_WRAP static std::vector<float> getDefaultPeopleDetector();
    CV_WRAP static std::vector<float> getDaimlerPeopleDetector();

    CV_PROP Size winSize;
    CV_PROP Size blockSize;
    CV_PROP Size blockStride;
    CV_PROP Size cellSize;
    CV_PROP int nbins;
    CV_PROP int derivAperture;
    CV_PROP double winSigma;
    CV_PROP int histogramNormType;
    CV_PROP double L2HysThreshold;
    CV_PROP bool gammaCorrection;
    CV_PROP std::vector<float> svmDetector;
    UMat oclSvmDetector;
    float free_coef;
    CV_PROP int nlevels;
    CV_PROP bool signedGradient;


    //! evaluate specified ROI and return confidence value for each location
    virtual void detectROI(const cv::Mat& img, const std::vector<cv::Point> &locations,
                                   CV_OUT std::vector<cv::Point>& foundLocations, CV_OUT std::vector<double>& confidences,
                                   double hitThreshold = 0, cv::Size winStride = Size(),
                                   cv::Size padding = Size()) const;

    //! evaluate specified ROI and return confidence value for each location in multiple scales
    virtual void detectMultiScaleROI(const cv::Mat& img,
                                                       CV_OUT std::vector<cv::Rect>& foundLocations,
                                                       std::vector<DetectionROI>& locations,
                                                       double hitThreshold = 0,
                                                       int groupThreshold = 0) const;

    //! read/parse Dalal's alt model file
    void readALTModel(String modelfile);
    void groupRectangles(std::vector<cv::Rect>& rectList, std::vector<double>& weights, int groupThreshold, double eps) const;
};

//! @} objdetect

}

#include "opencv2/objdetect/detection_based_tracker.hpp"

#ifndef DISABLE_OPENCV_24_COMPATIBILITY
#include "opencv2/objdetect/objdetect_c.h"
#endif

#endif