Introduction {#intro}
============

OpenCV (Open Source Computer Vision Library: <http://opencv.org>) is an open-source BSD-licensed
library that includes several hundred computer vision algorithms. This document describes the
so-called OpenCV 2.x API, which is essentially a C++ API, as opposed to the C-based OpenCV 1.x API.
The latter is described in opencv1x.pdf.

OpenCV has a modular structure, which means that the package includes several shared or static
libraries. The following modules are available:

-   @ref core - a compact module defining basic data structures, including the dense
    multi-dimensional array Mat and basic functions used by all other modules.
-   @ref imgproc - an image processing module that includes linear and non-linear image filtering,
    geometrical image transformations (resize, affine and perspective warping, generic table-based
    remapping), color space conversion, histograms, and so on.
-   **video** - a video analysis module that includes motion estimation, background subtraction,
    and object tracking algorithms.
-   **calib3d** - basic multiple-view geometry algorithms, single and stereo camera calibration,
    object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
-   **features2d** - salient feature detectors, descriptors, and descriptor matchers.
-   **objdetect** - detection of objects and instances of predefined classes (for example,
    faces, eyes, mugs, people, cars, and so on).
-   **highgui** - an easy-to-use interface to simple UI capabilities.
-   **videoio** - an easy-to-use interface to video capturing and video codecs.
-   **gpu** - GPU-accelerated algorithms from different OpenCV modules.
-   ... some other helper modules, such as FLANN and Google test wrappers, Python bindings, and
    others.

The chapters that follow describe the functionality of each module. But first, make sure to
get familiar with the common API concepts used throughout the library.

API Concepts
------------

### cv Namespace

All the OpenCV classes and functions are placed into the cv namespace. Therefore, to access this
functionality from your code, use the cv:: specifier or the using namespace cv; directive:
@code
#include "opencv2/core.hpp"
...
cv::Mat H = cv::findHomography(points1, points2, CV_RANSAC, 5);
...
@endcode
or:
~~~
    #include "opencv2/core.hpp"
    using namespace cv;
    ...
    Mat H = findHomography(points1, points2, CV_RANSAC, 5);
    ...
~~~
Some of the current or future OpenCV external names may conflict with STL or other libraries. In
this case, use explicit namespace specifiers to resolve the name conflicts:
@code
    Mat a(100, 100, CV_32F);
    randu(a, Scalar::all(1), Scalar::all(std::rand()));
    cv::log(a, a);
    a /= std::log(2.);
@endcode

### Automatic Memory Management

OpenCV handles all the memory automatically.

First of all, std::vector, Mat, and other data structures used by the functions and methods have
destructors that deallocate the underlying memory buffers when needed. Note that the destructors do
not always deallocate the buffers: in the case of Mat, they take into account possible data
sharing. A destructor decrements the reference counter associated with the matrix data buffer. The
buffer is deallocated if and only if the reference counter reaches zero, that is, when no other
structures refer to the same buffer. Similarly, when a Mat instance is copied, no actual data is
really copied. Instead, the reference counter is incremented to memorize that there is another owner
of the same data. There is also the Mat::clone method that creates a full copy of the matrix data.
See the example below:
@code
    // create a big 8 MB matrix
    Mat A(1000, 1000, CV_64F);

    // create another header for the same matrix;
    // this is an instant operation, regardless of the matrix size.
    Mat B = A;
    // create another header for the 3-rd row of A; no data is copied either
    Mat C = B.row(3);
    // now create a separate copy of the matrix
    Mat D = B.clone();
    // copy the 5-th row of B to C, that is, copy the 5-th row of A
    // to the 3-rd row of A.
    B.row(5).copyTo(C);
    // now let A and D share the data; after that the modified version
    // of A is still referenced by B and C.
    A = D;
    // now make B an empty matrix (which references no memory buffers),
    // but the modified version of A will still be referenced by C,
    // despite that C is just a single row of the original A
    B.release();

    // finally, make a full copy of C. As a result, the big modified
    // matrix will be deallocated, since it is not referenced by anyone
    C = C.clone();
@endcode
You see that the use of Mat and other basic structures is simple. But what about high-level classes
or even user data types created without taking automatic memory management into account? For them,
OpenCV offers the Ptr template class that is similar to std::shared\_ptr from C++11. So, instead of
using plain pointers:
@code
    T* ptr = new T(...);
@endcode
you can use:
@code
    Ptr<T> ptr(new T(...));
@endcode
or:
@code
    Ptr<T> ptr = makePtr<T>(...);
@endcode
Ptr\<T\> encapsulates a pointer to a T instance and a reference counter associated with the pointer.
See the Ptr description for details.

### Automatic Allocation of the Output Data

OpenCV deallocates the memory automatically, as well as automatically allocates the memory for
output function parameters most of the time. So, if a function has one or more input arrays (cv::Mat
instances) and some output arrays, the output arrays are automatically allocated or reallocated. The
size and type of the output arrays are determined from the size and type of input arrays. If needed,
the functions take extra parameters that help to figure out the output array properties.

Example:
@code
    #include "opencv2/imgproc.hpp"
    #include "opencv2/highgui.hpp"

    using namespace cv;

    int main(int, char**)
    {
        VideoCapture cap(0);
        if(!cap.isOpened()) return -1;

        Mat frame, edges;
        namedWindow("edges", 1);
        for(;;)
        {
            cap >> frame;
            cvtColor(frame, edges, COLOR_BGR2GRAY);
            GaussianBlur(edges, edges, Size(7,7), 1.5, 1.5);
            Canny(edges, edges, 0, 30, 3);
            imshow("edges", edges);
            if(waitKey(30) >= 0) break;
        }
        return 0;
    }
@endcode
The array frame is automatically allocated by the \>\> operator since the video frame resolution and
the bit depth are known to the video capturing module. The array edges is automatically allocated by
the cvtColor function. It has the same size and the same bit depth as the input array. The number of
channels is 1 because the color conversion code COLOR\_BGR2GRAY is passed, which means a color to
grayscale conversion. Note that frame and edges are allocated only once during the first execution
of the loop body since all the next video frames have the same resolution. If you somehow change the
video resolution, the arrays are automatically reallocated.

The key component of this technology is the Mat::create method. It takes the desired array size and
type. If the array already has the specified size and type, the method does nothing. Otherwise, it
releases the previously allocated data, if any (this part involves decrementing the reference
counter and comparing it with zero), and then allocates a new buffer of the required size. Most
functions call the Mat::create method for each output array, and so the automatic output data
allocation is implemented.

Some notable exceptions from this scheme are cv::mixChannels, cv::RNG::fill, and a few other
functions and methods. They are not able to allocate the output array, so you have to do this in
advance.

### Saturation Arithmetic

As a computer vision library, OpenCV deals a lot with image pixels that are often encoded in a
compact, 8- or 16-bit per channel, form and thus have a limited value range. Furthermore, certain
operations on images, like color space conversions, brightness/contrast adjustments, sharpening,
and complex interpolation (bi-cubic, Lanczos) can produce values out of the available range. If you
just store the lowest 8 (16) bits of the result, this leads to visual artifacts and may affect
further image analysis. To solve this problem, so-called *saturation* arithmetic is used. For
example, to store r, the result of an operation, to an 8-bit image, you find the nearest value
within the 0..255 range:

\f[I(x,y)= \min ( \max (\textrm{round}(r), 0), 255)\f]

Similar rules are applied to 8-bit signed, 16-bit signed and unsigned types. The same semantics is
used everywhere in the library. In C++ code, it is done using the saturate\_cast\<\> functions that
resemble standard C++ cast operations. See below the implementation of the formula provided above:
@code
    I.at<uchar>(y, x) = saturate_cast<uchar>(r);
@endcode
where cv::uchar is an OpenCV 8-bit unsigned integer type. In the optimized SIMD code, such SSE2
instructions as paddusb, packuswb, and so on are used. They help achieve exactly the same behavior
as in C++ code.

@note Saturation is not applied when the result is a 32-bit integer.

### Fixed Pixel Types. Limited Use of Templates

Templates are a great feature of C++ that enables implementation of very powerful, efficient and yet
safe data structures and algorithms. However, the extensive use of templates may dramatically
increase compilation time and code size. Besides, it is difficult to separate an interface and
implementation when templates are used exclusively. This could be fine for basic algorithms but not
good for computer vision libraries where a single algorithm may span thousands of lines of code.
Because of this and also to simplify development of bindings for other languages, like Python, Java,
Matlab that do not have templates at all or have limited template capabilities, the current OpenCV
implementation is based on polymorphism and runtime dispatching over templates. In those places
where runtime dispatching would be too slow (like pixel access operators), impossible (generic
Ptr\<\> implementation), or just very inconvenient (saturate\_cast\<\>()) the current implementation
introduces small template classes, methods, and functions. Anywhere else in the current OpenCV
version the use of templates is limited.

Consequently, there is a limited fixed set of primitive data types the library can operate on. That
is, array elements should have one of the following types:

-   8-bit unsigned integer (uchar)
-   8-bit signed integer (schar)
-   16-bit unsigned integer (ushort)
-   16-bit signed integer (short)
-   32-bit signed integer (int)
-   32-bit floating-point number (float)
-   64-bit floating-point number (double)
-   a tuple of several elements where all elements have the same type (one of the above). An array
    whose elements are such tuples is called a multi-channel array, as opposed to a
    single-channel array, whose elements are scalar values. The maximum possible number of
    channels is defined by the CV\_CN\_MAX constant, which is currently set to 512.

For these basic types, the following enumeration is applied:
@code
    enum { CV_8U=0, CV_8S=1, CV_16U=2, CV_16S=3, CV_32S=4, CV_32F=5, CV_64F=6 };
@endcode
Multi-channel (n-channel) types can be specified using the following options:

-   CV_8UC1 ... CV_64FC4 constants (for a number of channels from 1 to 4)
-   CV_8UC(n) ... CV_64FC(n) or CV_MAKETYPE(CV_8U, n) ... CV_MAKETYPE(CV_64F, n) macros when
    the number of channels is more than 4 or unknown at the compilation time.

@note `CV_32FC1 == CV_32F`, `CV_32FC2 == CV_32FC(2) == CV_MAKETYPE(CV_32F, 2)`, and
`CV_MAKETYPE(depth, n) == ((depth&7) + ((n-1)<<3))`. This means that the constant type is formed from
the depth, taking the lowest 3 bits, and the number of channels minus 1, taking the next
`log2(CV_CN_MAX)` bits.

Examples:
@code
    Mat mtx(3, 3, CV_32F); // make a 3x3 floating-point matrix
    Mat cmtx(10, 1, CV_64FC2); // make a 10x1 2-channel floating-point
                               // matrix (10-element complex vector)
    Mat img(Size(1920, 1080), CV_8UC3); // make a 3-channel (color) image
                                        // of 1920 columns and 1080 rows.
    Mat grayscale(img.size(), CV_MAKETYPE(img.depth(), 1)); // make a 1-channel image of
                                                            // the same size and same
                                                            // channel type as img
@endcode
Arrays with more complex elements cannot be constructed or processed using OpenCV. Furthermore, each
function or method can handle only a subset of all possible array types. Usually, the more complex
the algorithm is, the smaller the supported subset of formats is. See below typical examples of such
limitations:

-   The face detection algorithm only works with 8-bit grayscale or color images.
-   Linear algebra functions and most of the machine learning algorithms work with floating-point
    arrays only.
-   Basic functions, such as cv::add, support all types.
-   Color space conversion functions support 8-bit unsigned, 16-bit unsigned, and 32-bit
    floating-point types.

The subset of supported types for each function has been defined from practical needs and could be
extended in future based on user requests.

### InputArray and OutputArray

Many OpenCV functions process dense 2-dimensional or multi-dimensional numerical arrays. Usually,
such functions take cv::Mat as parameters, but in some cases it's more convenient to use
std::vector\<\> (for a point set, for example) or Matx\<\> (for 3x3 homography matrix and such). To
avoid many duplicates in the API, special "proxy" classes have been introduced. The base "proxy"
class is InputArray. It is used for passing read-only arrays on a function input. The OutputArray
class, derived from InputArray, is used to specify an output array for a function. Normally, you
should not care about those intermediate types (and you should not declare variables of those types
explicitly) - it will all just work automatically. You can assume that instead of
InputArray/OutputArray you can always use Mat, std::vector\<\>, Matx\<\>, Vec\<\> or Scalar. When a
function has an optional input or output array, and you do not have or do not want one, pass
cv::noArray().

### Error Handling

OpenCV uses exceptions to signal critical errors. When the input data has a correct format and
belongs to the specified value range, but the algorithm cannot succeed for some reason (for example,
the optimization algorithm did not converge), it returns a special error code (typically, just a
boolean variable).

The exceptions can be instances of the cv::Exception class or its derivatives. In its turn,
cv::Exception is a derivative of std::exception. So it can be gracefully handled in the code using
other standard C++ library components.

The exception is typically thrown either using the CV\_Error(errcode, description) macro, or its
printf-like CV\_Error\_(errcode, printf-spec, (printf-args)) variant, or using the
CV\_Assert(condition) macro that checks the condition and throws an exception when it is not
satisfied. For performance-critical code, there is CV\_DbgAssert(condition) that is only retained in
the Debug configuration. Due to the automatic memory management, all the intermediate buffers are
automatically deallocated in case of a sudden error. You only need to add a try statement to catch
exceptions, if needed:
@code
    try
    {
        ... // call OpenCV
    }
    catch( const cv::Exception& e )
    {
        const char* err_msg = e.what();
        std::cout << "exception caught: " << err_msg << std::endl;
    }
@endcode

### Multi-threading and Re-entrancy

The current OpenCV implementation is fully re-entrant. That is, the same function, the same
*constant* method of a class instance, or the same *non-constant* method of different class
instances can be called from different threads. Also, the same cv::Mat can be used in different
threads because the reference-counting operations use the architecture-specific atomic instructions.
    319