.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _`chapter-modeling`:

========
Modeling
========

Ceres solver consists of two distinct parts: a modeling API, which
provides a rich set of tools to construct an optimization problem one
term at a time, and a solver API that controls the minimization
algorithm. This chapter is devoted to the task of modeling
optimization problems using Ceres. :ref:`chapter-solving` discusses
the various ways in which an optimization problem can be solved using
Ceres.

Ceres solves robustified bounds constrained non-linear least squares
problems of the form:

.. math::
   :label: ceresproblem

   \min_{\mathbf{x}} &\quad \frac{1}{2}\sum_{i}
   \rho_i\left(\left\|f_i\left(x_{i_1},
   ... ,x_{i_k}\right)\right\|^2\right)  \\
   \text{s.t.} &\quad l_j \le x_j \le u_j

In Ceres parlance, the expression
:math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
is known as a **residual block**, where :math:`f_i(\cdot)` is a
:class:`CostFunction` that depends on the **parameter blocks**
:math:`\left\{x_{i_1},... , x_{i_k}\right\}`.

In most optimization problems small groups of scalars occur
together. For example, the three components of a translation vector
and the four components of the quaternion that define the pose of a
camera. We refer to such a group of scalars as a **parameter
block**. Of course a parameter block can be just a single scalar too.

:math:`\rho_i` is a :class:`LossFunction`. A :class:`LossFunction` is
a scalar valued function that is used to reduce the influence of
outliers on the solution of non-linear least squares problems.

:math:`l_j` and :math:`u_j` are lower and upper bounds on the
parameter block :math:`x_j`.

As a special case, when :math:`\rho_i(x) = x`, i.e., the identity
function, and :math:`l_j = -\infty` and :math:`u_j = \infty`, we get
the more familiar unconstrained `non-linear least squares problem
<http://en.wikipedia.org/wiki/Non-linear_least_squares>`_.

.. math::
   :label: ceresproblemunconstrained

   \frac{1}{2}\sum_{i} \left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2.

:class:`CostFunction`
---------------------

For each term in the objective function, a :class:`CostFunction` is
responsible for computing a vector of residuals and, if asked, a
vector of Jacobian matrices, i.e., given :math:`\left[x_{i_1}, ... ,
x_{i_k}\right]`, compute the vector
:math:`f_i\left(x_{i_1},...,x_{i_k}\right)` and the matrices

.. math:: J_{ij} = \frac{\partial}{\partial
          x_{i_j}}f_i\left(x_{i_1},...,x_{i_k}\right),\quad \forall j
          \in \{1, \ldots, k\}

.. class:: CostFunction

   .. code-block:: c++

    class CostFunction {
     public:
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const = 0;
      const vector<int32>& parameter_block_sizes() const;
      int num_residuals() const;

     protected:
      vector<int32>* mutable_parameter_block_sizes();
      void set_num_residuals(int num_residuals);
    };


The signature of the :class:`CostFunction` (number and sizes of input
parameter blocks and number of outputs) is stored in
:member:`CostFunction::parameter_block_sizes_` and
:member:`CostFunction::num_residuals_` respectively. User code
inheriting from this class is expected to set these two members with
the corresponding accessors. This information will be verified by the
:class:`Problem` when added with :func:`Problem::AddResidualBlock`.

.. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians)

   Compute the residual vector and the Jacobian matrices.

   ``parameters`` is an array of pointers to arrays containing the
   various parameter blocks. ``parameters`` has the same number of
   elements as :member:`CostFunction::parameter_block_sizes_` and the
   parameter blocks are in the same order as
   :member:`CostFunction::parameter_block_sizes_`.

   ``residuals`` is an array of size ``num_residuals_``.

   ``jacobians`` is an array of the same size as
   :member:`CostFunction::parameter_block_sizes_` containing pointers
   to storage for Jacobian matrices corresponding to each parameter
   block. The Jacobian matrices are in the same order as
   :member:`CostFunction::parameter_block_sizes_`. ``jacobians[i]`` is
   an array that contains :member:`CostFunction::num_residuals_` x
   :member:`CostFunction::parameter_block_sizes_` ``[i]``
   elements. Each Jacobian matrix is stored in row-major order, i.e.,
   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\partial \text{residuals}[r]}{\partial \text{parameters}[i][c]}`

   If ``jacobians`` is ``NULL``, then no derivatives are returned;
   this is the case when computing cost only. If ``jacobians[i]`` is
   ``NULL``, then the Jacobian matrix corresponding to the
   :math:`i^{\textrm{th}}` parameter block must not be returned; this
   is the case when a parameter block is marked constant.

   **NOTE** The return value indicates whether the computation of the
   residuals and/or jacobians was successful or not. This can be used,
   for instance, to communicate numerical failures in Jacobian
   computations.
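
   As an illustration, here is a minimal sketch (a hypothetical cost
   function, not part of the Ceres API) for the residual
   :math:`f(x) = 10 - x` with a single one dimensional parameter
   block, showing how the size accessors and the ``NULL`` checks are
   typically handled:

   .. code-block:: c++

    class QuadraticCostFunction : public CostFunction {
     public:
      QuadraticCostFunction() {
        set_num_residuals(1);
        mutable_parameter_block_sizes()->push_back(1);  // One block of size 1.
      }

      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const {
        const double x = parameters[0][0];
        residuals[0] = 10.0 - x;
        // jacobians is NULL when only the cost is needed; jacobians[i]
        // is NULL when parameter block i is constant.
        if (jacobians != NULL && jacobians[0] != NULL) {
          jacobians[0][0] = -1.0;  // d residual / d x.
        }
        return true;
      }
    };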

:class:`SizedCostFunction`
--------------------------

.. class:: SizedCostFunction

   If the sizes of the parameter blocks and the size of the residual
   vector are known at compile time (this is the common case),
   :class:`SizedCostFunction` can be used; these values are specified
   as template parameters and the user only needs to implement
   :func:`CostFunction::Evaluate`.

   .. code-block:: c++

    template<int kNumResiduals,
             int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
             int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
    class SizedCostFunction : public CostFunction {
     public:
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const = 0;
    };

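   For instance, the hypothetical cost function sketched in the
   previous section becomes more compact with the sizes fixed at
   compile time:

   .. code-block:: c++

    class QuadraticCostFunction
        : public SizedCostFunction<1 /* residuals */, 1 /* size of x */> {
     public:
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const {
        const double x = parameters[0][0];
        residuals[0] = 10.0 - x;
        if (jacobians != NULL && jacobians[0] != NULL) {
          jacobians[0][0] = -1.0;
        }
        return true;
      }
    };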

:class:`AutoDiffCostFunction`
-----------------------------

.. class:: AutoDiffCostFunction

   Defining a :class:`CostFunction` or a :class:`SizedCostFunction`
   can be tedious and error prone, especially when computing
   derivatives. To this end Ceres provides `automatic differentiation
   <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.

   .. code-block:: c++

     template <typename CostFunctor,
            int kNumResiduals,  // Number of residuals, or ceres::DYNAMIC.
            int N0,       // Number of parameters in block 0.
            int N1 = 0,   // Number of parameters in block 1.
            int N2 = 0,   // Number of parameters in block 2.
            int N3 = 0,   // Number of parameters in block 3.
            int N4 = 0,   // Number of parameters in block 4.
            int N5 = 0,   // Number of parameters in block 5.
            int N6 = 0,   // Number of parameters in block 6.
            int N7 = 0,   // Number of parameters in block 7.
            int N8 = 0,   // Number of parameters in block 8.
            int N9 = 0>   // Number of parameters in block 9.
     class AutoDiffCostFunction : public
     SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
      public:
       explicit AutoDiffCostFunction(CostFunctor* functor);
       // Ignore the template parameter kNumResiduals and use
       // num_residuals instead.
       AutoDiffCostFunction(CostFunctor* functor, int num_residuals);
     };

   To get an auto differentiated cost function, you must define a
   class with a templated ``operator()`` (a functor) that computes the
   cost function in terms of the template parameter ``T``. The
   autodiff framework substitutes appropriate ``Jet`` objects for
   ``T`` in order to compute the derivative when necessary, but this
   is hidden, and you should write the function as if ``T`` were a
   scalar type (e.g. a double-precision floating point number).

   The function must write the computed value in the last argument
   (the only non-``const`` one) and return true to indicate success.

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional vector
   parameters and :math:`k` is a constant. The form of this error,
   which is the difference between a constant and an expression, is a
   common pattern in least squares problems. For example, the value
   :math:`x^\top y` might be the model expectation for a series of
   measurements, where there is an instance of the cost function for
   each measurement :math:`k`.

   The actual cost added to the total problem is :math:`e^2`, or
   :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
   by the optimization framework.

   To write an auto-differentiable cost function for the above model,
   first define the object

   .. code-block:: c++

    class MyScalarCostFunctor {
     public:
      MyScalarCostFunctor(double k): k_(k) {}

      template <typename T>
      bool operator()(const T* const x, const T* const y, T* e) const {
        e[0] = T(k_) - x[0] * y[0] - x[1] * y[1];
        return true;
      }

     private:
      double k_;
    };


   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to arrays
   of ``T``. If there were three input parameters, then the third input
   parameter would come after ``y``. The output is always the last
   parameter, and is also a pointer to an array. In the example above,
   ``e`` is a scalar, so only ``e[0]`` is set.

   Then given this class definition, the auto differentiated cost
   function for it can be constructed as follows.

   .. code-block:: c++

    CostFunction* cost_function
        = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
            new MyScalarCostFunctor(1.0));              ^  ^  ^
                                                        |  |  |
                            Dimension of residual ------+  |  |
                            Dimension of x ----------------+  |
                            Dimension of y -------------------+


   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`AutoDiffCostFunction` also supports cost functions with a
   runtime-determined number of residuals. For example:

   .. code-block:: c++

     CostFunction* cost_function
         = new AutoDiffCostFunction<
               CostFunctorWithDynamicNumResiduals,
               DYNAMIC,   // Indicate dynamic number of residuals.
               2,         // Dimension of x.
               2>(        // Dimension of y.
             new CostFunctorWithDynamicNumResiduals(1.0),
             runtime_number_of_residuals);  // Actual number of residuals.

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   **WARNING 1** Since the functor will get instantiated with
   different types for ``T``, you must convert from other numeric
   types to ``T`` before mixing computations with other variables
   of type ``T``. In the example above, this is seen where instead of
   using ``k_`` directly, ``k_`` is wrapped with ``T(k_)``.

   **WARNING 2** A common beginner's error when first using
   :class:`AutoDiffCostFunction` is to get the sizing wrong. In particular,
   there is a tendency to set the template parameters to (dimension of
   residual, number of parameters) instead of passing a dimension
   parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the 2
   as the last template argument.
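
   Once constructed, the cost function is handed off to a
   :class:`Problem` (described later in this chapter). A minimal
   sketch, assuming a :class:`Problem` named ``problem`` already
   exists:

   .. code-block:: c++

    double x[2] = {1.0, 2.0};
    double y[2] = {3.0, 4.0};
    // NULL means no loss function; x and y are implicitly added to
    // the problem as parameter blocks.
    problem.AddResidualBlock(cost_function, NULL, x, y);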


:class:`DynamicAutoDiffCostFunction`
------------------------------------

.. class:: DynamicAutoDiffCostFunction

   :class:`AutoDiffCostFunction` requires that the number of parameter
   blocks and their sizes be known at compile time. It also has an
   upper limit of 10 parameter blocks. In a number of applications,
   this is not enough, e.g., Bezier curve fitting, neural network
   training, etc.

     .. code-block:: c++

      template <typename CostFunctor, int Stride = 4>
      class DynamicAutoDiffCostFunction : public CostFunction {
      };

   In such cases :class:`DynamicAutoDiffCostFunction` can be
   used. Like :class:`AutoDiffCostFunction` the user must define a
   templated functor, but the signature of the functor differs
   slightly. The expected interface for the cost functors is:

     .. code-block:: c++

       struct MyCostFunctor {
         template<typename T>
         bool operator()(T const* const* parameters, T* residuals) const {
           // Compute residuals from the variable number of parameter blocks.
         }
       };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic autodiff cost
   function. For example:

     .. code-block:: c++

       DynamicAutoDiffCostFunction<MyCostFunctor, 4> cost_function(
           new MyCostFunctor());
       cost_function.AddParameterBlock(5);
       cost_function.AddParameterBlock(10);
       cost_function.SetNumResiduals(21);

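   For completeness, here is a minimal sketch of what such a functor
   might look like for the sizes registered above; the arithmetic is
   made up purely for illustration:

     .. code-block:: c++

       struct MyCostFunctor {
         template<typename T>
         bool operator()(T const* const* parameters, T* residuals) const {
           const T* block0 = parameters[0];  // Size 5, from AddParameterBlock(5).
           const T* block1 = parameters[1];  // Size 10, from AddParameterBlock(10).
           // 21 residuals, matching SetNumResiduals(21).
           for (int i = 0; i < 21; ++i) {
             residuals[i] = block0[i % 5] - block1[i % 10];
           }
           return true;
         }
       };
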
   Under the hood, the implementation evaluates the cost function
   multiple times, computing a small set of the derivatives (four by
   default, controlled by the ``Stride`` template parameter) with each
   pass. There is a performance tradeoff with the size of the passes;
   smaller sizes are more cache efficient but result in a larger
   number of passes, while larger stride lengths can destroy
   cache-locality while reducing the number of passes over the cost
   function. The optimal value depends on the number and sizes of the
   various parameter blocks.

   As a rule of thumb, try using :class:`AutoDiffCostFunction` before
   you use :class:`DynamicAutoDiffCostFunction`.

:class:`NumericDiffCostFunction`
--------------------------------

.. class:: NumericDiffCostFunction

   In some cases, it is not possible to define a templated cost
   functor, for example when the evaluation of the residual involves a
   call to a library function that you do not have control over.  In
   such a situation, `numerical differentiation
   <http://en.wikipedia.org/wiki/Numerical_differentiation>`_ can be
   used.

    .. code-block:: c++

      template <typename CostFunctor,
                NumericDiffMethod method = CENTRAL,
                int kNumResiduals = 0,  // Number of residuals, or ceres::DYNAMIC.
                int N0 = 0,   // Number of parameters in block 0.
                int N1 = 0,   // Number of parameters in block 1.
                int N2 = 0,   // Number of parameters in block 2.
                int N3 = 0,   // Number of parameters in block 3.
                int N4 = 0,   // Number of parameters in block 4.
                int N5 = 0,   // Number of parameters in block 5.
                int N6 = 0,   // Number of parameters in block 6.
                int N7 = 0,   // Number of parameters in block 7.
                int N8 = 0,   // Number of parameters in block 8.
                int N9 = 0>   // Number of parameters in block 9.
      class NumericDiffCostFunction : public
      SizedCostFunction<kNumResiduals, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
      };

   To get a numerically differentiated :class:`CostFunction`, you must
   define a class with an ``operator()`` (a functor) that computes the
   residuals. The functor must write the computed value in the last
   argument (the only non-``const`` one) and return ``true`` to
   indicate success.  Please see :class:`CostFunction` for details on
   how the return value may be used to impose simple constraints on
   the parameter block, e.g., an object of the form

   .. code-block:: c++

     struct ScalarFunctor {
      public:
       bool operator()(const double* const x1,
                       const double* const x2,
                       double* residuals) const;
     };

   For example, consider a scalar error :math:`e = k - x^\top y`,
   where both :math:`x` and :math:`y` are two-dimensional column
   vector parameters and :math:`k` is a constant. The form of this
   error, which is the difference between a constant and an
   expression, is a common pattern in least squares problems. For
   example, the value :math:`x^\top y` might be the model expectation
   for a series of measurements, where there is an instance of the
   cost function for each measurement :math:`k`.

   To write a numerically differentiated :class:`CostFunction` for the
   above model, first define the object

   .. code-block:: c++

     class MyScalarCostFunctor {
      public:
       MyScalarCostFunctor(double k): k_(k) {}

       bool operator()(const double* const x,
                       const double* const y,
                       double* residuals) const {
         residuals[0] = k_ - x[0] * y[0] - x[1] * y[1];
         return true;
       }

      private:
       double k_;
     };

   Note that in the declaration of ``operator()`` the input parameters
   ``x`` and ``y`` come first, and are passed as const pointers to
   arrays of ``double`` s. If there were three input parameters, then
   the third input parameter would come after ``y``. The output is
   always the last parameter, and is also a pointer to an array. In
   the example above, the residual is a scalar, so only
   ``residuals[0]`` is set.

   Then given this class definition, the numerically differentiated
   :class:`CostFunction` with central differences used for computing
   the derivative can be constructed as follows.

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
             new MyScalarCostFunctor(1.0));                    ^     ^  ^  ^
                                                               |     |  |  |
                                   Finite Differencing Scheme -+     |  |  |
                                   Dimension of residual ------------+  |  |
                                   Dimension of x ----------------------+  |
                                   Dimension of y -------------------------+

   In this example, there is usually an instance for each measurement
   of ``k``.

   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.

   :class:`NumericDiffCostFunction` also supports cost functions with
   a runtime-determined number of residuals. For example:

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<
               CostFunctorWithDynamicNumResiduals,
               CENTRAL,   // Finite differencing scheme.
               DYNAMIC,   // Indicate dynamic number of residuals.
               2,         // Dimension of x.
               2>(        // Dimension of y.
             new CostFunctorWithDynamicNumResiduals(1.0),
             TAKE_OWNERSHIP,
             runtime_number_of_residuals);  // Actual number of residuals.

   The framework can currently accommodate cost functions of up to 10
   independent variables, and there is no limit on the dimensionality
   of each of them.

   The ``CENTRAL`` difference method is considerably more accurate, at
   the cost of twice as many function evaluations as forward
   differences. Consider using central differences to begin with, and
   only after that works, try forward differences to improve
   performance.

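   For reference, with step size :math:`h`, the forward and central
   schemes approximate the derivative of each residual with respect
   to each parameter as

   .. math::

      f'(x) \approx \frac{f(x + h) - f(x)}{h} \quad \text{(forward)},
      \qquad
      f'(x) \approx \frac{f(x + h) - f(x - h)}{2h} \quad \text{(central)},

   which is where the factor of two in function evaluations comes
   from.
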
   **WARNING** A common beginner's error when first using
   :class:`NumericDiffCostFunction` is to get the sizing wrong. In
   particular, there is a tendency to set the template parameters to
   (dimension of residual, number of parameters) instead of passing a
   dimension parameter for *every parameter block*. In the example
   above, that would be ``<MyScalarCostFunctor, 1, 2>``, which is
   missing the last ``2`` argument. Please be careful when setting
   the size parameters.


   **Alternate Interface**

   For a variety of reasons, including compatibility with legacy code,
   :class:`NumericDiffCostFunction` can also take
   :class:`CostFunction` objects as input. The following describes
   how.

   To get a numerically differentiated cost function, define a
   subclass of :class:`CostFunction` such that the
   :func:`CostFunction::Evaluate` function ignores the ``jacobians``
   parameter. The numeric differentiation wrapper will fill in the
   ``jacobians`` parameter if necessary by repeatedly calling
   :func:`CostFunction::Evaluate` with small changes to the
   appropriate parameters, and computing the slope. For performance,
   the numeric differentiation wrapper class is templated on the
   concrete cost function, even though it could be implemented only in
   terms of the :class:`CostFunction` interface.

   The numerically differentiated version of such a cost function can
   be constructed as follows:

   .. code-block:: c++

     CostFunction* cost_function
         = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
             new MyCostFunction(...), TAKE_OWNERSHIP);

   where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
   sizes 4 and 8 respectively. Look at the tests for a more detailed
   example.

:class:`DynamicNumericDiffCostFunction`
---------------------------------------

.. class:: DynamicNumericDiffCostFunction

   Like :class:`AutoDiffCostFunction`, :class:`NumericDiffCostFunction`
   requires that the number of parameter blocks and their sizes be
   known at compile time. It also has an upper limit of 10 parameter
   blocks. In a number of applications, this is not enough.

     .. code-block:: c++

      template <typename CostFunctor, NumericDiffMethod method = CENTRAL>
      class DynamicNumericDiffCostFunction : public CostFunction {
      };

   In such cases, when numeric differentiation is desired,
   :class:`DynamicNumericDiffCostFunction` can be used.

   Like :class:`NumericDiffCostFunction` the user must define a
   functor, but the signature of the functor differs slightly. The
   expected interface for the cost functors is:

     .. code-block:: c++

       struct MyCostFunctor {
         bool operator()(double const* const* parameters, double* residuals) const {
           // Compute residuals from the variable number of parameter blocks.
         }
       };

   Since the sizing of the parameters is done at runtime, you must
   also specify the sizes after creating the dynamic numeric diff cost
   function. For example:

     .. code-block:: c++

       DynamicNumericDiffCostFunction<MyCostFunctor> cost_function(
           new MyCostFunctor());
       cost_function.AddParameterBlock(5);
       cost_function.AddParameterBlock(10);
       cost_function.SetNumResiduals(21);

   As a rule of thumb, try using :class:`NumericDiffCostFunction` before
   you use :class:`DynamicNumericDiffCostFunction`.


:class:`NumericDiffFunctor`
---------------------------

.. class:: NumericDiffFunctor

   Sometimes parts of a cost function can be differentiated
   automatically or analytically but others require numeric
   differentiation. :class:`NumericDiffFunctor` is a wrapper class
   that takes a variadic functor evaluating a function, numerically
   differentiates it and makes it available as a templated functor so
   that it can be easily used as part of Ceres' automatic
   differentiation framework.

   For example, let us assume that

   .. code-block:: c++

     struct IntrinsicProjection {
       IntrinsicProjection(const double* observations);
       bool operator()(const double* calibration,
                       const double* point,
                       double* residuals);
     };

   is a functor that implements the projection of a point in its local
   coordinate system onto its image plane and subtracts it from the
   observed point projection.

   Now we would like to compose the action of this functor with the
   action of camera extrinsics, i.e., rotation and translation, which
   is given by the following templated function

   .. code-block:: c++

     template<typename T>
     void RotateAndTranslatePoint(const T* rotation,
                                  const T* translation,
                                  const T* point,
                                  T* result);

   To compose the extrinsics and intrinsics, we can construct a
   ``CameraProjection`` functor as follows.

   .. code-block:: c++

    struct CameraProjection {
      typedef NumericDiffFunctor<IntrinsicProjection, CENTRAL, 2, 5, 3>
          IntrinsicProjectionFunctor;

      CameraProjection(double* observation) {
        intrinsic_projection_.reset(
            new IntrinsicProjectionFunctor(observation));
      }

      template <typename T>
      bool operator()(const T* rotation,
                      const T* translation,
                      const T* intrinsics,
                      const T* point,
                      T* residuals) const {
        T transformed_point[3];
        RotateAndTranslatePoint(rotation, translation, point, transformed_point);
        return (*intrinsic_projection_)(intrinsics, transformed_point, residuals);
      }

     private:
      scoped_ptr<IntrinsicProjectionFunctor> intrinsic_projection_;
    };

   Here, we made the choice of using ``CENTRAL`` differences to compute
   the Jacobian of ``IntrinsicProjection``.

   Now, we are ready to construct an automatically differentiated cost
   function as

   .. code-block:: c++

    CostFunction* cost_function =
        new AutoDiffCostFunction<CameraProjection, 2, 3, 3, 5, 3>(
           new CameraProjection(observations));

   ``cost_function`` now seamlessly integrates automatic
   differentiation of ``RotateAndTranslatePoint`` with a numerically
   differentiated version of ``IntrinsicProjection``.


:class:`CostFunctionToFunctor`
------------------------------

.. class:: CostFunctionToFunctor

   Just like :class:`NumericDiffFunctor` allows numeric
   differentiation to be mixed with automatic differentiation,
   :class:`CostFunctionToFunctor` provides an even more general
   mechanism.  :class:`CostFunctionToFunctor` is an adapter class that
   allows users to use :class:`CostFunction` objects in templated
   functors which are to be used for automatic differentiation.  This
   allows the user to seamlessly mix analytic, numeric and automatic
   differentiation.

   For example, let us assume that

   .. code-block:: c++

     class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
       public:
         IntrinsicProjection(const double* observations);
         virtual bool Evaluate(double const* const* parameters,
                               double* residuals,
                               double** jacobians) const;
     };

   is a :class:`CostFunction` that implements the projection of a
   point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute its
   residuals and, via either analytic or numeric differentiation, its
   Jacobians.

   Now we would like to compose the action of this
   :class:`CostFunction` with the action of camera extrinsics, i.e.,
   rotation and translation. Say we have a templated function

   .. code-block:: c++

      template<typename T>
      void RotateAndTranslatePoint(const T* rotation,
                                   const T* translation,
                                   const T* point,
                                   T* result);


   Then we can do the following:

   .. code-block:: c++

    struct CameraProjection {
      CameraProjection(double* observation) {
        intrinsic_projection_.reset(
            new CostFunctionToFunctor<2, 5, 3>(new IntrinsicProjection(observation)));
      }

      template <typename T>
      bool operator()(const T* rotation,
                      const T* translation,
                      const T* intrinsics,
                      const T* point,
                      T* residual) const {
        T transformed_point[3];
        RotateAndTranslatePoint(rotation, translation, point, transformed_point);

        // Note that we call intrinsic_projection_, just like it was
        // any other templated functor.
        return (*intrinsic_projection_)(intrinsics, transformed_point, residual);
      }

     private:
      scoped_ptr<CostFunctionToFunctor<2, 5, 3> > intrinsic_projection_;
    };

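   A cost function using this ``CameraProjection`` can then be
   constructed with automatic differentiation just as before. A
   sketch, where the sizes ``2, 3, 3, 5, 3`` are the residual,
   rotation, translation, intrinsics and point dimensions
   respectively:

   .. code-block:: c++

    CostFunction* cost_function =
        new AutoDiffCostFunction<CameraProjection, 2, 3, 3, 5, 3>(
            new CameraProjection(observation));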


:class:`ConditionedCostFunction`
--------------------------------

.. class:: ConditionedCostFunction

   This class allows you to apply different conditioning to the residual
   values of a wrapped cost function. An example where this is useful is
   where you have an existing cost function that produces N values, but you
   want the total cost to be something other than just the sum of these
   squared values - maybe you want to apply a different scaling to some
   values, to change their contribution to the cost.

   Usage:

   .. code-block:: c++

       //  my_cost_function produces N residuals
       CostFunction* my_cost_function = ...
       CHECK_EQ(N, my_cost_function->num_residuals());
       vector<CostFunction*> conditioners;

       //  Make N 1x1 cost functions (1 parameter, 1 residual)
       CostFunction* f_1 = ...
       conditioners.push_back(f_1);

       CostFunction* f_N = ...
       conditioners.push_back(f_N);
       ConditionedCostFunction* ccf =
         new ConditionedCostFunction(my_cost_function, conditioners);


   Now ``ccf`` 's ``residual[i]`` (i=0..N-1) will be passed through the
   :math:`i^{\text{th}}` conditioner.

   .. code-block:: c++

      ccf_residual[i] = f_i(my_cost_function_residual[i])

   and the Jacobian will be affected appropriately.
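
   For example, a 1x1 conditioner implementing an affine rescaling
   :math:`f_i(r) = a_i r + b_i` could be written with
   :class:`AutoDiffCostFunction` (a sketch; ``AffineConditioner`` is a
   hypothetical helper, not part of Ceres):

   .. code-block:: c++

      struct AffineConditioner {
        AffineConditioner(double a, double b) : a_(a), b_(b) {}
        template <typename T>
        bool operator()(const T* const r, T* out) const {
          out[0] = T(a_) * r[0] + T(b_);
          return true;
        }
        double a_, b_;
      };

      // One of the N 1x1 conditioners.
      CostFunction* f_1 = new AutoDiffCostFunction<AffineConditioner, 1, 1>(
          new AffineConditioner(1.5, 0.0));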


:class:`NormalPrior`
--------------------

.. class:: NormalPrior

   .. code-block:: c++

     class NormalPrior: public CostFunction {
      public:
       // Check that the number of rows in the vector b is the same as
       // the number of columns in the matrix A; crash otherwise.
       NormalPrior(const Matrix& A, const Vector& b);

       virtual bool Evaluate(double const* const* parameters,
                             double* residuals,
                             double** jacobians) const;
     };

   Implements a cost function of the form

   .. math::  \text{cost}(x) = \left\|A(x - b)\right\|^2

   where the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form

   .. math::  \text{cost}(x) = (x - \mu)^\top S^{-1} (x - \mu)

   where :math:`\mu` is a vector and :math:`S` is a covariance matrix,
   then :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is the square
   root of the inverse of the covariance, also known as the stiffness
   matrix. There are however no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.
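
   For example, to place a Gaussian prior with mean ``mu`` and
   stiffness matrix ``stiffness`` on a parameter block ``x`` (a
   sketch; ``mu``, ``stiffness`` and ``problem`` are assumed to have
   been constructed elsewhere):

   .. code-block:: c++

     // residual = stiffness * (x - mu)
     problem.AddResidualBlock(new NormalPrior(stiffness, mu), NULL, x);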



.. _`section-loss_function`:

:class:`LossFunction`
---------------------

.. class:: LossFunction

   For least squares problems where the minimization may encounter
   input terms that contain outliers, that is, completely bogus
   measurements, it is important to use a loss function that reduces
   their influence.

   Consider a structure from motion problem. The unknowns are 3D
   points and camera parameters, and the measurements are image
   coordinates describing the expected reprojected position for a
   point in a camera. For example, we want to model the geometry of a
   street scene with fire hydrants and cars, observed by a moving
   camera with unknown parameters, and the only 3D points we care
   about are the pointy tippy-tops of the fire hydrants. Our magic
   image processing algorithm, which is responsible for producing the
   measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we did
   nothing special, the residual for the erroneous measurement would
   pull the entire solution away from the optimum in order to reduce
   the large error that would otherwise be attributed to the wrong
   measurement.

   Using a robust loss function, the cost for large residuals is
   reduced. In the example above, this leads to outlier terms getting
   down-weighted so they do not overly influence the final solution.

   .. code-block:: c++

    class LossFunction {
     public:
      virtual void Evaluate(double s, double out[3]) const = 0;
    };


   The key method is :func:`LossFunction::Evaluate`, which given a
   non-negative scalar ``s``, computes

   .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}

   Here the convention is that the contribution of a term to the cost
   function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s =
   \|f_i\|^2`. Calling the method with a negative value of :math:`s`
   is an error and the implementations are not required to handle that
   case.

   Most sane choices of :math:`\rho` satisfy:

   .. math::

      \rho(0) &= 0\\
      \rho'(0) &= 1\\
      \rho'(s) &< 1 \text{ in the outlier region}\\
      \rho''(s) &< 0 \text{ in the outlier region}

   so that they mimic the squared cost for small residuals.

   **Scaling**

   Given one robustifier :math:`\rho(s)`, one can change the length
   scale at which robustification takes place by adding a scale
   factor :math:`a > 0`, which gives us :math:`\rho(s, a) = a^2
   \rho(s / a^2)`, with first and second derivatives :math:`\rho'(s /
   a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.

   The reason for the appearance of squaring is that :math:`a` is in
   the units of the residual vector norm whereas :math:`s` is a squared
   norm. For applications it is more convenient to specify :math:`a` than
   its square.
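
   In code, the scale :math:`a` is the constructor argument of the
   predefined loss functions listed below. For example, assuming
   ``problem``, ``cost_function`` and ``x`` as in the earlier
   examples:

   .. code-block:: c++

     // Residual blocks whose residual norm exceeds roughly 0.5 get
     // progressively down-weighted.
     problem.AddResidualBlock(cost_function, new CauchyLoss(0.5), x);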

Instances
^^^^^^^^^

Ceres includes a number of predefined loss functions. For simplicity
we describe their unscaled versions. The figure below illustrates
their shape graphically. More details can be found in
``include/ceres/loss_function.h``.

.. figure:: loss.png
   :figwidth: 500px
   :height: 400px
   :align: center

   Shape of the various common loss functions.

.. class:: TrivialLoss

      .. math:: \rho(s) = s

.. class:: HuberLoss

   .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}

.. class:: SoftLOneLoss

   .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)

.. class:: CauchyLoss

   .. math:: \rho(s) = \log(1 + s)

.. class:: ArctanLoss

   .. math:: \rho(s) = \arctan(s)

.. class:: TolerantLoss

   .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})

.. class:: ComposedLoss

   Given two loss functions ``f`` and ``g``, implements the loss
   function ``h(s) = f(g(s))``.

   .. code-block:: c++

      class ComposedLoss : public LossFunction {
       public:
        explicit ComposedLoss(const LossFunction* f,
                              Ownership ownership_f,
                              const LossFunction* g,
                              Ownership ownership_g);
      };

.. class:: ScaledLoss

   Sometimes you want to simply scale the output value of the
   robustifier. For example, you might want to weight different error
   terms differently (e.g., weight pixel reprojection errors
   differently from terrain errors).

   Given a loss function :math:`\rho(s)` and a scalar :math:`a`, :class:`ScaledLoss`
   implements the function :math:`a \rho(s)`.

   Since we treat a ``NULL`` loss function as the identity loss
   function, :math:`\rho` = ``NULL`` is a valid input and will result
   in the input being scaled by :math:`a`. This provides a simple way
   of implementing a scaled residual block.

.. class:: LossFunctionWrapper

   Sometimes after the optimization problem has been constructed, we
   wish to mutate the scale of the loss function. For example, when
   performing estimation from data which has substantial outliers,
   convergence can be improved by starting out with a large scale,
   optimizing the problem and then reducing the scale. This can have
   better convergence behavior than just using a loss function with a
   small scale.

   This templated class allows the user to implement a loss function
   whose scale can be mutated after an optimization problem has been
   constructed, e.g.,

   .. code-block:: c++

     Problem problem;

     // Add parameter blocks

     CostFunction* cost_function =
         new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(
             new UW_Camera_Mapper(feature_x, feature_y));

     LossFunctionWrapper* loss_function =
         new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
     problem.AddResidualBlock(cost_function, loss_function, parameters);

     Solver::Options options;
     Solver::Summary summary;
     Solve(options, &problem, &summary);

     loss_function->Reset(new HuberLoss(1.0), TAKE_OWNERSHIP);
     Solve(options, &problem, &summary);


Theory
^^^^^^

Let us consider a problem with a single residual block and a single
parameter block.

.. math::

 \min_x \frac{1}{2}\rho(f^2(x))


Then, the robustified gradient and the Gauss-Newton Hessian are

.. math::

        g(x) &= \rho'J^\top(x)f(x)\\
        H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)

where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the resulting linear least squares problem
yields the robustified Gauss-Newton step.


Let :math:`\alpha` be a root of

.. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.
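
Solving this quadratic and taking the root that is at most one gives

.. math:: \alpha = 1 - \sqrt{1 + 2 \frac{\rho''}{\rho'}\|f(x)\|^2}.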


Then, define the rescaled residual and Jacobian as

.. math::

        \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
        \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
                        \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)


In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
we limit :math:`\alpha \le 1- \epsilon` for some small
:math:`\epsilon`. For more details see [Triggs]_.

With this simple rescaling, one can apply any Jacobian based
non-linear least squares algorithm to robustified non-linear least
squares problems.


:class:`LocalParameterization`
------------------------------

.. class:: LocalParameterization

   .. code-block:: c++

     class LocalParameterization {
      public:
       virtual ~LocalParameterization() {}
       virtual bool Plus(const double* x,
                         const double* delta,
                         double* x_plus_delta) const = 0;
       virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
       virtual int GlobalSize() const = 0;
       virtual int LocalSize() const = 0;
     };

   Sometimes the parameters :math:`x` can overparameterize a
   problem. In that case it is desirable to choose a parameterization
   to remove the null directions of the cost. More generally, if
   :math:`x` lies on a manifold of a smaller dimension than the
   ambient space that it is embedded in, then it is numerically and
   computationally more effective to optimize it using a
   parameterization that lives in the tangent space of that manifold
   at each point.

   For example, a sphere in three dimensions is a two dimensional
   manifold, embedded in a three dimensional space. At each point on
   the sphere, the plane tangent to it defines a two dimensional
   tangent space. For a cost function defined on this sphere, given a
   point :math:`x`, moving in the direction normal to the sphere at
   that point is not useful. Thus a better way to parameterize a point
   on a sphere is to optimize over a two dimensional vector
   :math:`\Delta x` in the tangent space at that point, and then
   "move" to the point :math:`x + \Delta x`, where the move operation
   involves projecting back onto the sphere. Doing so removes a
   redundant dimension from the optimization, making it numerically
   more robust and efficient.

   More generally we can define a function

   .. math:: x' = \boxplus(x, \Delta x),

   where :math:`x'` has the same size as :math:`x`, and :math:`\Delta
   x` is of size less than or equal to :math:`x`. The function
   :math:`\boxplus` generalizes the definition of vector
   addition. Thus it satisfies the identity

   .. math:: \boxplus(x, 0) = x,\quad \forall x.

   Instances of :class:`LocalParameterization` implement the
   :math:`\boxplus` operation and its derivative with respect to
   :math:`\Delta x` at :math:`\Delta x = 0`.


.. function:: int LocalParameterization::GlobalSize()

   The dimension of the ambient space in which the parameter block
   :math:`x` lives.

.. function:: int LocalParameterization::LocalSize()

   The size of the tangent space that :math:`\Delta x` lives in.

.. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const

    :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta x)`.

.. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const

   Computes the Jacobian matrix

   .. math:: J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}

   in row major form.

Instances
^^^^^^^^^

.. class:: IdentityParameterization

   A trivial version of :math:`\boxplus` is when :math:`\Delta x` is
   of the same size as :math:`x` and

   .. math::  \boxplus(x, \Delta x) = x + \Delta x

.. class:: SubsetParameterization

   A more interesting case is when :math:`x` is a two dimensional
   vector, and the user wishes to hold the first coordinate constant.
   Then, :math:`\Delta x` is a scalar and :math:`\boxplus` is defined
   as

   .. math::

      \boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
                                  \end{array} \right] \Delta x

   :class:`SubsetParameterization` generalizes this construction to
   hold any part of a parameter block constant.
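
   For example, the two dimensional case above can be constructed as
   follows (a sketch of the intended usage):

   .. code-block:: c++

      // Hold coordinate 0 of a 2-dimensional parameter block constant.
      std::vector<int> constant_parameters;
      constant_parameters.push_back(0);
      LocalParameterization* subset_parameterization =
          new SubsetParameterization(2, constant_parameters);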

.. class:: QuaternionParameterization

   Another example that occurs commonly in Structure from Motion
   problems is when camera rotations are parameterized using a
   quaternion. There, it is useful only to make updates orthogonal to
   that 4-vector defining the quaternion. One way to do this is to let
   :math:`\Delta x` be a 3 dimensional vector and define
   :math:`\boxplus` to be

    .. math:: \boxplus(x, \Delta x) = \left[ \cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x \right] * x
      :label: quaternion

   The multiplication between the two 4-vectors on the right hand side
   is the standard quaternion
   product. :class:`QuaternionParameterization` is an implementation
   of :eq:`quaternion`.
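
   A local parameterization is attached to a parameter block when the
   block is added to the problem; a sketch, assuming ``problem`` and a
   4-dimensional quaternion array ``camera_rotation`` exist:

   .. code-block:: c++

      problem.AddParameterBlock(camera_rotation, 4,
                                new QuaternionParameterization);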



:class:`AutoDiffLocalParameterization`
--------------------------------------

.. class:: AutoDiffLocalParameterization

  :class:`AutoDiffLocalParameterization` does for
  :class:`LocalParameterization` what :class:`AutoDiffCostFunction`
  does for :class:`CostFunction`. It allows the user to define a
  templated functor that implements the
  :func:`LocalParameterization::Plus` operation and it uses automatic
  differentiation to implement the computation of the Jacobian.

  To get an auto differentiated local parameterization, you must
  define a class with a templated operator() (a functor) that computes

     .. math:: x' = \boxplus(x, \Delta x),

  For example, quaternions have a three dimensional local
  parameterization. Its plus operation can be implemented as (taken
  from `internal/ceres/autodiff_local_parameterization_test.cc
  <https://ceres-solver.googlesource.com/ceres-solver/+/master/internal/ceres/autodiff_local_parameterization_test.cc>`_
  )

    .. code-block:: c++

      struct QuaternionPlus {
        template<typename T>
        bool operator()(const T* x, const T* delta, T* x_plus_delta) const {
          const T squared_norm_delta =
              delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2];

          T q_delta[4];
          if (squared_norm_delta > T(0.0)) {
            T norm_delta = sqrt(squared_norm_delta);
            const T sin_delta_by_delta = sin(norm_delta) / norm_delta;
            q_delta[0] = cos(norm_delta);
            q_delta[1] = sin_delta_by_delta * delta[0];
            q_delta[2] = sin_delta_by_delta * delta[1];
            q_delta[3] = sin_delta_by_delta * delta[2];
          } else {
            // We do not just use q_delta = [1,0,0,0] here because that is a
            // constant and when used for automatic differentiation will
            // lead to a zero derivative. Instead we take a first order
            // approximation and evaluate it at zero.
            q_delta[0] = T(1.0);
            q_delta[1] = delta[0];
            q_delta[2] = delta[1];
            q_delta[3] = delta[2];
          }

          QuaternionProduct(q_delta, x, x_plus_delta);
          return true;
        }
      };

  Then given this struct, the auto differentiated local
  parameterization can now be constructed as

  .. code-block:: c++

     LocalParameterization* local_parameterization =
         new AutoDiffLocalParameterization<QuaternionPlus, 4, 3>;
                                                           |  |
                                Global Size ---------------+  |
                                Local Size -------------------+

  **WARNING:** Since the functor will get instantiated with different
  types for ``T``, you must convert from other numeric types to
  ``T`` before mixing computations with other variables of type
  ``T``. In the example above, this is seen where the constants
  ``0.0`` and ``1.0`` are wrapped as ``T(0.0)`` and ``T(1.0)``.

   1230 
   1231 
:class:`Problem`
----------------

.. class:: Problem

   :class:`Problem` holds the robustified bounds constrained
   non-linear least squares problem :eq:`ceresproblem`. To create a
   least squares problem, use the :func:`Problem::AddResidualBlock`
   and :func:`Problem::AddParameterBlock` methods.

   For example, a problem containing 3 parameter blocks of sizes 3, 4
   and 5 respectively and two residual blocks of size 2 and 6:

   .. code-block:: c++

     double x1[] = { 1.0, 2.0, 3.0 };
     double x2[] = { 1.0, 2.0, 3.0, 5.0 };
     double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };

     Problem problem;
     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x3);

   :func:`Problem::AddResidualBlock`, as the name implies, adds a
   residual block to the problem. It adds a :class:`CostFunction`, an
   optional :class:`LossFunction` and connects the
   :class:`CostFunction` to a set of parameter blocks.

   The cost function carries with it information about the sizes of
   the parameter blocks it expects. The function checks that these
   match the sizes of the parameter blocks listed in
   ``parameter_blocks``. The program aborts if a mismatch is
   detected. ``loss_function`` can be ``NULL``, in which case the cost
   of the term is just the squared norm of the residuals.

   The user has the option of explicitly adding the parameter blocks
   using :func:`Problem::AddParameterBlock`. This causes additional
   correctness checking; however, :func:`Problem::AddResidualBlock`
   implicitly adds the parameter blocks if they are not present, so
   calling :func:`Problem::AddParameterBlock` explicitly is not
   required.

   :func:`Problem::AddParameterBlock` explicitly adds a parameter
   block to the :class:`Problem`. Optionally it allows the user to
   associate a :class:`LocalParameterization` object with the
   parameter block too. Repeated calls with the same arguments are
   ignored. Repeated calls with the same double pointer but a
   different size result in undefined behavior.

   You can set any parameter block to be constant using
   :func:`Problem::SetParameterBlockConstant` and undo this using
   :func:`Problem::SetParameterBlockVariable`.

   In fact you can set any number of parameter blocks to be constant,
   and Ceres is smart enough to figure out what part of the problem
   you have constructed depends on the parameter blocks that are free
   to change and only spends time solving it. For example, if you
   constructed a problem with a million parameter blocks and 2 million
   residual blocks, but then set all but one parameter block to be
   constant, and say only 10 residual blocks depend on this one
   non-constant parameter block, then the computational effort Ceres
   spends in solving this problem will be the same as if you had
   defined a problem with one parameter block and 10 residual blocks.

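   As a minimal sketch, reusing the ``x2`` block from the example
   above:

   .. code-block:: c++

     // Freeze x2; the solver only optimizes over the remaining free
     // blocks, and residual blocks that depend solely on x2 are
     // skipped entirely.
     problem.SetParameterBlockConstant(x2);

     // ... solve ...

     // Make x2 free to vary again.
     problem.SetParameterBlockVariable(x2);
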
   **Ownership**

   :class:`Problem` by default takes ownership of the
   ``cost_function``, ``loss_function`` and ``local_parameterization``
   pointers. These objects remain live for the life of the
   :class:`Problem`. If the user wishes to keep control over the
   destruction of these objects, then they can do this by setting the
   corresponding enums in the :class:`Problem::Options` struct.

   Note that even though the Problem takes ownership of ``cost_function``
   and ``loss_function``, it does not preclude the user from re-using
   them in another residual block. The destructor takes care to call
   delete on each ``cost_function`` or ``loss_function`` pointer only
   once, regardless of how many residual blocks refer to them.

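   For instance, the following sketch keeps ownership of the cost
   functions on the caller's side (assuming ``my_cost_function`` is a
   caller-owned :class:`CostFunction` instance and ``x1`` its
   parameter block); the caller is then responsible for deleting them
   after the :class:`Problem` is destroyed:

   .. code-block:: c++

     Problem::Options options;
     // Tell the problem not to delete the cost functions on
     // destruction.
     options.cost_function_ownership = DO_NOT_TAKE_OWNERSHIP;

     Problem problem(options);
     problem.AddResidualBlock(&my_cost_function, NULL, x1);
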
.. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*>& parameter_blocks)

   Add a residual block to the overall cost function. The cost
   function carries with it information about the sizes of the
   parameter blocks it expects. The function checks that these match
   the sizes of the parameter blocks listed in ``parameter_blocks``.
   The program aborts if a mismatch is detected. ``loss_function`` can
   be ``NULL``, in which case the cost of the term is just the squared
   norm of the residuals.

   The user has the option of explicitly adding the parameter blocks
   using :func:`Problem::AddParameterBlock`. This causes additional
   correctness checking; however, :func:`Problem::AddResidualBlock`
   implicitly adds the parameter blocks if they are not present, so
   calling :func:`Problem::AddParameterBlock` explicitly is not
   required.

   The :class:`Problem` object by default takes ownership of the
   ``cost_function`` and ``loss_function`` pointers. These objects
   remain live for the life of the :class:`Problem` object. If the
   user wishes to keep control over the destruction of these objects,
   then they can do this by setting the corresponding enums in the
   :class:`Problem::Options` struct.

   Note: Even though the Problem takes ownership of ``cost_function``
   and ``loss_function``, it does not preclude the user from re-using
   them in another residual block. The destructor takes care to call
   delete on each ``cost_function`` or ``loss_function`` pointer only
   once, regardless of how many residual blocks refer to them.

   Example usage:

   .. code-block:: c++

      double x1[] = {1.0, 2.0, 3.0};
      double x2[] = {1.0, 2.0, 5.0, 6.0};
      double x3[] = {3.0, 6.0, 2.0, 5.0, 1.0};

      Problem problem;

      problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
      problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x1);

.. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization)

   Add a parameter block with appropriate size and parameterization to
   the problem. Repeated calls with the same arguments are ignored.
   Repeated calls with the same double pointer but a different size
   result in undefined behavior.

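   For example, a quaternion block can be added together with the
   built-in :class:`QuaternionParameterization` (a sketch, assuming
   ``q`` holds a unit quaternion):

   .. code-block:: c++

      double q[4] = {1.0, 0.0, 0.0, 0.0};
      // The block has 4 parameters but only 3 local degrees of freedom.
      problem.AddParameterBlock(q, 4, new QuaternionParameterization);
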
.. function:: void Problem::AddParameterBlock(double* values, int size)

   Add a parameter block with appropriate size to the problem.
   Repeated calls with the same arguments are ignored. Repeated calls
   with the same double pointer but a different size result in
   undefined behavior.

.. function:: void Problem::RemoveResidualBlock(ResidualBlockId residual_block)

   Remove a residual block from the problem. Any parameters that the
   residual block depends on are not removed. The cost and loss
   functions for the residual block will not get deleted immediately;
   this will not happen until the problem itself is deleted. If
   ``Problem::Options::enable_fast_removal`` is true, then the removal
   is fast (almost constant time). Otherwise, removing a residual
   block will incur a scan of the entire :class:`Problem` object to
   verify that the ``residual_block`` represents a valid residual in
   the problem.

   **WARNING:** Removing a residual or parameter block will destroy
   the implicit ordering, rendering the jacobian or residuals returned
   from the solver uninterpretable. If you depend on the evaluated
   jacobian, do not use remove! This may change in a future release.

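   A short sketch of removing a residual block by its id (reusing
   ``MyUnaryCostFunction`` and ``x1`` from the examples above):

   .. code-block:: c++

      // AddResidualBlock returns an id that can be used for removal
      // later.
      ResidualBlockId id =
          problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);

      // ... later ...
      problem.RemoveResidualBlock(id);
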
.. function:: void Problem::RemoveParameterBlock(double* values)

   Remove a parameter block from the problem. The parameterization of
   the parameter block, if it exists, will persist until the deletion
   of the problem (similar to cost/loss functions in residual block
   removal). Any residual blocks that depend on the parameter are also
   removed, as described above in :func:`Problem::RemoveResidualBlock`.
   If ``Problem::Options::enable_fast_removal`` is true, then
   the removal is fast (almost constant time). Otherwise, removing a
   parameter block will incur a scan of the entire Problem object.

   **WARNING:** Removing a residual or parameter block will destroy
   the implicit ordering, rendering the jacobian or residuals returned
   from the solver uninterpretable. If you depend on the evaluated
   jacobian, do not use remove! This may change in a future release.

.. function:: void Problem::SetParameterBlockConstant(double* values)

   Hold the indicated parameter block constant during optimization.

.. function:: void Problem::SetParameterBlockVariable(double* values)

   Allow the indicated parameter block to vary during optimization.

.. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization)

   Set the local parameterization for one of the parameter blocks.
   The ``local_parameterization`` is owned by the Problem by default.
   It is acceptable to set the same parameterization for multiple
   parameters; the destructor is careful to delete local
   parameterizations only once. The local parameterization can only be
   set once per parameter, and cannot be changed once set.

.. function:: LocalParameterization* Problem::GetParameterization(double* values) const

   Get the local parameterization object associated with this
   parameter block. If there is no parameterization object associated
   then ``NULL`` is returned.

.. function:: void Problem::SetParameterLowerBound(double* values, int index, double lower_bound)

   Set the lower bound for the parameter at position `index` in the
   parameter block corresponding to `values`. By default the lower
   bound is :math:`-\infty`.

.. function:: void Problem::SetParameterUpperBound(double* values, int index, double upper_bound)

   Set the upper bound for the parameter at position `index` in the
   parameter block corresponding to `values`. By default the upper
   bound is :math:`\infty`.

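   For example, to constrain the second entry of the 3-vector ``x1``
   to lie in :math:`[0, 1]` (a sketch, reusing ``x1`` from above):

   .. code-block:: c++

      // Bounds apply per scalar entry; index 1 is the second entry
      // of x1.
      problem.SetParameterLowerBound(x1, 1, 0.0);
      problem.SetParameterUpperBound(x1, 1, 1.0);
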
.. function:: int Problem::NumParameterBlocks() const

   Number of parameter blocks in the problem. Always equals
   ``parameter_blocks().size()`` and ``parameter_block_sizes().size()``.

.. function:: int Problem::NumParameters() const

   The size of the parameter vector obtained by summing over the sizes
   of all the parameter blocks.

.. function:: int Problem::NumResidualBlocks() const

   Number of residual blocks in the problem. Always equals
   ``residual_blocks().size()``.

.. function:: int Problem::NumResiduals() const

   The size of the residual vector obtained by summing over the sizes
   of all of the residual blocks.

.. function:: int Problem::ParameterBlockSize(const double* values) const

   The size of the parameter block.

.. function:: int Problem::ParameterBlockLocalSize(const double* values) const

   The size of the local parameterization for the parameter block. If
   there is no local parameterization associated with this parameter
   block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``.

.. function:: bool Problem::HasParameterBlock(const double* values) const

   Is the given parameter block present in the problem or not?

.. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const

   Fills the passed ``parameter_blocks`` vector with pointers to the
   parameter blocks currently in the problem. After this call,
   ``parameter_blocks.size() == NumParameterBlocks()``.

.. function:: void Problem::GetResidualBlocks(vector<ResidualBlockId>* residual_blocks) const

   Fills the passed ``residual_blocks`` vector with pointers to the
   residual blocks currently in the problem. After this call,
   ``residual_blocks.size() == NumResidualBlocks()``.

.. function:: void Problem::GetParameterBlocksForResidualBlock(const ResidualBlockId residual_block, vector<double*>* parameter_blocks) const

   Get all the parameter blocks that the given residual block depends
   on.

.. function:: void Problem::GetResidualBlocksForParameterBlock(const double* values, vector<ResidualBlockId>* residual_blocks) const

   Get all the residual blocks that depend on the given parameter
   block.

   If ``Problem::Options::enable_fast_removal`` is
   ``true``, then getting the residual blocks is fast and depends only
   on the number of residual blocks. Otherwise, getting the residual
   blocks for a parameter block will incur a scan of the entire
   :class:`Problem` object.

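   ``enable_fast_removal`` is set when the problem is constructed, as
   in this sketch:

   .. code-block:: c++

      Problem::Options options;
      // Maintain the extra bookkeeping needed for fast removal and
      // fast residual-block lookups, at a modest memory cost.
      options.enable_fast_removal = true;

      Problem problem(options);
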
.. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian)

   Evaluate a :class:`Problem`. Any of the output pointers can be
   ``NULL``. Which residual blocks and parameter blocks are used is
   controlled by the :class:`Problem::EvaluateOptions` struct below.

   .. code-block:: c++

     Problem problem;
     double x = 1;
     problem.AddResidualBlock(new MyCostFunction, NULL, &x);

     double cost = 0.0;
     problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

   The cost is evaluated at `x = 1`. If you wish to evaluate the
   problem at `x = 2`, then

   .. code-block:: c++

      x = 2;
      problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

   is the way to do so.

   **NOTE** If no local parameterizations are used, then the size of
   the gradient vector is the sum of the sizes of all the parameter
   blocks. If a parameter block has a local parameterization, then
   it contributes ``LocalSize`` entries to the gradient vector.

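   To also retrieve the residuals and the gradient, pass non-``NULL``
   output arguments (a sketch):

   .. code-block:: c++

      double cost = 0.0;
      vector<double> residuals;
      vector<double> gradient;
      // The jacobian output is left as NULL here; pass a CRSMatrix*
      // to also get the jacobian in compressed row sparse form.
      problem.Evaluate(Problem::EvaluateOptions(), &cost, &residuals,
                       &gradient, NULL);
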
.. class:: Problem::EvaluateOptions

   Options struct that is used to control :func:`Problem::Evaluate`.

.. member:: vector<double*> Problem::EvaluateOptions::parameter_blocks

   The set of parameter blocks for which evaluation should be
   performed. This vector determines the order in which parameter
   blocks occur in the gradient vector and in the columns of the
   jacobian matrix. If ``parameter_blocks`` is empty, then it is
   assumed to be equal to a vector containing ALL the parameter
   blocks. Generally speaking the ordering of the parameter blocks in
   this case depends on the order in which they were added to the
   problem and whether or not the user removed any parameter blocks.

   **NOTE** This vector should contain the same pointers as the ones
   used to add parameter blocks to the Problem. These parameter blocks
   should NOT point to new memory locations. Bad things will happen if
   you do.

.. member:: vector<ResidualBlockId> Problem::EvaluateOptions::residual_blocks

   The set of residual blocks for which evaluation should be
   performed. This vector determines the order in which the residuals
   occur, and how the rows of the jacobian are ordered. If
   ``residual_blocks`` is empty, then it is assumed to be equal to the
   vector containing all the residual blocks.

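   As a sketch, evaluating only the residual blocks that depend on a
   single parameter block (assuming ``x1`` is a parameter block in
   ``problem``):

   .. code-block:: c++

      Problem::EvaluateOptions eval_options;
      // Restrict evaluation to the residual blocks depending on x1.
      problem.GetResidualBlocksForParameterBlock(
          x1, &eval_options.residual_blocks);

      double cost = 0.0;
      problem.Evaluate(eval_options, &cost, NULL, NULL, NULL);
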
``rotation.h``
--------------

Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
quaternion and matrix) we provide a handy set of templated
functions. These functions are templated so that the user can use them
within Ceres Solver's automatic differentiation framework.

.. function:: void AngleAxisToQuaternion<T>(T const* angle_axis, T* quaternion)

   Convert a value in combined axis-angle representation to a
   quaternion.

   The value ``angle_axis`` is a triple whose norm is an angle in radians,
   and whose direction is aligned with the axis of rotation, and
   ``quaternion`` is a 4-tuple that will contain the resulting quaternion.

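   For example, a rotation of :math:`\pi/2` radians about the z-axis
   (a sketch):

   .. code-block:: c++

      // The norm of angle_axis is the rotation angle in radians.
      double angle_axis[3] = {0.0, 0.0, M_PI / 2.0};
      double quaternion[4];
      AngleAxisToQuaternion(angle_axis, quaternion);
      // quaternion is now approximately {cos(pi/4), 0, 0, sin(pi/4)}.
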
.. function:: void QuaternionToAngleAxis<T>(T const* quaternion, T* angle_axis)

   Convert a quaternion to the equivalent combined axis-angle
   representation.

   The value ``quaternion`` must be a unit quaternion - it is not
   normalized first, and ``angle_axis`` will be filled with a value
   whose norm is the angle of rotation in radians, and whose direction
   is the axis of rotation.

.. function:: void RotationMatrixToAngleAxis<T, row_stride, col_stride>(const MatrixAdapter<const T, row_stride, col_stride>& R, T* angle_axis)
.. function:: void AngleAxisToRotationMatrix<T, row_stride, col_stride>(T const* angle_axis, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void RotationMatrixToAngleAxis<T>(T const* R, T* angle_axis)
.. function:: void AngleAxisToRotationMatrix<T>(T const* angle_axis, T* R)

   Conversions between 3x3 rotation matrix with given column and row
   strides and axis-angle rotation representations. The functions that
   take a pointer to ``T`` instead of a :class:`MatrixAdapter` assume a
   column major representation with unit row stride and a column stride
   of 3.

.. function:: void EulerAnglesToRotationMatrix<T, row_stride, col_stride>(const T* euler, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void EulerAnglesToRotationMatrix<T>(const T* euler, int row_stride, T* R)

   Conversions between 3x3 rotation matrix with given column and row
   strides and Euler angle (in degrees) rotation representations.

   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
   axes, respectively. They are applied in that same order, so the
   total rotation R is Rz * Ry * Rx.

   The function that takes a pointer to T as the rotation matrix assumes a row
   major representation with unit column stride and a row stride of 3.
   The additional parameter ``row_stride`` is required to be 3.

.. function:: void QuaternionToScaledRotation<T, row_stride, col_stride>(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void QuaternionToScaledRotation<T>(const T q[4], T R[3 * 3])

   Convert a 4-vector to a 3x3 scaled rotation matrix.

   The choice of rotation is such that the quaternion
   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
   matrix and for small :math:`a, b, c` the quaternion
   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix

   .. math::

     I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a\\ -b & a & 0
           \end{bmatrix} + O(q^2)

   which corresponds to a Rodrigues approximation, the last matrix
   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
   c\end{bmatrix}`. Together with the property that :math:`R(q_1 * q_2)
   = R(q_1) * R(q_2)` this uniquely defines the mapping from :math:`q` to
   :math:`R`.

   In the function that accepts a pointer to T instead of a MatrixAdapter,
   the rotation matrix ``R`` is a row-major matrix with unit column stride
   and a row stride of 3.

   No normalization of the quaternion is performed, i.e.
   :math:`R = \|q\|^2  Q`, where :math:`Q` is an orthonormal matrix
   such that :math:`\det(Q) = 1` and :math:`Q Q' = I`.

.. function:: void QuaternionToRotation<T, row_stride, col_stride>(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void QuaternionToRotation<T>(const T q[4], T R[3 * 3])

   Same as above except that the rotation matrix is normalized by the
   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).

.. function:: void UnitQuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   Rotates a point ``pt`` by a quaternion ``q``:

   .. math:: \text{result} = R(q)  \text{pt}

   Assumes the quaternion is unit norm. If you pass in a quaternion
   with :math:`|q|^2 = 2` then you WILL NOT get back 2 times the
   result you get for a unit quaternion.

.. function:: void QuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   With this function you do not need to assume that ``q`` has unit
   norm. It does assume that the norm is non-zero.

.. function:: void QuaternionProduct<T>(const T z[4], const T w[4], T zw[4])

   .. math:: zw = z * w

   where :math:`*` is the Quaternion product between 4-vectors.

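   As a sketch of how these templated functions combine with automatic
   differentiation (assumptions: ``q_extrinsic_`` is a quaternion
   stored as a ``double[4]`` member of a cost functor, and ``q`` is a
   quaternion parameter block):

   .. code-block:: c++

      template <typename T>
      bool operator()(const T* q, T* residuals) const {
        // Convert the stored doubles to T before mixing them with
        // active variables of type T (see the WARNING in the
        // AutoDiffLocalParameterization section above).
        const T q_ext[4] = {T(q_extrinsic_[0]), T(q_extrinsic_[1]),
                            T(q_extrinsic_[2]), T(q_extrinsic_[3])};
        T q_total[4];
        QuaternionProduct(q_ext, q, q_total);
        // ... compute residuals from q_total ...
        return true;
      }
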
.. function:: void CrossProduct<T>(const T x[3], const T y[3], T x_cross_y[3])

   .. math:: \text{x\_cross\_y} = x \times y

.. function:: void AngleAxisRotatePoint<T>(const T angle_axis[3], const T pt[3], T result[3])

   .. math:: \text{result} = R(\text{angle\_axis})  \text{pt}
