      1 .. default-domain:: cpp
      2 
      3 .. cpp:namespace:: ceres
      4 
      5 .. _`chapter-modeling`:
      6 
      7 ========
      8 Modeling
      9 ========
     10 
     11 Recall that Ceres solves robustified non-linear least squares problems
     12 of the form
     13 
.. math:: \frac{1}{2}\sum_{i} \rho_i\left(\left\|f_i\left(x_{i_1}, ... ,x_{i_k}\right)\right\|^2\right).
     15    :label: ceresproblem
     16 
     17 The expression
     18 :math:`\rho_i\left(\left\|f_i\left(x_{i_1},...,x_{i_k}\right)\right\|^2\right)`
     19 is known as a ``ResidualBlock``, where :math:`f_i(\cdot)` is a
     20 :class:`CostFunction` that depends on the parameter blocks
     21 :math:`\left[x_{i_1},... , x_{i_k}\right]`. In most optimization
problems small groups of scalars occur together. For example, the
three components of a translation vector and the four components of
the quaternion that define the pose of a camera. We refer to such a group
     25 of small scalars as a ``ParameterBlock``. Of course a
     26 ``ParameterBlock`` can just be a single parameter. :math:`\rho_i` is a
     27 :class:`LossFunction`. A :class:`LossFunction` is a scalar function
     28 that is used to reduce the influence of outliers on the solution of
     29 non-linear least squares problems.
     30 
     31 In this chapter we will describe the various classes that are part of
     32 Ceres Solver's modeling API, and how they can be used to construct an
     33 optimization problem. Once a problem has been constructed, various
     34 methods for solving them will be discussed in
     35 :ref:`chapter-solving`. It is by design that the modeling and the
     36 solving APIs are orthogonal to each other. This enables
     37 switching/tweaking of various solver parameters without having to
     38 touch the problem once it has been successfully modeled.
     39 
     40 :class:`CostFunction`
     41 ---------------------
     42 
     43 The single biggest task when modeling a problem is specifying the
     44 residuals and their derivatives. This is done using
     45 :class:`CostFunction` objects.
     46 
     47 .. class:: CostFunction
     48 
     49    .. code-block:: c++
     50 
     51     class CostFunction {
     52      public:
      virtual bool Evaluate(double const* const* parameters,
                            double* residuals,
                            double** jacobians) const = 0;
      const vector<int16>& parameter_block_sizes() const;
     57       int num_residuals() const;
     58 
     59      protected:
     60       vector<int16>* mutable_parameter_block_sizes();
     61       void set_num_residuals(int num_residuals);
     62     };
     63 
     64    Given parameter blocks :math:`\left[x_{i_1}, ... , x_{i_k}\right]`,
     65    a :class:`CostFunction` is responsible for computing a vector of
     66    residuals and if asked a vector of Jacobian matrices, i.e., given
     67    :math:`\left[x_{i_1}, ... , x_{i_k}\right]`, compute the vector
     68    :math:`f_i\left(x_{i_1},...,x_{i_k}\right)` and the matrices
     69 
     70    .. math:: J_{ij} = \frac{\partial}{\partial x_{i_j}}f_i\left(x_{i_1},...,x_{i_k}\right),\quad \forall j \in \{i_1,..., i_k\}
     71 
     72    The signature of the :class:`CostFunction` (number and sizes of
     73    input parameter blocks and number of outputs) is stored in
     74    :member:`CostFunction::parameter_block_sizes_` and
     75    :member:`CostFunction::num_residuals_` respectively. User code
     76    inheriting from this class is expected to set these two members
     77    with the corresponding accessors. This information will be verified
     78    by the :class:`Problem` when added with
     79    :func:`Problem::AddResidualBlock`.
     80 
     81 .. function:: bool CostFunction::Evaluate(double const* const* parameters, double* residuals, double** jacobians)
     82 
     83    Compute the residual vector and the Jacobian matrices.
     84 
     85    ``parameters`` is an array of pointers to arrays containing the
     86    various parameter blocks. ``parameters`` has the same number of
     87    elements as :member:`CostFunction::parameter_block_sizes_` and the
     88    parameter blocks are in the same order as
     89    :member:`CostFunction::parameter_block_sizes_`.
     90 
     91    ``residuals`` is an array of size ``num_residuals_``.
     92 
   ``jacobians`` is an array with the same number of elements as
   :member:`CostFunction::parameter_block_sizes_`, containing pointers
   to storage for the Jacobian matrices corresponding to each parameter
   block. The Jacobian matrices are in the same order as
   :member:`CostFunction::parameter_block_sizes_`. ``jacobians[i]`` is
   an array that contains :member:`CostFunction::num_residuals_` x
   ``parameter_block_sizes_[i]`` elements. Each Jacobian matrix is
   stored in row-major order, i.e.,
   ``jacobians[i][r * parameter_block_sizes_[i] + c]`` =
   :math:`\frac{\partial residual[r]}{\partial parameters[i][c]}`
    103 
    104 
   If ``jacobians`` is ``NULL``, then no derivatives are returned;
   this is the case when computing cost only. If ``jacobians[i]`` is
   ``NULL``, then the Jacobian matrix corresponding to the
   :math:`i^{\textrm{th}}` parameter block must not be returned; this
   is the case when a parameter block is marked constant.
    110 
   **NOTE** The return value indicates whether the computation of the
   residuals and/or Jacobians was successful or not. This can be used
   to communicate numerical failures in Jacobian computations, for
   instance.
    116 
    117    A more interesting and common use is to impose constraints on the
    118    parameters. If the initial values of the parameter blocks satisfy
    119    the constraints, then returning false whenever the constraints are
    120    not satisfied will prevent the solver from moving into the
    121    infeasible region. This is not a very sophisticated mechanism for
    122    enforcing constraints, but is often good enough for things like
    123    non-negativity constraints.
    124 
   It is important that the initial values of the parameter blocks be
   feasible, otherwise the solver will declare a numerical problem at
   iteration 0.
    128 
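To make the ``Evaluate`` contract concrete, the following is a small standalone sketch in plain C++, not using Ceres itself (the function name and harness are illustrative, not part of the API). It evaluates the residual :math:`f(x, y) = k - x^\top y` for two 2-dimensional parameter blocks and fills the row-major Jacobian blocks exactly as described above, honoring the ``NULL`` conventions for ``jacobians`` and ``jacobians[i]``.

```cpp
// Standalone illustration of the Evaluate() contract: one residual,
// two parameter blocks x and y of size 2 each, and row-major Jacobian
// storage jacobians[i][r * block_size + c]. The residual is
// f(x, y) = k - x[0]*y[0] - x[1]*y[1], with k fixed to 1 here.
bool EvaluateDotProductResidual(double const* const* parameters,
                                double* residuals,
                                double** jacobians) {
  const double k = 1.0;
  const double* x = parameters[0];
  const double* y = parameters[1];
  residuals[0] = k - x[0] * y[0] - x[1] * y[1];

  if (jacobians == nullptr) return true;  // Cost-only evaluation.

  if (jacobians[0] != nullptr) {  // d residual / d x: a 1x2 row-major block.
    jacobians[0][0] = -y[0];
    jacobians[0][1] = -y[1];
  }
  if (jacobians[1] != nullptr) {  // d residual / d y: a 1x2 row-major block.
    jacobians[1][0] = -x[0];
    jacobians[1][1] = -x[1];
  }
  return true;
}
```

A real :class:`CostFunction` subclass would wrap this logic in its ``Evaluate`` method and declare the block sizes through the accessors described above.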
    129 
    130 :class:`SizedCostFunction`
    131 --------------------------
    132 
    133 .. class:: SizedCostFunction
    134 
   If the size of the parameter blocks and the size of the residual
   vector is known at compile time (this is the common case),
   :class:`SizedCostFunction` can be used where these values can be
   specified as template parameters and the user only needs to
   implement :func:`CostFunction::Evaluate`.
    140 
    141    .. code-block:: c++
    142 
    143     template<int kNumResiduals,
    144              int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
    145              int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
    146     class SizedCostFunction : public CostFunction {
    147      public:
    148       virtual bool Evaluate(double const* const* parameters,
    149                             double* residuals,
    150                             double** jacobians) const = 0;
    151     };
    152 
    153 
    154 :class:`AutoDiffCostFunction`
    155 -----------------------------
    156 
    157 .. class:: AutoDiffCostFunction
    158 
   Defining a :class:`CostFunction` or a :class:`SizedCostFunction`
   can be tedious and error prone, especially when computing
   derivatives. To this end Ceres provides `automatic differentiation
   <http://en.wikipedia.org/wiki/Automatic_differentiation>`_.
    163 
    164    .. code-block:: c++
    165 
    166      template <typename CostFunctor,
    167             int M,        // Number of residuals, or ceres::DYNAMIC.
    168             int N0,       // Number of parameters in block 0.
    169             int N1 = 0,   // Number of parameters in block 1.
    170             int N2 = 0,   // Number of parameters in block 2.
    171             int N3 = 0,   // Number of parameters in block 3.
    172             int N4 = 0,   // Number of parameters in block 4.
    173             int N5 = 0,   // Number of parameters in block 5.
    174             int N6 = 0,   // Number of parameters in block 6.
    175             int N7 = 0,   // Number of parameters in block 7.
    176             int N8 = 0,   // Number of parameters in block 8.
    177             int N9 = 0>   // Number of parameters in block 9.
     class AutoDiffCostFunction :
         public SizedCostFunction<M, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
     };
    181 
    182    To get an auto differentiated cost function, you must define a
    183    class with a templated ``operator()`` (a functor) that computes the
    184    cost function in terms of the template parameter ``T``. The
    185    autodiff framework substitutes appropriate ``Jet`` objects for
    186    ``T`` in order to compute the derivative when necessary, but this
    187    is hidden, and you should write the function as if ``T`` were a
    188    scalar type (e.g. a double-precision floating point number).
    189 
    190    The function must write the computed value in the last argument
    191    (the only non-``const`` one) and return true to indicate success.
    192    Please see :class:`CostFunction` for details on how the return
    193    value may be used to impose simple constraints on the parameter
    194    block.
    195 
    196    For example, consider a scalar error :math:`e = k - x^\top y`,
    197    where both :math:`x` and :math:`y` are two-dimensional vector
    198    parameters and :math:`k` is a constant. The form of this error,
    199    which is the difference between a constant and an expression, is a
    200    common pattern in least squares problems. For example, the value
    201    :math:`x^\top y` might be the model expectation for a series of
    202    measurements, where there is an instance of the cost function for
    203    each measurement :math:`k`.
    204 
    205    The actual cost added to the total problem is :math:`e^2`, or
    206    :math:`(k - x^\top y)^2`; however, the squaring is implicitly done
    207    by the optimization framework.
    208 
    209    To write an auto-differentiable cost function for the above model,
    210    first define the object
    211 
    212    .. code-block:: c++
    213 
    class MyScalarCostFunctor {
     public:
      MyScalarCostFunctor(double k): k_(k) {}
    216 
    217       template <typename T>
      bool operator()(const T* const x, const T* const y, T* e) const {
    219         e[0] = T(k_) - x[0] * y[0] - x[1] * y[1];
    220         return true;
    221       }
    222 
    223      private:
    224       double k_;
    225     };
    226 
    227 
    228    Note that in the declaration of ``operator()`` the input parameters
    229    ``x`` and ``y`` come first, and are passed as const pointers to arrays
    230    of ``T``. If there were three input parameters, then the third input
    231    parameter would come after ``y``. The output is always the last
    232    parameter, and is also a pointer to an array. In the example above,
    233    ``e`` is a scalar, so only ``e[0]`` is set.
    234 
    235    Then given this class definition, the auto differentiated cost
    236    function for it can be constructed as follows.
    237 
    238    .. code-block:: c++
    239 
    240     CostFunction* cost_function
    241         = new AutoDiffCostFunction<MyScalarCostFunctor, 1, 2, 2>(
    242             new MyScalarCostFunctor(1.0));              ^  ^  ^
    243                                                         |  |  |
    244                             Dimension of residual ------+  |  |
    245                             Dimension of x ----------------+  |
    246                             Dimension of y -------------------+
    247 
    248 
    249    In this example, there is usually an instance for each measurement
    250    of ``k``.
    251 
   In the instantiation above, the template parameters following
   ``MyScalarCostFunctor``, ``<1, 2, 2>``, describe the functor as
   computing a 1-dimensional output from two arguments, both
   2-dimensional.
    256 
    257    The framework can currently accommodate cost functions of up to 10
    258    independent variables, and there is no limit on the dimensionality
    259    of each of them.
    260 
    261    **WARNING 1** Since the functor will get instantiated with
    262    different types for ``T``, you must convert from other numeric
    263    types to ``T`` before mixing computations with other variables
    264    of type ``T``. In the example above, this is seen where instead of
    265    using ``k_`` directly, ``k_`` is wrapped with ``T(k_)``.
    266 
    267    **WARNING 2** A common beginner's error when first using
    268    :class:`AutoDiffCostFunction` is to get the sizing wrong. In particular,
    269    there is a tendency to set the template parameters to (dimension of
    270    residual, number of parameters) instead of passing a dimension
    271    parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the 2
    273    as the last template argument.
    274 
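To see why a templated ``operator()`` is all the autodiff framework needs, here is a minimal, self-contained sketch of the dual-number idea behind the ``Jet`` objects mentioned above. The ``Dual2`` type below is a toy stand-in invented for this illustration, not Ceres' actual ``Jet`` class: each value carries two partial derivatives, and overloaded arithmetic propagates them through the unchanged functor.

```cpp
// A toy two-variable dual number, illustrating (not reproducing) the
// Jet mechanism behind AutoDiffCostFunction. Each value carries its
// partial derivatives with respect to two chosen scalars.
struct Dual2 {
  double v;     // Value.
  double d[2];  // Partial derivatives.
  Dual2(double value = 0.0) : v(value) { d[0] = d[1] = 0.0; }
};

// Product rule: (ab)' = a'b + ab'.
Dual2 operator*(const Dual2& a, const Dual2& b) {
  Dual2 r(a.v * b.v);
  r.d[0] = a.d[0] * b.v + a.v * b.d[0];
  r.d[1] = a.d[1] * b.v + a.v * b.d[1];
  return r;
}

Dual2 operator-(const Dual2& a, const Dual2& b) {
  Dual2 r(a.v - b.v);
  r.d[0] = a.d[0] - b.d[0];
  r.d[1] = a.d[1] - b.d[1];
  return r;
}

// The functor from the example above, unchanged: it only assumes that
// T supports construction from double, subtraction, and multiplication.
class MyScalarCostFunctor {
 public:
  explicit MyScalarCostFunctor(double k) : k_(k) {}
  template <typename T>
  bool operator()(const T* const x, const T* const y, T* e) const {
    e[0] = T(k_) - x[0] * y[0] - x[1] * y[1];
    return true;
  }
 private:
  double k_;
};
```

Instantiating the functor with ``Dual2`` (seeded so that ``x[0]`` and ``x[1]`` carry unit derivatives) yields the residual value together with :math:`\partial e/\partial x_0 = -y_0` and :math:`\partial e/\partial x_1 = -y_1`, which is exactly what :class:`AutoDiffCostFunction` does with its ``Jet`` types, one parameter block at a time.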
    275 
    276 :class:`DynamicAutoDiffCostFunction`
    277 ------------------------------------
    278 
    279 .. class:: DynamicAutoDiffCostFunction
    280 
   :class:`AutoDiffCostFunction` requires that the number of parameter
   blocks and their sizes be known at compile time. It also has an
   upper limit of 10 parameter blocks. In a number of applications,
   e.g., Bezier curve fitting, Neural Network training etc., this is
   not enough.
    286 
    287      .. code-block:: c++
    288 
    289       template <typename CostFunctor, int Stride = 4>
    290       class DynamicAutoDiffCostFunction : public CostFunction {
    291       };
    292 
    293    In such cases :class:`DynamicAutoDiffCostFunction` can be
    294    used. Like :class:`AutoDiffCostFunction` the user must define a
    295    templated functor, but the signature of the functor differs
    296    slightly. The expected interface for the cost functors is:
    297 
    298      .. code-block:: c++
    299 
       struct MyCostFunctor {
         template<typename T>
         bool operator()(T const* const* parameters, T* residuals) const {
           // Compute the residuals from the parameter blocks.
           return true;
         }
       };
    305 
    306    Since the sizing of the parameters is done at runtime, you must
    307    also specify the sizes after creating the dynamic autodiff cost
    308    function. For example:
    309 
    310      .. code-block:: c++
    311 
    312        DynamicAutoDiffCostFunction<MyCostFunctor, 4> cost_function(
    313            new MyCostFunctor());
    314        cost_function.AddParameterBlock(5);
    315        cost_function.AddParameterBlock(10);
    316        cost_function.SetNumResiduals(21);
    317 
    318    Under the hood, the implementation evaluates the cost function
    319    multiple times, computing a small set of the derivatives (four by
    320    default, controlled by the ``Stride`` template parameter) with each
   pass. There is a performance tradeoff with the size of the passes;
   smaller sizes are more cache efficient but result in a larger number
   of passes, and larger stride lengths can destroy cache-locality
    324    while reducing the number of passes over the cost function. The
    325    optimal value depends on the number and sizes of the various
    326    parameter blocks.
    327 
    328    As a rule of thumb, try using :class:`AutoDiffCostFunction` before
    329    you use :class:`DynamicAutoDiffCostFunction`.
    330 
    331 :class:`NumericDiffCostFunction`
    332 --------------------------------
    333 
    334 .. class:: NumericDiffCostFunction
    335 
   In some cases, it's not possible to define a templated cost functor,
   for example when the evaluation of the residual involves a call to a
   library function that you do not have control over. In such a
   situation, `numerical differentiation
   <http://en.wikipedia.org/wiki/Numerical_differentiation>`_ can be
   used.
    342 
    343     .. code-block:: c++
    344 
    345       template <typename CostFunctionNoJacobian,
    346                 NumericDiffMethod method = CENTRAL, int M = 0,
    347                 int N0 = 0, int N1 = 0, int N2 = 0, int N3 = 0, int N4 = 0,
    348                 int N5 = 0, int N6 = 0, int N7 = 0, int N8 = 0, int N9 = 0>
    349       class NumericDiffCostFunction
    350         : public SizedCostFunction<M, N0, N1, N2, N3, N4, N5, N6, N7, N8, N9> {
    351       };
    352 
    353    To get a numerically differentiated :class:`CostFunction`, you must
    354    define a class with a ``operator()`` (a functor) that computes the
    355    residuals. The functor must write the computed value in the last
    356    argument (the only non-``const`` one) and return ``true`` to
    357    indicate success.  Please see :class:`CostFunction` for details on
    358    how the return value may be used to impose simple constraints on
    359    the parameter block. e.g., an object of the form
    360 
    361    .. code-block:: c++
    362 
     struct ScalarFunctor {
      public:
       bool operator()(const double* const x1,
                       const double* const x2,
                       double* residuals) const;
     };
    369 
    370    For example, consider a scalar error :math:`e = k - x'y`, where
    371    both :math:`x` and :math:`y` are two-dimensional column vector
    372    parameters, the prime sign indicates transposition, and :math:`k`
    373    is a constant. The form of this error, which is the difference
    374    between a constant and an expression, is a common pattern in least
    375    squares problems. For example, the value :math:`x'y` might be the
    376    model expectation for a series of measurements, where there is an
    377    instance of the cost function for each measurement :math:`k`.
    378 
   To write a numerically differentiated :class:`CostFunction` for the
   above model, first define the object
    381 
    382    .. code-block::  c++
    383 
     class MyScalarCostFunctor {
      public:
       MyScalarCostFunctor(double k): k_(k) {}
    386 
    387        bool operator()(const double* const x,
    388                        const double* const y,
    389                        double* residuals) const {
         residuals[0] = k_ - x[0] * y[0] - x[1] * y[1];
    391          return true;
    392        }
    393 
    394       private:
    395        double k_;
    396      };
    397 
    398    Note that in the declaration of ``operator()`` the input parameters
    399    ``x`` and ``y`` come first, and are passed as const pointers to
    400    arrays of ``double`` s. If there were three input parameters, then
    401    the third input parameter would come after ``y``. The output is
    402    always the last parameter, and is also a pointer to an array. In
    403    the example above, the residual is a scalar, so only
    404    ``residuals[0]`` is set.
    405 
    406    Then given this class definition, the numerically differentiated
    407    :class:`CostFunction` with central differences used for computing
    408    the derivative can be constructed as follows.
    409 
    410    .. code-block:: c++
    411 
    412      CostFunction* cost_function
    413          = new NumericDiffCostFunction<MyScalarCostFunctor, CENTRAL, 1, 2, 2>(
    414              new MyScalarCostFunctor(1.0));                    ^     ^  ^  ^
    415                                                                |     |  |  |
    416                                    Finite Differencing Scheme -+     |  |  |
    417                                    Dimension of residual ------------+  |  |
    418                                    Dimension of x ----------------------+  |
    419                                    Dimension of y -------------------------+
    420 
   In this example, there is usually an instance for each measurement
   of ``k``.
    423 
    424    In the instantiation above, the template parameters following
    425    ``MyScalarCostFunctor``, ``1, 2, 2``, describe the functor as
    426    computing a 1-dimensional output from two arguments, both
    427    2-dimensional.
    428 
    429    The framework can currently accommodate cost functions of up to 10
    430    independent variables, and there is no limit on the dimensionality
    431    of each of them.
    432 
   The ``CENTRAL`` difference method is considerably more accurate
   than forward differences, at the cost of twice as many function
   evaluations. Consider using central differences to begin with, and
   only after that works, try forward differences to improve
   performance.
    437 
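The accuracy gap between the two schemes is easy to demonstrate with a small standalone sketch, independent of Ceres (the helper names below are illustrative): the forward scheme has :math:`O(h)` truncation error while the central scheme has :math:`O(h^2)`, at the price of one extra function evaluation per parameter.

```cpp
#include <cmath>

// Standalone comparison of the two finite-differencing schemes.
// Forward: (f(x+h) - f(x)) / h, truncation error O(h).
double ForwardDifference(double (*f)(double), double x, double h) {
  return (f(x + h) - f(x)) / h;
}

// Central: (f(x+h) - f(x-h)) / (2h), truncation error O(h^2).
double CentralDifference(double (*f)(double), double x, double h) {
  return (f(x + h) - f(x - h)) / (2.0 * h);
}

// Sample function with a known derivative: d/dx exp(x) = exp(x).
double ExpD(double x) { return std::exp(x); }
```

For ``ExpD`` at ``x = 1`` with ``h = 1e-4``, the forward estimate is off by roughly ``1e-4`` while the central estimate is off by roughly ``1e-8``, which is why starting with ``CENTRAL`` is the safer default.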
    438    **WARNING** A common beginner's error when first using
    439    NumericDiffCostFunction is to get the sizing wrong. In particular,
    440    there is a tendency to set the template parameters to (dimension of
    441    residual, number of parameters) instead of passing a dimension
   parameter for *every parameter block*. In the example above, that
   would be ``<MyScalarCostFunctor, 1, 2>``, which is missing the last
   ``2`` argument. Please be careful when setting the size parameters.
    445 
    446 
    447    **Alternate Interface**
    448 
   For a variety of reasons, including compatibility with legacy code,
    450    :class:`NumericDiffCostFunction` can also take
    451    :class:`CostFunction` objects as input. The following describes
    452    how.
    453 
    454    To get a numerically differentiated cost function, define a
    455    subclass of :class:`CostFunction` such that the
    456    :func:`CostFunction::Evaluate` function ignores the ``jacobians``
    457    parameter. The numeric differentiation wrapper will fill in the
    458    jacobian parameter if necessary by repeatedly calling the
    459    :func:`CostFunction::Evaluate` with small changes to the
    460    appropriate parameters, and computing the slope. For performance,
    461    the numeric differentiation wrapper class is templated on the
    462    concrete cost function, even though it could be implemented only in
    463    terms of the :class:`CostFunction` interface.
    464 
   The numerically differentiated version of a cost function can be
   constructed as follows:
    467 
    468    .. code-block:: c++
    469 
    470      CostFunction* cost_function
    471          = new NumericDiffCostFunction<MyCostFunction, CENTRAL, 1, 4, 8>(
    472              new MyCostFunction(...), TAKE_OWNERSHIP);
    473 
    474    where ``MyCostFunction`` has 1 residual and 2 parameter blocks with
    475    sizes 4 and 8 respectively. Look at the tests for a more detailed
    476    example.
    477 
    478 :class:`NumericDiffFunctor`
    479 ---------------------------
    480 
    481 .. class:: NumericDiffFunctor
    482 
    483    Sometimes parts of a cost function can be differentiated
    484    automatically or analytically but others require numeric
    485    differentiation. :class:`NumericDiffFunctor` is a wrapper class
    486    that takes a variadic functor evaluating a function, numerically
    487    differentiates it and makes it available as a templated functor so
    488    that it can be easily used as part of Ceres' automatic
    489    differentiation framework.
    490 
    491    For example, let us assume that
    492 
    493    .. code-block:: c++
    494 
     struct IntrinsicProjection {
    496        IntrinsicProjection(const double* observations);
    497        bool operator()(const double* calibration,
    498                        const double* point,
    499                        double* residuals);
    500      };
    501 
    502    is a functor that implements the projection of a point in its local
    503    coordinate system onto its image plane and subtracts it from the
    504    observed point projection.
    505 
    506    Now we would like to compose the action of this functor with the
    507    action of camera extrinsics, i.e., rotation and translation, which
    508    is given by the following templated function
    509 
    510    .. code-block:: c++
    511 
    512      template<typename T>
    513      void RotateAndTranslatePoint(const T* rotation,
    514                                   const T* translation,
    515                                   const T* point,
    516                                   T* result);
    517 
    518    To compose the extrinsics and intrinsics, we can construct a
    519    ``CameraProjection`` functor as follows.
    520 
    521    .. code-block:: c++
    522 
    struct CameraProjection {
      typedef NumericDiffFunctor<IntrinsicProjection, CENTRAL, 2, 5, 3>
          IntrinsicProjectionFunctor;
    526 
      CameraProjection(double* observation) {
        intrinsic_projection_.reset(
            new IntrinsicProjectionFunctor(observation));
      }
    531 
    532       template <typename T>
    533       bool operator()(const T* rotation,
    534                       const T* translation,
    535                       const T* intrinsics,
    536                       const T* point,
    537                       T* residuals) const {
    538         T transformed_point[3];
    539         RotateAndTranslatePoint(rotation, translation, point, transformed_point);
        return (*intrinsic_projection_)(intrinsics, transformed_point, residuals);
    541       }
    542 
    543      private:
    544       scoped_ptr<IntrinsicProjectionFunctor> intrinsic_projection_;
    545     };
    546 
    547    Here, we made the choice of using ``CENTRAL`` differences to compute
    548    the jacobian of ``IntrinsicProjection``.
    549 
    550    Now, we are ready to construct an automatically differentiated cost
    551    function as
    552 
    553    .. code-block:: c++
    554 
    555     CostFunction* cost_function =
        new AutoDiffCostFunction<CameraProjection, 2, 3, 3, 5, 3>(
    557            new CameraProjection(observations));
    558 
    559    ``cost_function`` now seamlessly integrates automatic
    560    differentiation of ``RotateAndTranslatePoint`` with a numerically
    561    differentiated version of ``IntrinsicProjection``.
    562 
    563 
    564 :class:`CostFunctionToFunctor`
    565 ------------------------------
    566 
    567 .. class:: CostFunctionToFunctor
    568 
    569    Just like :class:`NumericDiffFunctor` allows numeric
    570    differentiation to be mixed with automatic differentiation,
    571    :class:`CostFunctionToFunctor` provides an even more general
    572    mechanism.  :class:`CostFunctionToFunctor` is an adapter class that
    573    allows users to use :class:`CostFunction` objects in templated
    574    functors which are to be used for automatic differentiation.  This
    575    allows the user to seamlessly mix analytic, numeric and automatic
    576    differentiation.
    577 
    578    For example, let us assume that
    579 
    580    .. code-block:: c++
    581 
    582      class IntrinsicProjection : public SizedCostFunction<2, 5, 3> {
    583        public:
    584          IntrinsicProjection(const double* observations);
    585          virtual bool Evaluate(double const* const* parameters,
    586                                double* residuals,
    587                                double** jacobians) const;
    588      };
    589 
    590    is a :class:`CostFunction` that implements the projection of a
    591    point in its local coordinate system onto its image plane and
   subtracts it from the observed point projection. It can compute its
   residual and, via analytic or numerical differentiation, its
   Jacobians.
    595 
    596    Now we would like to compose the action of this
    597    :class:`CostFunction` with the action of camera extrinsics, i.e.,
    598    rotation and translation. Say we have a templated function
    599 
    600    .. code-block:: c++
    601 
    602       template<typename T>
    603       void RotateAndTranslatePoint(const T* rotation,
    604                                    const T* translation,
    605                                    const T* point,
    606                                    T* result);
    607 
    608 
   We can now do the following:
    610 
    611    .. code-block:: c++
    612 
    613     struct CameraProjection {
    614       CameraProjection(double* observation) {
    615         intrinsic_projection_.reset(
            new CostFunctionToFunctor<2, 5, 3>(new IntrinsicProjection(observation)));
    617       }
    618       template <typename T>
    619       bool operator()(const T* rotation,
    620                       const T* translation,
    621                       const T* intrinsics,
    622                       const T* point,
    623                       T* residual) const {
    624         T transformed_point[3];
    625         RotateAndTranslatePoint(rotation, translation, point, transformed_point);
    626 
    627         // Note that we call intrinsic_projection_, just like it was
    628         // any other templated functor.
    629         return (*intrinsic_projection_)(intrinsics, transformed_point, residual);
    630       }
    631 
    632      private:
    633       scoped_ptr<CostFunctionToFunctor<2,5,3> > intrinsic_projection_;
    634     };
    635 
    636 
    637 
    638 :class:`ConditionedCostFunction`
    639 --------------------------------
    640 
    641 .. class:: ConditionedCostFunction
    642 
    643    This class allows you to apply different conditioning to the residual
    644    values of a wrapped cost function. An example where this is useful is
    645    where you have an existing cost function that produces N values, but you
    646    want the total cost to be something other than just the sum of these
    647    squared values - maybe you want to apply a different scaling to some
    648    values, to change their contribution to the cost.
    649 
    650    Usage:
    651 
    652    .. code-block:: c++
    653 
    654        //  my_cost_function produces N residuals
    655        CostFunction* my_cost_function = ...
    656        CHECK_EQ(N, my_cost_function->num_residuals());
    657        vector<CostFunction*> conditioners;
    658 
    659        //  Make N 1x1 cost functions (1 parameter, 1 residual)
    660        CostFunction* f_1 = ...
    661        conditioners.push_back(f_1);
    662 
    663        CostFunction* f_N = ...
    664        conditioners.push_back(f_N);
    665        ConditionedCostFunction* ccf =
    666          new ConditionedCostFunction(my_cost_function, conditioners);
    667 
    668 
   Now ``ccf`` 's ``residual[i]`` (i=0..N-1) will be passed through the
    670    :math:`i^{\text{th}}` conditioner.
    671 
    672    .. code-block:: c++
    673 
    674       ccf_residual[i] = f_i(my_cost_function_residual[i])
    675 
    676    and the Jacobian will be affected appropriately.
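
Conditioning amounts to passing each residual through its own scalar function and, by the chain rule, scaling the corresponding row of the Jacobian by that function's derivative. The following standalone sketch (``LinearConditioner`` and ``Condition`` are hypothetical names, not Ceres API) shows the arithmetic for the simplest case of linear conditioners :math:`f_i(r) = a r + b`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical linear conditioner f_i(r) = a * r + b applied to the
// i-th residual of a wrapped cost function. ConditionedCostFunction
// accepts arbitrary 1x1 CostFunctions; linear ones are the simplest case.
struct LinearConditioner {
  double a, b;
  double operator()(double r) const { return a * r + b; }
  double derivative() const { return a; }
};

// Applies one conditioner per residual. By the chain rule, row i of the
// wrapped Jacobian is scaled by f_i'(residual[i]), which for a linear
// conditioner is just the constant a.
void Condition(const std::vector<LinearConditioner>& conditioners,
               std::vector<double>* residuals,
               std::vector<std::vector<double>>* jacobian) {
  for (std::size_t i = 0; i < residuals->size(); ++i) {
    const double r = (*residuals)[i];
    (*residuals)[i] = conditioners[i](r);
    for (double& j : (*jacobian)[i]) {
      j *= conditioners[i].derivative();
    }
  }
}
```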
    677 
    678 
    679 :class:`NormalPrior`
    680 --------------------
    681 
    682 .. class:: NormalPrior
    683 
    684    .. code-block:: c++
    685 
    686      class NormalPrior: public CostFunction {
    687       public:
       // Check that the number of rows in the vector b is the same as
       // the number of columns in the matrix A; crash otherwise.
    690        NormalPrior(const Matrix& A, const Vector& b);
    691 
    692        virtual bool Evaluate(double const* const* parameters,
    693                              double* residuals,
    694                              double** jacobians) const;
    695       };
    696 
    697    Implements a cost function of the form
    698 
    699    .. math::  cost(x) = ||A(x - b)||^2
    700 
   where the matrix :math:`A` and the vector :math:`b` are fixed and
   :math:`x` is the variable. In case the user is interested in
   implementing a cost function of the form

   .. math::  cost(x) = (x - \mu)^T S^{-1} (x - \mu)

   where :math:`\mu` is a vector and :math:`S` is a covariance matrix,
   then :math:`A = S^{-1/2}`, i.e., the matrix :math:`A` is the square
   root of the inverse of the covariance, also known as the stiffness
   matrix. There are however no restrictions on the shape of
   :math:`A`. It is free to be rectangular, which would be the case if
   the covariance matrix :math:`S` is rank deficient.
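
The residual computed here is just :math:`A(x - b)`. A standalone sketch of that product (the ``NormalPriorResidual`` helper below is hypothetical, using plain row-major storage instead of the ``Matrix`` and ``Vector`` types above) might look like:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Residual of a normal prior: r = A (x - b), where A is num_rows x n in
// row-major storage. A may be rectangular, e.g. when the covariance S is
// rank deficient and A = S^{-1/2} has fewer rows than columns.
std::vector<double> NormalPriorResidual(const std::vector<double>& A,
                                        std::size_t num_rows,
                                        const std::vector<double>& b,
                                        const std::vector<double>& x) {
  const std::size_t n = b.size();
  std::vector<double> r(num_rows, 0.0);
  for (std::size_t i = 0; i < num_rows; ++i) {
    for (std::size_t j = 0; j < n; ++j) {
      r[i] += A[i * n + j] * (x[j] - b[j]);
    }
  }
  return r;
}
```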
    713 
    714 
    715 
    716 :class:`LossFunction`
    717 ---------------------
    718 
    719 .. class:: LossFunction
    720 
    721    For least squares problems where the minimization may encounter
    722    input terms that contain outliers, that is, completely bogus
    723    measurements, it is important to use a loss function that reduces
    724    their influence.
    725 
    726    Consider a structure from motion problem. The unknowns are 3D
    727    points and camera parameters, and the measurements are image
    728    coordinates describing the expected reprojected position for a
    729    point in a camera. For example, we want to model the geometry of a
    730    street scene with fire hydrants and cars, observed by a moving
    731    camera with unknown parameters, and the only 3D points we care
    732    about are the pointy tippy-tops of the fire hydrants. Our magic
    733    image processing algorithm, which is responsible for producing the
    734    measurements that are input to Ceres, has found and matched all
   such tippy-tops in all image frames, except that in one of the
   frames it mistook a car's headlight for a hydrant. If we didn't do
   anything special, the residual for the erroneous measurement will
   result in the entire solution getting pulled away from the optimum
   in order to reduce the large error that would otherwise be
   attributed to the wrong measurement.
    741 
    742    Using a robust loss function, the cost for large residuals is
    743    reduced. In the example above, this leads to outlier terms getting
    744    down-weighted so they do not overly influence the final solution.
    745 
    746    .. code-block:: c++
    747 
    748     class LossFunction {
    749      public:
    750       virtual void Evaluate(double s, double out[3]) const = 0;
    751     };
    752 
    753 
    754    The key method is :func:`LossFunction::Evaluate`, which given a
    755    non-negative scalar ``s``, computes
    756 
    757    .. math:: out = \begin{bmatrix}\rho(s), & \rho'(s), & \rho''(s)\end{bmatrix}
    758 
    759    Here the convention is that the contribution of a term to the cost
    760    function is given by :math:`\frac{1}{2}\rho(s)`, where :math:`s
    761    =\|f_i\|^2`. Calling the method with a negative value of :math:`s`
    762    is an error and the implementations are not required to handle that
    763    case.
    764 
    765    Most sane choices of :math:`\rho` satisfy:
    766 
    767    .. math::
    768 
    769       \rho(0) &= 0\\
    770       \rho'(0) &= 1\\
    771       \rho'(s) &< 1 \text{ in the outlier region}\\
    772       \rho''(s) &< 0 \text{ in the outlier region}
    773 
    774    so that they mimic the squared cost for small residuals.
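
   As a concrete illustration of this contract, here is a standalone
   sketch of ``Evaluate`` for the unscaled Huber loss :math:`\rho(s) =
   s` for :math:`s \le 1` and :math:`2\sqrt{s} - 1` otherwise (written
   as a free function for illustration, not the actual ``HuberLoss``
   class):

   ```cpp
   #include <cassert>
   #include <cmath>

   // Sketch of the LossFunction::Evaluate contract for the unscaled
   // Huber loss. Writes out = [rho(s), rho'(s), rho''(s)] for a
   // non-negative squared norm s.
   void HuberEvaluate(double s, double out[3]) {
     if (s <= 1.0) {
       out[0] = s;    // rho(s)   = s
       out[1] = 1.0;  // rho'(s)  = 1
       out[2] = 0.0;  // rho''(s) = 0
     } else {
       const double r = std::sqrt(s);
       out[0] = 2.0 * r - 1.0;         // rho(s)   = 2 sqrt(s) - 1
       out[1] = 1.0 / r;               // rho'(s)  = 1 / sqrt(s)
       out[2] = -1.0 / (2.0 * s * r);  // rho''(s) = -(1/2) s^{-3/2}
     }
   }
   ```

   Note that it satisfies the properties above: :math:`\rho(0) = 0`,
   :math:`\rho'(0) = 1`, and :math:`\rho' < 1`, :math:`\rho'' < 0` in
   the outlier region :math:`s > 1`.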
    775 
    776    **Scaling**
    777 
    778    Given one robustifier :math:`\rho(s)` one can change the length
    779    scale at which robustification takes place, by adding a scale
    780    factor :math:`a > 0` which gives us :math:`\rho(s,a) = a^2 \rho(s /
    781    a^2)` and the first and second derivatives as :math:`\rho'(s /
    782    a^2)` and :math:`(1 / a^2) \rho''(s / a^2)` respectively.
    783 
    784 
    785    The reason for the appearance of squaring is that :math:`a` is in
    786    the units of the residual vector norm whereas :math:`s` is a squared
    787    norm. For applications it is more convenient to specify :math:`a` than
    788    its square.
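
This scaling rule is mechanical enough to sketch directly. The ``ScaledEvaluate`` helper below is illustrative, not Ceres API; ``base`` is any callable producing the :math:`[\rho, \rho', \rho'']` triple:

```cpp
#include <cassert>

// Scaling rule from the text: given a base robustifier rho and a scale
// a > 0, rho(s, a) = a^2 rho(s / a^2), with first and second derivatives
// rho'(s / a^2) and (1 / a^2) rho''(s / a^2). `base` fills
// out = [rho, rho', rho''] at its argument.
template <typename Loss>
void ScaledEvaluate(const Loss& base, double a, double s, double out[3]) {
  const double a2 = a * a;
  base(s / a2, out);
  out[0] *= a2;  // a^2 rho(s / a^2)
  //               out[1] unchanged: rho'(s / a^2)
  out[2] /= a2;  // (1 / a^2) rho''(s / a^2)
}
```

Applied to the trivial loss :math:`\rho(s) = s`, scaling is a no-op, as expected from :math:`a^2 (s / a^2) = s`.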
    789 
    790 Instances
    791 ^^^^^^^^^
    792 
    793 Ceres includes a number of predefined loss functions. For simplicity
    794 we described their unscaled versions. The figure below illustrates
    795 their shape graphically. More details can be found in
    796 ``include/ceres/loss_function.h``.
    797 
    798 .. figure:: loss.png
    799    :figwidth: 500px
    800    :height: 400px
    801    :align: center
    802 
    803    Shape of the various common loss functions.
    804 
    805 .. class:: TrivialLoss
    806 
    807       .. math:: \rho(s) = s
    808 
    809 .. class:: HuberLoss
    810 
    811    .. math:: \rho(s) = \begin{cases} s & s \le 1\\ 2 \sqrt{s} - 1 & s > 1 \end{cases}
    812 
    813 .. class:: SoftLOneLoss
    814 
    815    .. math:: \rho(s) = 2 (\sqrt{1+s} - 1)
    816 
    817 .. class:: CauchyLoss
    818 
    819    .. math:: \rho(s) = \log(1 + s)
    820 
    821 .. class:: ArctanLoss
    822 
    823    .. math:: \rho(s) = \arctan(s)
    824 
    825 .. class:: TolerantLoss
    826 
    827    .. math:: \rho(s,a,b) = b \log(1 + e^{(s - a) / b}) - b \log(1 + e^{-a / b})
    828 
    829 .. class:: ComposedLoss
    830 
    831    Given two loss functions ``f`` and ``g``, implements the loss
    832    function ``h(s) = f(g(s))``.
    833 
    834    .. code-block:: c++
    835 
    836       class ComposedLoss : public LossFunction {
    837        public:
    838         explicit ComposedLoss(const LossFunction* f,
    839                               Ownership ownership_f,
    840                               const LossFunction* g,
    841                               Ownership ownership_g);
    842       };
    843 
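The composed value and derivatives follow from the chain rule:
:math:`h'(s) = f'(g(s))\, g'(s)` and :math:`h''(s) = f''(g(s))\,
g'(s)^2 + f'(g(s))\, g''(s)`. A standalone sketch of this combination
(the ``ComposeLoss`` helper is hypothetical, operating on the
:math:`[\rho, \rho', \rho'']` triples):

```cpp
#include <cassert>

// Chain rule for a composed loss h(s) = f(g(s)): given the triple
// g = [g(s), g'(s), g''(s)] and f_at_g = [f, f', f''] evaluated at g(s),
//   h   = f(g(s))
//   h'  = f'(g(s)) g'(s)
//   h'' = f''(g(s)) g'(s)^2 + f'(g(s)) g''(s)
void ComposeLoss(const double f_at_g[3], const double g[3], double out[3]) {
  out[0] = f_at_g[0];
  out[1] = f_at_g[1] * g[1];
  out[2] = f_at_g[2] * g[1] * g[1] + f_at_g[1] * g[2];
}
```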
    844 .. class:: ScaledLoss
    845 
    846    Sometimes you want to simply scale the output value of the
    847    robustifier. For example, you might want to weight different error
    848    terms differently (e.g., weight pixel reprojection errors
    849    differently from terrain errors).
    850 
    851    Given a loss function :math:`\rho(s)` and a scalar :math:`a`, :class:`ScaledLoss`
    852    implements the function :math:`a \rho(s)`.
    853 
   Since we treat a ``NULL`` loss function as the identity loss
   function, :math:`\rho` = ``NULL`` is a valid input and will result
   in the input being scaled by :math:`a`. This provides a simple way
   of implementing a scaled ResidualBlock.
    858 
    859 .. class:: LossFunctionWrapper
    860 
    861    Sometimes after the optimization problem has been constructed, we
    862    wish to mutate the scale of the loss function. For example, when
    863    performing estimation from data which has substantial outliers,
    864    convergence can be improved by starting out with a large scale,
    865    optimizing the problem and then reducing the scale. This can have
    866    better convergence behavior than just using a loss function with a
    867    small scale.
    868 
    869    This templated class allows the user to implement a loss function
    870    whose scale can be mutated after an optimization problem has been
    871    constructed. e.g,
    872 
    873    .. code-block:: c++
    874 
    875      Problem problem;
    876 
    877      // Add parameter blocks
    878 
     CostFunction* cost_function =
         new AutoDiffCostFunction<UW_Camera_Mapper, 2, 9, 3>(
             new UW_Camera_Mapper(feature_x, feature_y));

     LossFunctionWrapper* loss_function =
         new LossFunctionWrapper(new HuberLoss(1.0), TAKE_OWNERSHIP);
    884      problem.AddResidualBlock(cost_function, loss_function, parameters);
    885 
    886      Solver::Options options;
    887      Solver::Summary summary;
    888      Solve(options, &problem, &summary);
    889 
    890      loss_function->Reset(new HuberLoss(1.0), TAKE_OWNERSHIP);
    891      Solve(options, &problem, &summary);
    892 
    893 
    894 Theory
    895 ^^^^^^
    896 
Let us consider a problem with a single residual and a single parameter
block.
    899 
    900 .. math::
    901 
    902  \min_x \frac{1}{2}\rho(f^2(x))
    903 
    904 
    905 Then, the robustified gradient and the Gauss-Newton Hessian are
    906 
    907 .. math::
    908 
    909         g(x) &= \rho'J^\top(x)f(x)\\
    910         H(x) &= J^\top(x)\left(\rho' + 2 \rho''f(x)f^\top(x)\right)J(x)
    911 
where the terms involving the second derivatives of :math:`f(x)` have
been ignored. Note that :math:`H(x)` is indefinite if
:math:`\rho''f(x)^\top f(x) + \frac{1}{2}\rho' < 0`. If this is not
the case, then it is possible to re-weight the residual and the
Jacobian matrix such that the robustified Gauss-Newton step corresponds
to an ordinary linear least squares problem.
    918 
    919 
    920 Let :math:`\alpha` be a root of
    921 
    922 .. math:: \frac{1}{2}\alpha^2 - \alpha - \frac{\rho''}{\rho'}\|f(x)\|^2 = 0.
    923 
    924 
    925 Then, define the rescaled residual and Jacobian as
    926 
    927 .. math::
    928 
    929         \tilde{f}(x) &= \frac{\sqrt{\rho'}}{1 - \alpha} f(x)\\
    930         \tilde{J}(x) &= \sqrt{\rho'}\left(1 - \alpha
    931                         \frac{f(x)f^\top(x)}{\left\|f(x)\right\|^2} \right)J(x)
    932 
    933 
    934 In the case :math:`2 \rho''\left\|f(x)\right\|^2 + \rho' \lesssim 0`,
    935 we limit :math:`\alpha \le 1- \epsilon` for some small
    936 :math:`\epsilon`. For more details see [Triggs]_.
With this simple rescaling, one can apply any Jacobian based
non-linear least squares algorithm to robustified non-linear least
squares problems.
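
The rescaling can be sketched numerically. The helpers below are illustrative, not Ceres API: ``TriggsAlpha`` picks the root :math:`\alpha = 1 - \sqrt{1 + 2 (\rho''/\rho') \|f\|^2}` of the quadratic above (the root that reduces to :math:`\alpha = 0` when :math:`\rho'' = 0`), clamped to :math:`1 - \epsilon`, and ``RescaledResidual`` applies :math:`\tilde{f}`:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// alpha is the root of 1/2 alpha^2 - alpha - (rho''/rho') ||f||^2 = 0
// with alpha < 1, i.e. alpha = 1 - sqrt(1 + 2 (rho''/rho') ||f||^2).
// When 2 rho'' ||f||^2 + rho' is sufficiently negative the square root
// argument goes non-positive, so alpha is clamped to 1 - epsilon as
// described in the text.
double TriggsAlpha(double rho1, double rho2, double f_squared_norm) {
  const double epsilon = 1e-10;
  const double arg = 1.0 + 2.0 * (rho2 / rho1) * f_squared_norm;
  if (arg <= 0.0) return 1.0 - epsilon;
  return std::min(1.0 - std::sqrt(arg), 1.0 - epsilon);
}

// Rescaled residual for a scalar f: f_tilde = sqrt(rho') / (1 - alpha) f.
double RescaledResidual(double rho1, double alpha, double f) {
  return std::sqrt(rho1) / (1.0 - alpha) * f;
}
```

For the trivial loss (:math:`\rho'' = 0`, :math:`\rho' = 1`) this gives :math:`\alpha = 0` and leaves the residual untouched, as expected.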
    941 
    942 
    943 :class:`LocalParameterization`
    944 ------------------------------
    945 
    946 .. class:: LocalParameterization
    947 
    948    .. code-block:: c++
    949 
    950      class LocalParameterization {
    951       public:
    952        virtual ~LocalParameterization() {}
    953        virtual bool Plus(const double* x,
    954                          const double* delta,
    955                          double* x_plus_delta) const = 0;
    956        virtual bool ComputeJacobian(const double* x, double* jacobian) const = 0;
    957        virtual int GlobalSize() const = 0;
    958        virtual int LocalSize() const = 0;
    959      };
    960 
    961    Sometimes the parameters :math:`x` can overparameterize a
    962    problem. In that case it is desirable to choose a parameterization
    963    to remove the null directions of the cost. More generally, if
    964    :math:`x` lies on a manifold of a smaller dimension than the
    965    ambient space that it is embedded in, then it is numerically and
    966    computationally more effective to optimize it using a
    967    parameterization that lives in the tangent space of that manifold
    968    at each point.
    969 
    970    For example, a sphere in three dimensions is a two dimensional
    971    manifold, embedded in a three dimensional space. At each point on
    972    the sphere, the plane tangent to it defines a two dimensional
    973    tangent space. For a cost function defined on this sphere, given a
    974    point :math:`x`, moving in the direction normal to the sphere at
   that point is not useful. Thus a better way to parameterize a point
   on a sphere is to optimize over a two dimensional vector
   :math:`\Delta x` in the tangent space at that point and then "move"
   to the point :math:`x + \Delta x`, where the move operation involves
   projecting back onto the sphere. Doing so removes a redundant
   dimension from the optimization, making it numerically more robust
   and efficient.
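
   A concrete sketch of this sphere example (illustrative only; Ceres
   does not ship a ``SpherePlus``): build two orthonormal tangent
   vectors at :math:`x`, take the step in the tangent plane, and
   project back onto the sphere by normalizing.

   ```cpp
   #include <cassert>
   #include <cmath>

   // x is a unit vector in R^3 and delta is a 2-vector of coordinates
   // in the tangent plane at x.
   void SpherePlus(const double x[3], const double delta[2],
                   double x_plus_delta[3]) {
     // Pick a vector not parallel to x, then Gram-Schmidt one tangent.
     double a[3] = {1.0, 0.0, 0.0};
     if (std::fabs(x[0]) > 0.9) { a[0] = 0.0; a[1] = 1.0; }
     // t1 = normalize(a - (a . x) x)
     const double ax = a[0]*x[0] + a[1]*x[1] + a[2]*x[2];
     double t1[3] = {a[0] - ax*x[0], a[1] - ax*x[1], a[2] - ax*x[2]};
     const double n1 = std::sqrt(t1[0]*t1[0] + t1[1]*t1[1] + t1[2]*t1[2]);
     for (double& v : t1) v /= n1;
     // t2 = x cross t1 completes the orthonormal tangent basis.
     const double t2[3] = {x[1]*t1[2] - x[2]*t1[1],
                           x[2]*t1[0] - x[0]*t1[2],
                           x[0]*t1[1] - x[1]*t1[0]};
     // Step in the tangent plane, then renormalize onto the sphere.
     double y[3];
     for (int i = 0; i < 3; ++i) {
       y[i] = x[i] + delta[0]*t1[i] + delta[1]*t2[i];
     }
     const double ny = std::sqrt(y[0]*y[0] + y[1]*y[1] + y[2]*y[2]);
     for (int i = 0; i < 3; ++i) x_plus_delta[i] = y[i] / ny;
   }
   ```

   Note that :math:`\Delta x = 0` maps :math:`x` to itself, and the
   result is always a unit vector.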
    982 
    983    More generally we can define a function
    984 
    985    .. math:: x' = \boxplus(x, \Delta x),
    986 
    987    where :math:`x'` has the same size as :math:`x`, and :math:`\Delta
    988    x` is of size less than or equal to :math:`x`. The function
    989    :math:`\boxplus`, generalizes the definition of vector
    990    addition. Thus it satisfies the identity
    991 
    992    .. math:: \boxplus(x, 0) = x,\quad \forall x.
    993 
    994    Instances of :class:`LocalParameterization` implement the
    995    :math:`\boxplus` operation and its derivative with respect to
    996    :math:`\Delta x` at :math:`\Delta x = 0`.
    997 
    998 
    999 .. function:: int LocalParameterization::GlobalSize()
   1000 
   1001    The dimension of the ambient space in which the parameter block
   1002    :math:`x` lives.
   1003 
.. function:: int LocalParameterization::LocalSize()

   The size of the tangent space that :math:`\Delta x` lives in.
   1008 
   1009 .. function:: bool LocalParameterization::Plus(const double* x, const double* delta, double* x_plus_delta) const
   1010 
   1011     :func:`LocalParameterization::Plus` implements :math:`\boxplus(x,\Delta x)`.
   1012 
   1013 .. function:: bool LocalParameterization::ComputeJacobian(const double* x, double* jacobian) const
   1014 
   1015    Computes the Jacobian matrix
   1016 
   1017    .. math:: J = \left . \frac{\partial }{\partial \Delta x} \boxplus(x,\Delta x)\right|_{\Delta x = 0}
   1018 
   1019    in row major form.
   1020 
   1021 Instances
   1022 ^^^^^^^^^
   1023 
   1024 .. class:: IdentityParameterization
   1025 
   1026    A trivial version of :math:`\boxplus` is when :math:`\Delta x` is
   1027    of the same size as :math:`x` and
   1028 
   1029    .. math::  \boxplus(x, \Delta x) = x + \Delta x
   1030 
   1031 .. class:: SubsetParameterization
   1032 
   A more interesting case arises when :math:`x` is a two dimensional
   vector, and the user wishes to hold the first coordinate constant.
   Then, :math:`\Delta x` is a scalar and :math:`\boxplus` is defined as
   1036 
   1037    .. math::
   1038 
   1039       \boxplus(x, \Delta x) = x + \left[ \begin{array}{c} 0 \\ 1
   1040                                   \end{array} \right] \Delta x
   1041 
   1042    :class:`SubsetParameterization` generalizes this construction to
   1043    hold any part of a parameter block constant.
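
   A sketch of how such a subset-style ``Plus`` operates (the
   ``SubsetPlus`` helper is hypothetical; the real class also supplies
   the corresponding Jacobian): coordinates flagged constant receive no
   update, and the remaining coordinates consume ``delta`` in order.

   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <vector>

   // For the two dimensional example in the text, constant = {true, false}
   // and delta has a single entry.
   void SubsetPlus(const std::vector<double>& x,
                   const std::vector<bool>& constant,
                   const std::vector<double>& delta,
                   std::vector<double>* x_plus_delta) {
     x_plus_delta->resize(x.size());
     std::size_t k = 0;  // index into the (smaller) delta vector
     for (std::size_t i = 0; i < x.size(); ++i) {
       (*x_plus_delta)[i] = constant[i] ? x[i] : x[i] + delta[k++];
     }
   }
   ```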
   1044 
   1045 .. class:: QuaternionParameterization
   1046 
   1047    Another example that occurs commonly in Structure from Motion
   1048    problems is when camera rotations are parameterized using a
   1049    quaternion. There, it is useful only to make updates orthogonal to
   1050    that 4-vector defining the quaternion. One way to do this is to let
   1051    :math:`\Delta x` be a 3 dimensional vector and define
   1052    :math:`\boxplus` to be
   1053 
   1054     .. math:: \boxplus(x, \Delta x) = \left[ \cos(|\Delta x|), \frac{\sin\left(|\Delta x|\right)}{|\Delta x|} \Delta x \right] * x
   1055       :label: quaternion
   1056 
   1057    The multiplication between the two 4-vectors on the right hand side
   1058    is the standard quaternion
   1059    product. :class:`QuaternionParameterization` is an implementation
   1060    of :eq:`quaternion`.
   1061 
   1062 
   1063 
   1064 :class:`AutoDiffLocalParameterization`
   1065 --------------------------------------
   1066 
   1067 .. class:: AutoDiffLocalParameterization
   1068 
   1069   :class:`AutoDiffLocalParameterization` does for
   1070   :class:`LocalParameterization` what :class:`AutoDiffCostFunction`
   1071   does for :class:`CostFunction`. It allows the user to define a
   1072   templated functor that implements the
   1073   :func:`LocalParameterization::Plus` operation and it uses automatic
   1074   differentiation to implement the computation of the Jacobian.
   1075 
   1076   To get an auto differentiated local parameterization, you must
   1077   define a class with a templated operator() (a functor) that computes
   1078 
   1079      .. math:: x' = \boxplus(x, \Delta x),
   1080 
  For example, quaternions have a three dimensional local
  parameterization. Its plus operation can be implemented as (taken
  from `internal/ceres/auto_diff_local_parameterization_test.cc
  <https://ceres-solver.googlesource.com/ceres-solver/+/master/include/ceres/local_parameterization.h>`_
  )
   1086 
   1087     .. code-block:: c++
   1088 
   1089       struct QuaternionPlus {
   1090         template<typename T>
   1091         bool operator()(const T* x, const T* delta, T* x_plus_delta) const {
   1092           const T squared_norm_delta =
   1093               delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2];
   1094 
   1095           T q_delta[4];
   1096           if (squared_norm_delta > T(0.0)) {
   1097             T norm_delta = sqrt(squared_norm_delta);
   1098             const T sin_delta_by_delta = sin(norm_delta) / norm_delta;
   1099             q_delta[0] = cos(norm_delta);
   1100             q_delta[1] = sin_delta_by_delta * delta[0];
   1101             q_delta[2] = sin_delta_by_delta * delta[1];
   1102             q_delta[3] = sin_delta_by_delta * delta[2];
   1103           } else {
   1104             // We do not just use q_delta = [1,0,0,0] here because that is a
   1105             // constant and when used for automatic differentiation will
   1106             // lead to a zero derivative. Instead we take a first order
   1107             // approximation and evaluate it at zero.
   1108             q_delta[0] = T(1.0);
   1109             q_delta[1] = delta[0];
   1110             q_delta[2] = delta[1];
   1111             q_delta[3] = delta[2];
   1112           }
   1113 
          QuaternionProduct(q_delta, x, x_plus_delta);
   1115           return true;
   1116         }
   1117       };
   1118 
   1119   Then given this struct, the auto differentiated local
   1120   parameterization can now be constructed as
   1121 
   1122   .. code-block:: c++
   1123 
   1124      LocalParameterization* local_parameterization =
   1125          new AutoDiffLocalParameterization<QuaternionPlus, 4, 3>;
   1126                                                            |  |
   1127                                 Global Size ---------------+  |
   1128                                 Local Size -------------------+
   1129 
  **WARNING:** Since the functor will get instantiated with different
  types for ``T``, you must convert from other numeric types to ``T``
  before mixing computations with other variables of type ``T``. In
  the example above, this is seen where constants such as ``1.0`` are
  wrapped as ``T(1.0)`` before being used.
   1135 
   1136 
   1137 :class:`Problem`
   1138 ----------------
   1139 
   1140 .. class:: Problem
   1141 
   1142    :class:`Problem` holds the robustified non-linear least squares
   1143    problem :eq:`ceresproblem`. To create a least squares problem, use
   1144    the :func:`Problem::AddResidualBlock` and
   1145    :func:`Problem::AddParameterBlock` methods.
   1146 
   1147    For example a problem containing 3 parameter blocks of sizes 3, 4
   1148    and 5 respectively and two residual blocks of size 2 and 6:
   1149 
   1150    .. code-block:: c++
   1151 
   1152      double x1[] = { 1.0, 2.0, 3.0 };
   1153      double x2[] = { 1.0, 2.0, 3.0, 5.0 };
   1154      double x3[] = { 1.0, 2.0, 3.0, 6.0, 7.0 };
   1155 
     Problem problem;
     problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
     problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x3);
   1159 
   :func:`Problem::AddResidualBlock`, as the name implies, adds a
   residual block to the problem. It adds a :class:`CostFunction`, an
   optional :class:`LossFunction`, and connects the
   :class:`CostFunction` to a set of parameter blocks.
   1164 
   1165    The cost function carries with it information about the sizes of
   1166    the parameter blocks it expects. The function checks that these
   1167    match the sizes of the parameter blocks listed in
   1168    ``parameter_blocks``. The program aborts if a mismatch is
   1169    detected. ``loss_function`` can be ``NULL``, in which case the cost
   1170    of the term is just the squared norm of the residuals.
   1171 
   1172    The user has the option of explicitly adding the parameter blocks
   1173    using :func:`Problem::AddParameterBlock`. This causes additional
   1174    correctness checking; however, :func:`Problem::AddResidualBlock`
   1175    implicitly adds the parameter blocks if they are not present, so
   1176    calling :func:`Problem::AddParameterBlock` explicitly is not
   1177    required.
   1178 
   1179    :func:`Problem::AddParameterBlock` explicitly adds a parameter
   1180    block to the :class:`Problem`. Optionally it allows the user to
   1181    associate a :class:`LocalParameterization` object with the
   1182    parameter block too. Repeated calls with the same arguments are
   1183    ignored. Repeated calls with the same double pointer but a
   1184    different size results in undefined behavior.
   1185 
   1186    You can set any parameter block to be constant using
   1187    :func:`Problem::SetParameterBlockConstant` and undo this using
   1188    :func:`SetParameterBlockVariable`.
   1189 
   In fact you can set any number of parameter blocks to be constant,
   and Ceres is smart enough to figure out which parts of the problem
   you have constructed depend on the parameter blocks that are free
   to change, and only spends time solving those. So, for example, if
   you constructed a problem with a million parameter blocks and 2
   million residual blocks, but then set all but one parameter block
   to be constant, with say only 10 residual blocks depending on this
   one non-constant parameter block, then the computational effort
   Ceres spends solving this problem will be the same as if you had
   defined a problem with one parameter block and 10 residual blocks.
   1200 
   1201    **Ownership**
   1202 
   1203    :class:`Problem` by default takes ownership of the
   1204    ``cost_function``, ``loss_function`` and ``local_parameterization``
   1205    pointers. These objects remain live for the life of the
   1206    :class:`Problem`. If the user wishes to keep control over the
   1207    destruction of these objects, then they can do this by setting the
   1208    corresponding enums in the :class:`Problem::Options` struct.
   1209 
   1210    Note that even though the Problem takes ownership of ``cost_function``
   1211    and ``loss_function``, it does not preclude the user from re-using
   1212    them in another residual block. The destructor takes care to call
   1213    delete on each ``cost_function`` or ``loss_function`` pointer only
   1214    once, regardless of how many residual blocks refer to them.
   1215 
   1216 .. function:: ResidualBlockId Problem::AddResidualBlock(CostFunction* cost_function, LossFunction* loss_function, const vector<double*> parameter_blocks)
   1217 
   1218    Add a residual block to the overall cost function. The cost
   1219    function carries with it information about the sizes of the
   1220    parameter blocks it expects. The function checks that these match
   1221    the sizes of the parameter blocks listed in parameter_blocks. The
   1222    program aborts if a mismatch is detected. loss_function can be
   1223    NULL, in which case the cost of the term is just the squared norm
   1224    of the residuals.
   1225 
   1226    The user has the option of explicitly adding the parameter blocks
   1227    using AddParameterBlock. This causes additional correctness
   1228    checking; however, AddResidualBlock implicitly adds the parameter
   1229    blocks if they are not present, so calling AddParameterBlock
   1230    explicitly is not required.
   1231 
   1232    The Problem object by default takes ownership of the
   1233    cost_function and loss_function pointers. These objects remain
   1234    live for the life of the Problem object. If the user wishes to
   1235    keep control over the destruction of these objects, then they can
   1236    do this by setting the corresponding enums in the Options struct.
   1237 
   1238    Note: Even though the Problem takes ownership of cost_function
   1239    and loss_function, it does not preclude the user from re-using
   1240    them in another residual block. The destructor takes care to call
   1241    delete on each cost_function or loss_function pointer only once,
   1242    regardless of how many residual blocks refer to them.
   1243 
   1244    Example usage:
   1245 
   1246    .. code-block:: c++
   1247 
   1248       double x1[] = {1.0, 2.0, 3.0};
   1249       double x2[] = {1.0, 2.0, 5.0, 6.0};
   1250       double x3[] = {3.0, 6.0, 2.0, 5.0, 1.0};
   1251 
   1252       Problem problem;
   1253 
   1254       problem.AddResidualBlock(new MyUnaryCostFunction(...), NULL, x1);
   1255       problem.AddResidualBlock(new MyBinaryCostFunction(...), NULL, x2, x1);
   1256 
   1257 
   1258 .. function:: void Problem::AddParameterBlock(double* values, int size, LocalParameterization* local_parameterization)
   1259 
   Add a parameter block with appropriate size and parameterization to
   the problem. Repeated calls with the same arguments are ignored.
   Repeated calls with the same double pointer but a different size
   results in undefined behavior.
   1264 
   1265 .. function:: void Problem::AddParameterBlock(double* values, int size)
   1266 
   Add a parameter block with appropriate size to the problem.
   Repeated calls with the same arguments are ignored. Repeated calls
   with the same double pointer but a different size results in
   undefined behavior.
   1271 
   1272 .. function:: void Problem::RemoveResidualBlock(ResidualBlockId residual_block)
   1273 
   Remove a residual block from the problem. Any parameters that the
   residual block depends on are not removed. The cost and loss
   functions for the residual block will not get deleted immediately;
   this will not happen until the problem itself is deleted.

   **WARNING:** Removing a residual or parameter block will destroy
   the implicit ordering, rendering the jacobian or residuals returned
   from the solver uninterpretable. If you depend on the evaluated
   jacobian, do not use remove! This may change in a future release.
   1284 
   1285 .. function:: void Problem::RemoveParameterBlock(double* values)
   1286 
   1287    Remove a parameter block from the problem. The parameterization of
   1288    the parameter block, if it exists, will persist until the deletion
   1289    of the problem (similar to cost/loss functions in residual block
   1290    removal). Any residual blocks that depend on the parameter are also
   1291    removed, as described above in RemoveResidualBlock().  If
   1292    Problem::Options::enable_fast_parameter_block_removal is true, then
   1293    the removal is fast (almost constant time). Otherwise, removing a
   1294    parameter block will incur a scan of the entire Problem object.
   1295 
   1296    **WARNING:** Removing a residual or parameter block will destroy
   1297    the implicit ordering, rendering the jacobian or residuals returned
   1298    from the solver uninterpretable. If you depend on the evaluated
   1299    jacobian, do not use remove! This may change in a future release.
   1300 
   1301 .. function:: void Problem::SetParameterBlockConstant(double* values)
   1302 
   1303    Hold the indicated parameter block constant during optimization.
   1304 
   1305 .. function:: void Problem::SetParameterBlockVariable(double* values)
   1306 
   1307    Allow the indicated parameter to vary during optimization.
   1308 
   1309 .. function:: void Problem::SetParameterization(double* values, LocalParameterization* local_parameterization)
   1310 
   1311    Set the local parameterization for one of the parameter blocks.
   1312    The local_parameterization is owned by the Problem by default. It
   1313    is acceptable to set the same parameterization for multiple
   1314    parameters; the destructor is careful to delete local
   1315    parameterizations only once. The local parameterization can only be
   1316    set once per parameter, and cannot be changed once set.
   1317 
   1318 .. function:: int Problem::NumParameterBlocks() const
   1319 
   1320    Number of parameter blocks in the problem. Always equals
   1321    parameter_blocks().size() and parameter_block_sizes().size().
   1322 
   1323 .. function:: int Problem::NumParameters() const
   1324 
   1325    The size of the parameter vector obtained by summing over the sizes
   1326    of all the parameter blocks.
   1327 
   1328 .. function:: int Problem::NumResidualBlocks() const
   1329 
   1330    Number of residual blocks in the problem. Always equals
   1331    residual_blocks().size().
   1332 
   1333 .. function:: int Problem::NumResiduals() const
   1334 
   1335    The size of the residual vector obtained by summing over the sizes
   1336    of all of the residual blocks.
   1337 
.. function:: int Problem::ParameterBlockSize(const double* values) const

   The size of the parameter block.
   1341 
.. function:: int Problem::ParameterBlockLocalSize(const double* values) const

   The size of the local parameterization for the parameter block. If
   there is no local parameterization associated with this parameter
   block, then ``ParameterBlockLocalSize`` = ``ParameterBlockSize``.
   1347 
   1348 
.. function:: void Problem::GetParameterBlocks(vector<double*>* parameter_blocks) const

   Fills the passed ``parameter_blocks`` vector with pointers to the
   parameter blocks currently in the problem. After this call,
   ``parameter_blocks->size() == NumParameterBlocks()``.
   1354 
.. function:: bool Problem::Evaluate(const Problem::EvaluateOptions& options, double* cost, vector<double>* residuals, vector<double>* gradient, CRSMatrix* jacobian)

   Evaluate a :class:`Problem`. Any of the output pointers can be
   `NULL`. Which residual blocks and parameter blocks are used is
   controlled by the :class:`Problem::EvaluateOptions` struct below.

   .. code-block:: c++

     Problem problem;
     double x = 1;
     problem.AddResidualBlock(new MyCostFunction, NULL, &x);

     double cost = 0.0;
     problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

   The cost is evaluated at `x = 1`. If you wish to evaluate the
   problem at `x = 2`, then

   .. code-block:: c++

      x = 2;
      problem.Evaluate(Problem::EvaluateOptions(), &cost, NULL, NULL, NULL);

   is the way to do so.

   **NOTE** If no local parameterizations are used, then the size of
   the gradient vector is the sum of the sizes of all the parameter
   blocks. If a parameter block has a local parameterization, then
   it contributes "LocalSize" entries to the gradient vector.

.. class:: Problem::EvaluateOptions

   Options struct that is used to control :func:`Problem::Evaluate`.

.. member:: vector<double*> Problem::EvaluateOptions::parameter_blocks

   The set of parameter blocks for which evaluation should be
   performed. This vector determines the order in which parameter
   blocks occur in the gradient vector and in the columns of the
   jacobian matrix. If parameter_blocks is empty, then it is assumed
   to be equal to a vector containing ALL the parameter
   blocks. Generally speaking the ordering of the parameter blocks in
   this case depends on the order in which they were added to the
   problem and whether or not the user removed any parameter blocks.

   **NOTE** This vector should contain the same pointers as the ones
   used to add parameter blocks to the Problem. These parameter blocks
   should NOT point to new memory locations. Bad things will happen if
   you do.

.. member:: vector<ResidualBlockId> Problem::EvaluateOptions::residual_blocks

   The set of residual blocks for which evaluation should be
   performed. This vector determines the order in which the residuals
   occur, and how the rows of the jacobian are ordered. If
   residual_blocks is empty, then it is assumed to be equal to the
   vector containing all the residual blocks.

``rotation.h``
--------------

Many applications of Ceres Solver involve optimization problems where
some of the variables correspond to rotations. To ease the pain of
working with the various representations of rotations (angle-axis,
quaternion and matrix) we provide a handy set of templated
functions. These functions are templated so that the user can use them
within Ceres Solver's automatic differentiation framework.

.. function:: void AngleAxisToQuaternion<T>(T const* angle_axis, T* quaternion)

   Convert a value in combined axis-angle representation to a
   quaternion.

   The value ``angle_axis`` is a triple whose norm is an angle in radians,
   and whose direction is aligned with the axis of rotation, and
   ``quaternion`` is a 4-tuple that will contain the resulting quaternion.

.. function:: void QuaternionToAngleAxis<T>(T const* quaternion, T* angle_axis)

   Convert a quaternion to the equivalent combined axis-angle
   representation.

   The value ``quaternion`` must be a unit quaternion - it is not
   normalized first, and ``angle_axis`` will be filled with a value
   whose norm is the angle of rotation in radians, and whose direction
   is the axis of rotation.

.. function:: void RotationMatrixToAngleAxis<T, row_stride, col_stride>(const MatrixAdapter<const T, row_stride, col_stride>& R, T * angle_axis)
.. function:: void AngleAxisToRotationMatrix<T, row_stride, col_stride>(T const * angle_axis, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void RotationMatrixToAngleAxis<T>(T const * R, T * angle_axis)
.. function:: void AngleAxisToRotationMatrix<T>(T const * angle_axis, T * R)

   Conversions between 3x3 rotation matrix with given column and row
   strides and axis-angle rotation representations. The functions that
   take a pointer to T instead of a MatrixAdapter assume a column major
   representation with unit row stride and a column stride of 3.

.. function:: void EulerAnglesToRotationMatrix<T, row_stride, col_stride>(const T* euler, const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void EulerAnglesToRotationMatrix<T>(const T* euler, int row_stride, T* R)

   Conversions between 3x3 rotation matrix with given column and row
   strides and Euler angle (in degrees) rotation representations.

   The {pitch,roll,yaw} Euler angles are rotations around the {x,y,z}
   axes, respectively.  They are applied in that same order, so the
   total rotation R is Rz * Ry * Rx.

   The function that takes a pointer to T as the rotation matrix assumes a row
   major representation with unit column stride and a row stride of 3.
   The additional parameter row_stride is required to be 3.

.. function:: void QuaternionToScaledRotation<T, row_stride, col_stride>(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void QuaternionToScaledRotation<T>(const T q[4], T R[3 * 3])

   Convert a 4-vector to a 3x3 scaled rotation matrix.

   The choice of rotation is such that the quaternion
   :math:`\begin{bmatrix} 1 &0 &0 &0\end{bmatrix}` goes to an identity
   matrix and for small :math:`a, b, c` the quaternion
   :math:`\begin{bmatrix}1 &a &b &c\end{bmatrix}` goes to the matrix

   .. math::

     I + 2 \begin{bmatrix} 0 & -c & b \\ c & 0 & -a\\ -b & a & 0
           \end{bmatrix} + O(q^2)

   which corresponds to a Rodrigues approximation, the last matrix
   being the cross-product matrix of :math:`\begin{bmatrix} a& b&
   c\end{bmatrix}`. Together with the property that :math:`R(q_1 q_2) =
   R(q_1) R(q_2)` this uniquely defines the mapping from :math:`q` to
   :math:`R`.

   In the function that accepts a pointer to T instead of a MatrixAdapter,
   the rotation matrix ``R`` is a row-major matrix with unit column stride
   and a row stride of 3.

   No normalization of the quaternion is performed, i.e.
   :math:`R = \|q\|^2  Q`, where :math:`Q` is an orthonormal matrix
   such that :math:`\det(Q) = 1` and :math:`Q Q' = I`.


.. function:: void QuaternionToRotation<T, row_stride, col_stride>(const T q[4], const MatrixAdapter<T, row_stride, col_stride>& R)
.. function:: void QuaternionToRotation<T>(const T q[4], T R[3 * 3])

   Same as above except that the rotation matrix is normalized by the
   Frobenius norm, so that :math:`R R' = I` (and :math:`\det(R) = 1`).

.. function:: void UnitQuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   Rotates a point ``pt`` by a quaternion ``q``:

   .. math:: \text{result} = R(q)  \text{pt}

   Assumes the quaternion is unit norm. If you pass in a quaternion
   with :math:`\|q\|^2 = 2` then you WILL NOT get back 2 times the
   result you get for a unit quaternion.


.. function:: void QuaternionRotatePoint<T>(const T q[4], const T pt[3], T result[3])

   With this function you do not need to assume that ``q`` has unit norm.
   It does assume that the norm is non-zero.

.. function:: void QuaternionProduct<T>(const T z[4], const T w[4], T zw[4])

   .. math:: zw = z * w

   where :math:`*` is the quaternion product between 4-vectors.


.. function:: void CrossProduct<T>(const T x[3], const T y[3], T x_cross_y[3])

   .. math:: \text{x\_cross\_y} = x \times y

.. function:: void AngleAxisRotatePoint<T>(const T angle_axis[3], const T pt[3], T result[3])

   .. math:: \text{result} = R(\text{angle\_axis}) \, \text{pt}
   1531