.. default-domain:: cpp

.. cpp:namespace:: ceres

.. _chapter-solving:

=======
Solving
=======

Introduction
============

Effective use of Ceres requires some familiarity with the basic
components of a nonlinear least squares solver, so before we describe
how to configure and use the solver, we will take a brief look at how
some of the core optimization algorithms in Ceres work.

Let :math:`x \in \mathbb{R}^n` be an :math:`n`-dimensional vector of
variables, and
:math:`F(x) = \left[f_1(x), ... , f_{m}(x) \right]^{\top}` be a
:math:`m`-dimensional function of :math:`x`. We are interested in
solving the following optimization problem [#f1]_ .

.. math::
   :label: nonlinsq

   \arg \min_x & \frac{1}{2}\|F(x)\|^2 \\
   \text{such that} & \quad L \le x \le U

where :math:`L` and :math:`U` are lower and upper bounds on the
parameter vector :math:`x`.

Since the efficient global minimization of :eq:`nonlinsq` for
general :math:`F(x)` is an intractable problem, we will have to settle
for finding a local minimum.

In the following, the Jacobian :math:`J(x)` of :math:`F(x)` is an
:math:`m\times n` matrix, where :math:`J_{ij}(x) = \partial_j f_i(x)`
and the gradient vector is :math:`g(x) = \nabla \frac{1}{2}\|F(x)\|^2
= J(x)^\top F(x)`.

The general strategy when solving non-linear optimization problems is
to solve a sequence of approximations to the original problem
[NocedalWright]_. At each iteration, the approximation is solved to
determine a correction :math:`\Delta x` to the vector :math:`x`. For
non-linear least squares, an approximation can be constructed by using
the linearization :math:`F(x+\Delta x) \approx F(x) + J(x)\Delta x`,
which leads to the following linear least squares problem:

.. math:: \min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2
   :label: linearapprox

Unfortunately, naively solving a sequence of these problems and
updating :math:`x \leftarrow x+ \Delta x` leads to an algorithm that
may not converge. To get a convergent algorithm, we need to control
the size of the step :math:`\Delta x`. Depending on how the size of
the step :math:`\Delta x` is controlled, non-linear optimization
algorithms can be divided into two major categories [NocedalWright]_.

1. **Trust Region** The trust region approach approximates the
   objective function using a model function (often a quadratic) over
   a subset of the search space known as the trust region. If the
   model function succeeds in minimizing the true objective function,
   the trust region is expanded; otherwise it is contracted and the
   model optimization problem is solved again.

2. **Line Search** The line search approach first finds a descent
   direction along which the objective function will be reduced and
   then computes a step size that decides how far one should move
   along that direction. The descent direction can be computed by
   various methods, such as gradient descent, Newton's method and
   quasi-Newton methods. The step size can be determined either
   exactly or inexactly.

Trust region methods are in some sense dual to line search methods:
trust region methods first choose a step size (the size of the trust
region) and then a step direction, while line search methods first
choose a step direction and then a step size. Ceres implements
multiple algorithms in both categories.
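In code, this choice is made through :class:`Solver::Options`. A
minimal sketch (``ConfigureMinimizer`` is a hypothetical helper; the
construction of the problem and the remaining options are elided):

.. code-block:: c++

   #include "ceres/ceres.h"

   void ConfigureMinimizer(ceres::Solver::Options* options) {
     // Pick the family of algorithms; TRUST_REGION is the default.
     options->minimizer_type = ceres::TRUST_REGION;
     // For an unconstrained problem, a line search minimizer could be
     // used instead:
     // options->minimizer_type = ceres::LINE_SEARCH;
   }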
.. _section-trust-region-methods:

Trust Region Methods
====================

The basic trust region algorithm looks something like this.

1. Given an initial point :math:`x` and a trust region radius :math:`\mu`.
2. Solve

   .. math::
      \arg \min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
      \text{such that} &\|D(x)\Delta x\|^2 \le \mu\\
      &L \le x + \Delta x \le U.

3. :math:`\rho = \frac{\displaystyle \|F(x + \Delta x)\|^2 -
   \|F(x)\|^2}{\displaystyle \|J(x)\Delta x + F(x)\|^2 -
   \|F(x)\|^2}`
4. if :math:`\rho > \epsilon` then :math:`x = x + \Delta x`.
5. if :math:`\rho > \eta_1` then :math:`\mu = 2 \mu`
6. else if :math:`\rho < \eta_2` then :math:`\mu = 0.5 * \mu`
7. Go to 2.

Here, :math:`\mu` is the trust region radius, :math:`D(x)` is some
matrix used to define a metric on the domain of :math:`F(x)` and
:math:`\rho` measures the quality of the step :math:`\Delta x`, i.e.,
how well the linear model predicted the decrease in the value of the
non-linear objective. The idea is to increase or decrease the radius
of the trust region depending on how well the linearization predicts
the behavior of the non-linear objective, which in turn is reflected
in the value of :math:`\rho`.

The key computational step in a trust-region algorithm is the solution
of the constrained optimization problem

.. math::
   :label: trp

   \arg \min_{\Delta x}& \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 \\
   \text{such that} &\|D(x)\Delta x\|^2 \le \mu\\
   &L \le x + \Delta x \le U.

There are a number of different ways of solving this problem, each
giving rise to a different concrete trust-region algorithm. Currently,
Ceres implements two trust-region algorithms - Levenberg-Marquardt
and Dogleg, each of which is augmented with a line search if bounds
constraints are present [Kanzow]_. The user can choose between them by
setting :member:`Solver::Options::trust_region_strategy_type`.

.. rubric:: Footnotes

.. [#f1] At the level of the non-linear solver, the block structure is
   not relevant, therefore our discussion here is in terms of an
   optimization problem defined over a state vector of size
   :math:`n`. Similarly the presence of loss functions is also
   ignored as the problem is internally converted into a pure
   non-linear least squares problem.


.. _section-levenberg-marquardt:

Levenberg-Marquardt
-------------------

The Levenberg-Marquardt algorithm [Levenberg]_ [Marquardt]_ is the
most popular algorithm for solving non-linear least squares problems.
It was also the first trust region algorithm to be developed
[Levenberg]_ [Marquardt]_. Ceres implements an exact step [Madsen]_
and an inexact step variant of the Levenberg-Marquardt algorithm
[WrightHolt]_ [NashSofer]_.

It can be shown that the solution to :eq:`trp` can be obtained by
solving an unconstrained optimization problem of the form

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 +\lambda \|D(x)\Delta x\|^2

where :math:`\lambda` is a Lagrange multiplier that is inversely
related to :math:`\mu`. In Ceres, we solve for

.. math:: \arg\min_{\Delta x} \frac{1}{2}\|J(x)\Delta x + F(x)\|^2 + \frac{1}{\mu} \|D(x)\Delta x\|^2
   :label: lsqr

The matrix :math:`D(x)` is a non-negative diagonal matrix, typically
the square root of the diagonal of the matrix :math:`J(x)^\top J(x)`.
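To make :eq:`lsqr` concrete, the following illustrative sketch (not
Ceres' internal implementation) solves the corresponding regularized
normal equations for a small dense problem using ``Eigen``; the
Jacobian, the residual vector and :math:`\mu` are assumed to be given.

.. code-block:: c++

   #include "Eigen/Dense"

   // Solve (J^T J + D^T D / mu) dx = -J^T f with D = sqrt(diag(J^T J)),
   // i.e. the regularized linear least squares problem above written in
   // normal equations form. Illustrative only; Ceres uses specialized
   // linear solvers for this step.
   Eigen::VectorXd LevenbergMarquardtStep(const Eigen::MatrixXd& jacobian,
                                          const Eigen::VectorXd& residuals,
                                          double mu) {
     const Eigen::MatrixXd jtj = jacobian.transpose() * jacobian;
     Eigen::MatrixXd lhs = jtj;
     lhs.diagonal() += jtj.diagonal() / mu;  // D^T D = diag(J^T J).
     const Eigen::VectorXd rhs = -jacobian.transpose() * residuals;
     return lhs.ldlt().solve(rhs);
   }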
Before going further, let us make some notational simplifications. We
will assume that the matrix :math:`\frac{1}{\sqrt{\mu}} D` has been
concatenated at the bottom of the matrix :math:`J` and similarly a
vector of zeros has been added to the bottom of the vector :math:`f`
and the rest of our discussion will be in terms of :math:`J` and
:math:`f`, i.e., the linear least squares problem.

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
   :label: simple

For all but the smallest problems the solution of :eq:`simple` in
each iteration of the Levenberg-Marquardt algorithm is the dominant
computational cost in Ceres. Ceres provides a number of different
options for solving :eq:`simple`. There are two major classes of
methods - factorization and iterative.

The factorization methods are based on computing an exact solution of
:eq:`lsqr` using a Cholesky or a QR factorization and lead to an exact
step Levenberg-Marquardt algorithm. But it is not clear if an exact
solution of :eq:`lsqr` is necessary at each step of the LM algorithm
to solve :eq:`nonlinsq`. In fact, we have already seen evidence
that this may not be the case, as :eq:`lsqr` is itself a regularized
version of :eq:`linearapprox`. Indeed, it is possible to
construct non-linear optimization algorithms in which the linearized
problem is solved approximately. These algorithms are known as inexact
Newton or truncated Newton methods [NocedalWright]_.

An inexact Newton method requires two ingredients. First, a cheap
method for approximately solving systems of linear
equations. Typically an iterative linear solver like the Conjugate
Gradients method is used for this
purpose [NocedalWright]_. Second, a termination rule for
the iterative solver. A typical termination rule is of the form

.. math:: \|H(x) \Delta x + g(x)\| \leq \eta_k \|g(x)\|.
   :label: inexact

Here, :math:`k` indicates the Levenberg-Marquardt iteration number and
:math:`0 < \eta_k <1` is known as the forcing sequence. [WrightHolt]_
prove that a truncated Levenberg-Marquardt algorithm that uses an
inexact Newton step based on :eq:`inexact` converges for any
sequence :math:`\eta_k \leq \eta_0 < 1` and the rate of convergence
depends on the choice of the forcing sequence :math:`\eta_k`.

Ceres supports both exact and inexact step solution strategies. When
the user chooses a factorization based linear solver, the exact step
Levenberg-Marquardt algorithm is used. When the user chooses an
iterative linear solver, the inexact step Levenberg-Marquardt
algorithm is used.

.. _section-dogleg:

Dogleg
------

Another strategy for solving the trust region problem :eq:`trp` was
introduced by M. J. D. Powell. The key idea there is to compute two
vectors

.. math::

   \Delta x^{\text{Gauss-Newton}} &= \arg \min_{\Delta x}\frac{1}{2} \|J(x)\Delta x + f(x)\|^2.\\
   \Delta x^{\text{Cauchy}} &= -\frac{\|g(x)\|^2}{\|J(x)g(x)\|^2}g(x).

Note that the vector :math:`\Delta x^{\text{Gauss-Newton}}` is the
solution to :eq:`linearapprox` and :math:`\Delta
x^{\text{Cauchy}}` is the vector that minimizes the linear
approximation if we restrict ourselves to moving along the direction
of the gradient. Dogleg methods find a vector :math:`\Delta x`
defined by :math:`\Delta x^{\text{Gauss-Newton}}` and :math:`\Delta
x^{\text{Cauchy}}` that solves the trust region problem. Ceres
supports two variants that can be chosen by setting
:member:`Solver::Options::dogleg_type`.

``TRADITIONAL_DOGLEG`` as described by Powell, constructs two line
segments using the Gauss-Newton and Cauchy vectors and finds the point
farthest along this line, shaped like a dogleg (hence the name), that
is contained in the trust region. For more details on the exact
reasoning and computations, please see Madsen et al [Madsen]_.

``SUBSPACE_DOGLEG`` is a more sophisticated method that considers the
entire two dimensional subspace spanned by these two vectors and finds
the point that minimizes the trust region problem in this subspace
[ByrdSchnabel]_.

The key advantage of Dogleg over Levenberg-Marquardt is that if the
step computation for a particular choice of :math:`\mu` does not
result in sufficient decrease in the value of the objective function,
Levenberg-Marquardt solves the linear approximation from scratch with
a smaller value of :math:`\mu`. Dogleg on the other hand, only needs
to compute the interpolation between the Gauss-Newton and the Cauchy
vectors, as neither of them depends on the value of :math:`\mu`.

The Dogleg method can only be used with the exact factorization based
linear solvers.

.. _section-inner-iterations:

Inner Iterations
----------------

Some non-linear least squares problems have additional structure in
the way the parameter blocks interact, such that it is beneficial to
modify the way the trust region step is computed. For example,
consider the following regression problem

.. math:: y = a_1 e^{b_1 x} + a_2 e^{b_2 x^2 + c_1}


Given a set of pairs :math:`\{(x_i, y_i)\}`, the user wishes to estimate
:math:`a_1, a_2, b_1, b_2`, and :math:`c_1`.

Notice that the expression on the right hand side is linear in
:math:`a_1` and :math:`a_2`, and given any value for :math:`b_1, b_2`
and :math:`c_1`, it is possible to use linear regression to estimate
the optimal values of :math:`a_1` and :math:`a_2`. It is possible to
analytically eliminate the variables :math:`a_1` and :math:`a_2` from
the problem entirely. Problems like these are known as separable least
squares problems and the most famous algorithm for solving them is the
Variable Projection algorithm invented by Golub & Pereyra
[GolubPereyra]_.

Similar structure can be found in the matrix factorization with
missing data problem. There the corresponding algorithm is known as
Wiberg's algorithm [Wiberg]_.

Ruhe & Wedin present an analysis of various algorithms for solving
separable non-linear least squares problems and refer to *Variable
Projection* as Algorithm I in their paper [RuheWedin]_.

Implementing Variable Projection is tedious and expensive. Ruhe &
Wedin present a simpler algorithm with comparable convergence
properties, which they call Algorithm II. Algorithm II performs an
additional optimization step to estimate :math:`a_1` and :math:`a_2`
exactly after computing a successful Newton step.


This idea can be generalized to cases where the residual is not
linear in :math:`a_1` and :math:`a_2`, i.e.,
.. math:: y = f_1(a_1, e^{b_1 x}) + f_2(a_2, e^{b_2 x^2 + c_1})

In this case, we solve for the trust region step for the full problem,
and then use it as the starting point to further optimize just
:math:`a_1` and :math:`a_2`. For the linear case, this amounts to
doing a single linear least squares solve. For non-linear problems,
any method for solving the :math:`a_1` and :math:`a_2` optimization
problems will do. The only constraint on :math:`a_1` and :math:`a_2`
(if they are two different parameter blocks) is that they do not
co-occur in a residual block.

This idea can be further generalized, by not just optimizing
:math:`(a_1, a_2)`, but decomposing the graph corresponding to the
Hessian matrix's sparsity structure into a collection of
non-overlapping independent sets and optimizing each of them.

Setting :member:`Solver::Options::use_inner_iterations` to ``true``
enables the use of this non-linear generalization of Ruhe & Wedin's
Algorithm II. This version of Ceres has a higher iteration
complexity, but also displays better convergence behavior per
iteration.

Setting :member:`Solver::Options::num_threads` to the maximum number
possible is highly recommended.

.. _section-non-monotonic-steps:

Non-monotonic Steps
-------------------

Note that the basic trust-region algorithm described in
:ref:`section-trust-region-methods` is a descent algorithm in that it
only accepts a point if it strictly reduces the value of the objective
function.

Relaxing this requirement allows the algorithm to be more efficient in
the long term at the cost of some local increase in the value of the
objective function.

This is because allowing for non-decreasing objective function values
in a principled manner allows the algorithm to *jump over boulders* as
the method is not restricted to move into narrow valleys while
preserving its convergence properties.

Setting :member:`Solver::Options::use_nonmonotonic_steps` to ``true``
enables the non-monotonic trust region algorithm as described by Conn,
Gould & Toint in [Conn]_.

Even though the value of the objective function may be larger
than the minimum value encountered over the course of the
optimization, the final parameters returned to the user are the
ones corresponding to the minimum cost over all iterations.

The option to take non-monotonic steps is available for all trust
region strategies.


.. _section-line-search-methods:

Line Search Methods
===================

The line search method in Ceres Solver cannot handle bounds
constraints right now, so it can only be used for solving
unconstrained problems.

The basic line search algorithm looks something like this.

1. Given an initial point :math:`x`
2. :math:`\Delta x = -H^{-1}(x) g(x)`
3. :math:`\arg \min_\mu \frac{1}{2} \| F(x + \mu \Delta x) \|^2`
4. :math:`x = x + \mu \Delta x`
5. Goto 2.

Here :math:`H(x)` is some approximation to the Hessian of the
objective function, and :math:`g(x)` is the gradient at
:math:`x`. Depending on the choice of :math:`H(x)` we get a variety of
different search directions :math:`\Delta x`.

Step 3, which is a one dimensional optimization or *line search* along
:math:`\Delta x`, is what gives this class of methods its name.
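A line search minimizer is selected and configured through
:class:`Solver::Options`. A minimal sketch (``ConfigureLineSearch`` is
a hypothetical helper; the direction and line search types used here
are described below):

.. code-block:: c++

   #include "ceres/ceres.h"

   void ConfigureLineSearch(ceres::Solver::Options* options) {
     options->minimizer_type = ceres::LINE_SEARCH;
     options->line_search_direction_type = ceres::LBFGS;  // The default.
     options->line_search_type = ceres::WOLFE;  // Recommended for (L)BFGS.
   }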
Different line search algorithms differ in their choice of the search
direction :math:`\Delta x` and the method used for one dimensional
optimization along :math:`\Delta x`. The choice of :math:`H(x)` is the
primary source of computational complexity in these
methods. Currently, Ceres Solver supports four choices of search
directions, all aimed at large scale problems.

1. ``STEEPEST_DESCENT`` This corresponds to choosing :math:`H(x)` to
   be the identity matrix. This is not a good search direction for
   anything but the simplest of the problems. It is only included here
   for completeness.

2. ``NONLINEAR_CONJUGATE_GRADIENT`` A generalization of the Conjugate
   Gradient method to non-linear functions. The generalization can be
   performed in a number of different ways, resulting in a variety of
   search directions. Ceres Solver currently supports
   ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and ``HESTENES_STIEFEL``
   directions.

3. ``BFGS`` A generalization of the Secant method to multiple
   dimensions in which a full, dense approximation to the inverse
   Hessian is maintained and used to compute a quasi-Newton step
   [NocedalWright]_. BFGS is currently the best known general
   quasi-Newton algorithm.

4. ``LBFGS`` A limited memory approximation to the full ``BFGS``
   method in which the last :math:`M` iterations are used to
   approximate the inverse Hessian used to compute a quasi-Newton step
   [Nocedal]_, [ByrdNocedal]_.

Currently Ceres Solver supports both a backtracking and interpolation
based Armijo line search algorithm, and a sectioning / zoom
interpolation (strong) Wolfe condition line search algorithm.
However, note that in order for the assumptions underlying the
``BFGS`` and ``LBFGS`` methods to be guaranteed to be satisfied, the
Wolfe line search algorithm should be used.

.. _section-linear-solver:

LinearSolver
============

Recall that in both of the trust-region methods described above, the
key computational cost is the solution of a linear least squares
problem of the form

.. math:: \min_{\Delta x} \frac{1}{2} \|J(x)\Delta x + f(x)\|^2 .
   :label: simple2

Let :math:`H(x)= J(x)^\top J(x)` and :math:`g(x) = -J(x)^\top
f(x)`. For notational convenience let us also drop the dependence on
:math:`x`. Then it is easy to see that solving :eq:`simple2` is
equivalent to solving the *normal equations*.

.. math:: H \Delta x = g
   :label: normal

Ceres provides a number of different options for solving :eq:`normal`.

.. _section-qr:

``DENSE_QR``
------------

For small problems (a couple of hundred parameters and a few thousand
residuals) with relatively dense Jacobians, ``DENSE_QR`` is the method
of choice [Bjorck]_. Let :math:`J = QR` be the QR-decomposition of
:math:`J`, where :math:`Q` is an orthonormal matrix and :math:`R` is
an upper triangular matrix [TrefethenBau]_. Then it can be shown that
the solution to :eq:`normal` is given by

.. math:: \Delta x^* = -R^{-1}Q^\top f
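To see this, substitute :math:`J = QR` into :eq:`normal` and use
:math:`Q^\top Q = I`:

.. math:: J^\top J \Delta x = -J^\top f
   \;\Rightarrow\; R^\top Q^\top Q R \Delta x = -R^\top Q^\top f
   \;\Rightarrow\; R \Delta x = -Q^\top f
   \;\Rightarrow\; \Delta x^* = -R^{-1} Q^\top f.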
Ceres uses ``Eigen``'s dense QR factorization routines.

.. _section-cholesky:

``DENSE_NORMAL_CHOLESKY`` & ``SPARSE_NORMAL_CHOLESKY``
------------------------------------------------------

Large non-linear least squares problems are usually sparse. In such
cases, using a dense QR factorization is inefficient. Let :math:`H =
R^\top R` be the Cholesky factorization of the normal equations, where
:math:`R` is an upper triangular matrix, then the solution to
:eq:`normal` is given by

.. math::

   \Delta x^* = R^{-1} R^{-\top} g.


The observant reader will note that the :math:`R` in the Cholesky
factorization of :math:`H` is the same upper triangular matrix
:math:`R` in the QR factorization of :math:`J`. Since :math:`Q` is an
orthonormal matrix, :math:`J=QR` implies that :math:`J^\top J = R^\top
Q^\top Q R = R^\top R`. There are two variants of Cholesky
factorization -- sparse and dense.

``DENSE_NORMAL_CHOLESKY``, as the name implies, performs a dense
Cholesky factorization of the normal equations. Ceres uses
``Eigen``'s dense LDLT factorization routines.

``SPARSE_NORMAL_CHOLESKY``, as the name implies, performs a sparse
Cholesky factorization of the normal equations. This leads to
substantial savings in time and memory for large sparse
problems. Ceres uses the sparse Cholesky factorization routines in
Professor Tim Davis' ``SuiteSparse`` or ``CXSparse`` packages [Chen]_
or the sparse Cholesky factorization algorithm in ``Eigen`` (which
incidentally is a port of the algorithm implemented inside
``CXSparse``).

.. _section-schur:

``DENSE_SCHUR`` & ``SPARSE_SCHUR``
----------------------------------

While it is possible to use ``SPARSE_NORMAL_CHOLESKY`` to solve bundle
adjustment problems, bundle adjustment problems have a special
structure, and a more efficient scheme for solving :eq:`normal`
can be constructed.

Suppose that the SfM problem consists of :math:`p` cameras and
:math:`q` points and the variable vector :math:`x` has the block
structure :math:`x = [y_{1}, ... ,y_{p},z_{1}, ... ,z_{q}]`, where
:math:`y` and :math:`z` correspond to camera and point parameters,
respectively. Further, let the camera blocks be of size :math:`c` and
the point blocks be of size :math:`s` (for most problems :math:`c` =
:math:`6`--:math:`9` and :math:`s = 3`). Ceres does not impose any
constancy requirement on these block sizes, but choosing them to be
constant simplifies the exposition.

A key characteristic of the bundle adjustment problem is that there is
no term :math:`f_{i}` that includes two or more point blocks. This in
turn implies that the matrix :math:`H` is of the form

.. math:: H = \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix} \right]\ ,
   :label: hblock

where :math:`B \in \mathbb{R}^{pc\times pc}` is a block sparse matrix
with :math:`p` blocks of size :math:`c\times c` and :math:`C \in
\mathbb{R}^{qs\times qs}` is a block diagonal matrix with :math:`q` blocks
of size :math:`s\times s`. :math:`E \in \mathbb{R}^{pc\times qs}` is a
general block sparse matrix, with a block of size :math:`c\times s`
for each observation. Let us now block partition :math:`\Delta x =
[\Delta y,\Delta z]` and :math:`g=[v,w]` to restate :eq:`normal`
as the block structured linear system

.. math::
   :label: linear2

   \left[ \begin{matrix} B & E\\ E^\top & C \end{matrix}
   \right]\left[ \begin{matrix} \Delta y \\ \Delta z
   \end{matrix} \right] = \left[ \begin{matrix} v\\ w
   \end{matrix} \right]\ ,

and apply Gaussian elimination to it. As we noted above, :math:`C` is
a block diagonal matrix, with small diagonal blocks of size
:math:`s\times s`. Thus, calculating the inverse of :math:`C` by
inverting each of these blocks is cheap. This allows us to eliminate
:math:`\Delta z` by observing that :math:`\Delta z = C^{-1}(w - E^\top
\Delta y)`, giving us

.. math:: \left[B - EC^{-1}E^\top\right] \Delta y = v - EC^{-1}w\ .
   :label: schur

The matrix

.. math:: S = B - EC^{-1}E^\top

is the Schur complement of :math:`C` in :math:`H`. It is also known as
the *reduced camera matrix*, because the only variables
participating in :eq:`schur` are the ones corresponding to the
cameras. :math:`S \in \mathbb{R}^{pc\times pc}` is a block structured
symmetric positive definite matrix, with blocks of size :math:`c\times
c`. The block :math:`S_{ij}` corresponding to the pair of images
:math:`i` and :math:`j` is non-zero if and only if the two images
observe at least one common point.


Now, :eq:`linear2` can be solved by first forming :math:`S`, solving for
:math:`\Delta y`, and then back-substituting :math:`\Delta y` to
obtain the value of :math:`\Delta z`. Thus, the solution of what was
an :math:`n\times n`, :math:`n=pc+qs` linear system is reduced to the
inversion of the block diagonal matrix :math:`C`, a few matrix-matrix
and matrix-vector multiplies, and the solution of the block sparse
:math:`pc\times pc` linear system :eq:`schur`. For almost all
problems, the number of cameras is much smaller than the number of
points, :math:`p \ll q`, thus solving :eq:`schur` is
significantly cheaper than solving :eq:`linear2`. This is the
*Schur complement trick* [Brown]_.

This still leaves open the question of solving :eq:`schur`. The
method of choice for solving symmetric positive definite systems
exactly is via the Cholesky factorization [TrefethenBau]_ and
depending upon the structure of the matrix, there are, in general, two
options. The first is direct factorization, where we store and factor
:math:`S` as a dense matrix [TrefethenBau]_. This method has
:math:`O(p^2)` space complexity and :math:`O(p^3)` time complexity and
is only practical for problems with up to a few hundred cameras. Ceres
implements this strategy as the ``DENSE_SCHUR`` solver.


But, :math:`S` is typically a fairly sparse matrix, as most images
only see a small fraction of the scene. This leads us to the second
option: Sparse Direct Methods. These methods store :math:`S` as a
sparse matrix, use row and column re-ordering algorithms to maximize
the sparsity of the Cholesky decomposition, and focus their compute
effort on the non-zero part of the factorization [Chen]_. Sparse
direct methods, depending on the exact sparsity structure of the Schur
complement, allow bundle adjustment algorithms to significantly scale
up over those based on dense factorization. Ceres implements this
strategy as the ``SPARSE_SCHUR`` solver.

.. _section-cgnr:

``CGNR``
--------

For general sparse problems, if the problem is too large for
``CHOLMOD`` or a sparse linear algebra library is not linked into
Ceres, another option is the ``CGNR`` solver. This solver uses the
Conjugate Gradients solver on the *normal equations*, but without
forming the normal equations explicitly. It exploits the relation

.. math::
   H x = J^\top J x = J^\top(J x)
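For illustration (this is a sketch of the idea, not Ceres' ``CGNR``
implementation), the matrix-vector products needed by Conjugate
Gradients can be evaluated without ever forming :math:`H`:

.. code-block:: c++

   #include "Eigen/Dense"

   // Evaluate H * x = J^T (J * x) without forming H = J^T J explicitly.
   // For a sparse J the same idea applies with Eigen::SparseMatrix<double>.
   Eigen::VectorXd HessianVectorProduct(const Eigen::MatrixXd& jacobian,
                                        const Eigen::VectorXd& x) {
     const Eigen::VectorXd jx = jacobian * x;  // J x: one matrix-vector product.
     return jacobian.transpose() * jx;         // J^T (J x): a second one.
   }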
When the user chooses ``CGNR`` as the linear solver, Ceres
automatically switches from the exact step algorithm to an inexact
step algorithm.

.. _section-iterative_schur:

``ITERATIVE_SCHUR``
-------------------

Another option for bundle adjustment problems is to apply PCG to the
reduced camera matrix :math:`S` instead of :math:`H`. One reason to do
this is that :math:`S` is a much smaller matrix than :math:`H`, but
more importantly, it can be shown that :math:`\kappa(S)\leq
\kappa(H)`. Ceres implements PCG on :math:`S` as the
``ITERATIVE_SCHUR`` solver. When the user chooses ``ITERATIVE_SCHUR``
as the linear solver, Ceres automatically switches from the exact step
algorithm to an inexact step algorithm.

The cost of forming and storing the Schur complement :math:`S` can be
prohibitive for large problems. Indeed, for an inexact Newton solver
that computes :math:`S` and runs PCG on it, almost all of its time is
spent in constructing :math:`S`; the time spent inside the PCG
algorithm is negligible in comparison. Because PCG only needs access
to :math:`S` via its product with a vector, one way to evaluate
:math:`Sx` is to observe that

.. math::
   :label: schurtrick1

   x_1 &= E^\top x\\
   x_2 &= C^{-1} x_1\\
   x_3 &= Ex_2\\
   x_4 &= Bx\\
   Sx &= x_4 - x_3

Thus, we can run PCG on :math:`S` with the same computational effort
per iteration as PCG on :math:`H`, while reaping the benefits of a
more powerful preconditioner. In fact, we do not even need to compute
:math:`H`, :eq:`schurtrick1` can be implemented using just the columns
of :math:`J`.

Equation :eq:`schurtrick1` is closely related to *Domain
Decomposition methods* for solving large linear systems that arise in
structural engineering and partial differential equations. In the
language of Domain Decomposition, each point in a bundle adjustment
problem is a domain, and the cameras form the interface between these
domains. The iterative solution of the Schur complement then falls
within the sub-category of techniques known as Iterative
Sub-structuring [Saad]_ [Mathew]_.

.. _section-preconditioner:

Preconditioner
--------------

The convergence rate of Conjugate Gradients for
solving :eq:`normal` depends on the distribution of eigenvalues
of :math:`H` [Saad]_. A useful upper bound is
:math:`\sqrt{\kappa(H)}`, where :math:`\kappa(H)` is the condition
number of the matrix :math:`H`. For most bundle adjustment problems,
:math:`\kappa(H)` is high and a direct application of Conjugate
Gradients to :eq:`normal` results in extremely poor performance.

The solution to this problem is to replace :eq:`normal` with a
*preconditioned* system. Given a linear system, :math:`Ax =b` and a
preconditioner :math:`M` the preconditioned system is given by
:math:`M^{-1}Ax = M^{-1}b`. The resulting algorithm is known as the
Preconditioned Conjugate Gradients algorithm (PCG) and its worst case
complexity now depends on the condition number of the *preconditioned*
matrix :math:`\kappa(M^{-1}A)`.

The computational cost of using a preconditioner :math:`M` is the cost
of computing :math:`M` and evaluating the product :math:`M^{-1}y` for
arbitrary vectors :math:`y`. Thus, there are two competing factors to
consider: how much of :math:`H`'s structure is captured by :math:`M`
so that the condition number :math:`\kappa(HM^{-1})` is low, and the
computational cost of constructing and using :math:`M`. The ideal
preconditioner would be one for which :math:`\kappa(M^{-1}A)
=1`. :math:`M=A` achieves this, but it is not a practical choice, as
applying this preconditioner would require solving a linear system
equivalent to the unpreconditioned problem. It is usually the case
that the more information :math:`M` has about :math:`H`, the more
expensive it is to use. For example, Incomplete Cholesky factorization
based preconditioners have much better convergence behavior than the
Jacobi preconditioner, but are also much more expensive.


The simplest of all preconditioners is the diagonal or Jacobi
preconditioner, i.e., :math:`M=\operatorname{diag}(A)`, which for
block structured matrices like :math:`H` can be generalized to the
block Jacobi preconditioner.

For ``ITERATIVE_SCHUR`` there are two obvious choices for block
diagonal preconditioners for :math:`S`. The block diagonal of the
matrix :math:`B` [Mandel]_ and the block diagonal of :math:`S`, i.e.,
the block Jacobi preconditioner for :math:`S`. Ceres implements both
of these preconditioners and refers to them as ``JACOBI`` and
``SCHUR_JACOBI`` respectively.

For bundle adjustment problems arising in reconstruction from
community photo collections, more effective preconditioners can be
constructed by analyzing and exploiting the camera-point visibility
structure of the scene [KushalAgarwal]_. Ceres implements the two
visibility based preconditioners described by Kushal & Agarwal as
``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. These are fairly new
preconditioners and Ceres' implementation of them is in its early
stages and is not as mature as the other preconditioners described
above.

.. _section-ordering:

Ordering
--------

The order in which variables are eliminated in a linear solver can
have a significant impact on the efficiency and accuracy of the
method. For example, when doing sparse Cholesky factorization, there
are matrices for which a good ordering will give a Cholesky factor
with :math:`O(n)` storage, whereas a bad ordering will result in a
completely dense factor.

Ceres allows the user to provide varying amounts of hints to the
solver about the variable elimination ordering to use. This can range
from no hints, where the solver is free to decide the best ordering
based on the user's choices like the linear solver being used, to an
exact order in which the variables should be eliminated, and a variety
of possibilities in between.

Instances of the :class:`ParameterBlockOrdering` class are used to
communicate this information to Ceres.

Formally an ordering is an ordered partitioning of the parameter
blocks. Each parameter block belongs to exactly one group, and each
group has a unique integer associated with it, that determines its
order in the set of groups. We call these groups *Elimination Groups*.

Given such an ordering, Ceres ensures that the parameter blocks in the
lowest numbered elimination group are eliminated first, and then the
parameter blocks in the next lowest numbered elimination group and so
on. Within each elimination group, Ceres is free to order the
parameter blocks as it chooses.
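For instance, the elimination ordering used by the Schur based solvers
on a bundle adjustment problem (discussed further below) could be
specified as follows. This is a hypothetical sketch: ``SetOrdering``,
``point_blocks`` and ``camera_blocks`` are assumed names, the latter
two being containers of pointers to parameter blocks that have already
been added to the problem.

.. code-block:: c++

   #include <vector>
   #include "ceres/ceres.h"

   void SetOrdering(const std::vector<double*>& point_blocks,
                    const std::vector<double*>& camera_blocks,
                    ceres::Solver::Options* options) {
     ceres::ParameterBlockOrdering* ordering =
         new ceres::ParameterBlockOrdering;
     // Group 0 is eliminated first: all the points.
     for (double* point : point_blocks) {
       ordering->AddElementToGroup(point, 0);
     }
     // Group 1 is eliminated next: all the cameras.
     for (double* camera : camera_blocks) {
       ordering->AddElementToGroup(camera, 1);
     }
     options->linear_solver_ordering.reset(ordering);
   }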
As a smaller example, consider the linear system

.. math::
   x + y &= 3\\
   2x + 3y &= 7

There are two ways in which it can be solved: first eliminating
:math:`x` from the two equations, solving for :math:`y` and then back
substituting for :math:`x`, or first eliminating :math:`y`, solving
for :math:`x` and back substituting for :math:`y`. The user can
construct three orderings here.

1. :math:`\{0: x\}, \{1: y\}` : Eliminate :math:`x` first.
2. :math:`\{0: y\}, \{1: x\}` : Eliminate :math:`y` first.
3. :math:`\{0: x, y\}` : Solver gets to decide the elimination order.

Thus, to have Ceres determine the ordering automatically using
heuristics, put all the variables in the same elimination group. The
identity of the group does not matter. This is the same as not
specifying an ordering at all. To control the ordering for every
variable, create an elimination group per variable, ordering them in
the desired order.

If the user is using one of the Schur solvers (``DENSE_SCHUR``,
``SPARSE_SCHUR``, ``ITERATIVE_SCHUR``) and chooses to specify an
ordering, it must have one important property. The lowest numbered
elimination group must form an independent set in the graph
corresponding to the Hessian, or in other words, no two parameter
blocks in the first elimination group should co-occur in the same
residual block. For the best performance, this elimination group
should be as large as possible. For standard bundle adjustment
problems, this corresponds to the first elimination group containing
all the 3d points, and the second containing all the camera parameter
blocks.

If the user leaves the choice to Ceres, then the solver uses an
approximate maximum independent set algorithm to identify the first
elimination group [LiSaad]_.

.. _section-solver-options:

:class:`Solver::Options`
------------------------

.. class:: Solver::Options

   :class:`Solver::Options` controls the overall behavior of the
   solver. We list the various settings and their default values below.

.. function:: bool Solver::Options::IsValid(string* error) const

   Validates the values in the options struct and returns true on
   success. If there is a problem, the method returns false with
   ``error`` containing a textual description of the cause.

.. member:: MinimizerType Solver::Options::minimizer_type

   Default: ``TRUST_REGION``

   Choose between ``LINE_SEARCH`` and ``TRUST_REGION`` algorithms. See
   :ref:`section-trust-region-methods` and
   :ref:`section-line-search-methods` for more details.

.. member:: LineSearchDirectionType Solver::Options::line_search_direction_type

   Default: ``LBFGS``

   Choices are ``STEEPEST_DESCENT``, ``NONLINEAR_CONJUGATE_GRADIENT``,
   ``BFGS`` and ``LBFGS``.

.. member:: LineSearchType Solver::Options::line_search_type

   Default: ``WOLFE``

   Choices are ``ARMIJO`` and ``WOLFE`` (strong Wolfe conditions).
   Note that in order for the assumptions underlying the ``BFGS`` and
   ``LBFGS`` line search direction algorithms to be guaranteed to be
   satisfied, the ``WOLFE`` line search should be used.

.. member:: NonlinearConjugateGradientType Solver::Options::nonlinear_conjugate_gradient_type

   Default: ``FLETCHER_REEVES``

   Choices are ``FLETCHER_REEVES``, ``POLAK_RIBIERE`` and
   ``HESTENES_STIEFEL``.

.. member:: int Solver::Options::max_lbfgs_rank

   Default: 20

   The L-BFGS Hessian approximation is a low rank approximation to the
   inverse of the Hessian matrix. The rank of the approximation
   determines (linearly) the space and time complexity of using the
   approximation. The higher the rank, the better the quality of the
   approximation. The increase in quality is however bounded for a
   number of reasons.

   1. The method only uses secant information and not actual
      derivatives.

   2. The Hessian approximation is constrained to be positive
      definite.

   So increasing this rank to a large number will cost time and space
   complexity without the corresponding increase in solution
   quality. There are no hard and fast rules for choosing the maximum
   rank. The best choice usually requires some problem specific
   experimentation.

.. member:: bool Solver::Options::use_approximate_eigenvalue_bfgs_scaling

   Default: ``false``

   As part of the ``BFGS`` update step / ``LBFGS`` right-multiply
   step, the initial inverse Hessian approximation is taken to be the
   identity. However, [Oren]_ showed that using instead :math:`I *
   \gamma`, where :math:`\gamma` is a scalar chosen to approximate an
   eigenvalue of the true inverse Hessian, can result in improved
   convergence in a wide variety of cases. Setting
   ``use_approximate_eigenvalue_bfgs_scaling`` to true enables this
   scaling in ``BFGS`` (before first iteration) and ``LBFGS`` (at each
   iteration).

   Precisely, approximate eigenvalue scaling equates to

   .. math:: \gamma = \frac{y_k' s_k}{y_k' y_k}

   With:

   .. math:: y_k = \nabla f_{k+1} - \nabla f_k
   .. math:: s_k = x_{k+1} - x_k

   Where :math:`f()` is the line search objective and :math:`x` the
   vector of parameter values [NocedalWright]_.

   It is important to note that approximate eigenvalue scaling does
   **not** *always* improve convergence, and that it can in fact
   *significantly* degrade performance for certain classes of problem,
   which is why it is disabled by default. In particular it can
   degrade performance when the sensitivity of the problem to different
   parameters varies significantly, as in this case a single scalar
   factor fails to capture this variation and detrimentally downscales
   parts of the Jacobian approximation which correspond to
   low-sensitivity parameters. It can also reduce the robustness of the
   solution to errors in the Jacobians.

.. member:: LineSearchInterpolationType Solver::Options::line_search_interpolation_type

   Default: ``CUBIC``

   Degree of the polynomial used to approximate the objective
   function. Valid values are ``BISECTION``, ``QUADRATIC`` and
   ``CUBIC``.

.. member:: double Solver::Options::min_line_search_step_size

   The line search terminates if:

   .. math:: \|\Delta x_k\|_\infty < \text{min_line_search_step_size}

   where :math:`\|\cdot\|_\infty` refers to the max norm, and
   :math:`\Delta x_k` is the step change in the parameter values at
   the :math:`k`-th iteration.
.. member:: double Solver::Options::line_search_sufficient_function_decrease

   Default: ``1e-4``

   Solving the line search problem exactly is computationally
   prohibitive. Fortunately, line search based optimization algorithms
   can still guarantee convergence if instead of an exact solution,
   the line search algorithm returns a solution which decreases the
   value of the objective function sufficiently. More precisely, we
   are looking for a step size s.t.

   .. math:: f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]

   This condition is known as the Armijo condition.

.. member:: double Solver::Options::max_line_search_step_contraction

   Default: ``1e-3``

   In each iteration of the line search,

   .. math:: \text{new_step_size} \ge \text{max_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: double Solver::Options::min_line_search_step_contraction

   Default: ``0.6``

   In each iteration of the line search,

   .. math:: \text{new_step_size} \le \text{min_line_search_step_contraction} * \text{step_size}

   Note that by definition, for contraction:

   .. math:: 0 < \text{max_step_contraction} < \text{min_step_contraction} < 1

.. member:: int Solver::Options::max_num_line_search_step_size_iterations

   Default: ``20``

   Maximum number of trial step size iterations during each line
   search. If a step size satisfying the search conditions cannot be
   found within this number of trials, the line search will stop.

   As this is an 'artificial' constraint (one imposed by the user, not
   the underlying math), if ``WOLFE`` line search is being used, *and*
   points satisfying the Armijo sufficient (function) decrease
   condition have been found during the current search (in :math:`\le`
   ``max_num_line_search_step_size_iterations``), then the step size
   with the lowest function value which satisfies the Armijo condition
   will be returned as the new valid step, even though it does *not*
   satisfy the strong Wolfe conditions. This behaviour protects
   against early termination of the optimizer at a sub-optimal point.

.. member:: int Solver::Options::max_num_line_search_direction_restarts

   Default: ``5``

   Maximum number of restarts of the line search direction algorithm
   before terminating the optimization. Restarts of the line search
   direction algorithm occur when the current algorithm fails to
   produce a new descent direction. This typically indicates a
   numerical failure, or a breakdown in the validity of the
   approximations used.

.. member:: double Solver::Options::line_search_sufficient_curvature_decrease

   Default: ``0.9``

   The strong Wolfe conditions consist of the Armijo sufficient
   decrease condition, and an additional requirement that the
   step size be chosen s.t. the *magnitude* ('strong' Wolfe
   conditions) of the gradient along the search direction
   decreases sufficiently. Precisely, this second condition
   is that we seek a step size s.t.

   .. math:: \|f'(\text{step_size})\| \le \text{sufficient_curvature_decrease} * \|f'(0)\|

   Where :math:`f()` is the line search objective and :math:`f'()` is
   the derivative of :math:`f` with respect to the step size:
   :math:`\frac{d f}{d~\text{step size}}`.
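   Taken together with
   :member:`Solver::Options::line_search_sufficient_function_decrease`,
   the strong Wolfe conditions enforced with the default constants are
   therefore (this is just a restatement of the two conditions above
   with their default values substituted in):

   .. math:: f(\text{step_size}) &\le f(0) + 10^{-4}\, f'(0)\, \text{step_size}\\
             \|f'(\text{step_size})\| &\le 0.9\, \|f'(0)\|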
997 998 .. member:: double Solver::Options::max_line_search_step_expansion 999 1000 Default: ``10.0`` 1001 1002 During the bracketing phase of a Wolfe line search, the step size 1003 is increased until either a point satisfying the Wolfe conditions 1004 is found, or an upper bound for a bracket containinqg a point 1005 satisfying the conditions is found. Precisely, at each iteration 1006 of the expansion: 1007 1008 .. math:: \text{new_step_size} <= \text{max_step_expansion} * \text{step_size} 1009 1010 By definition for expansion 1011 1012 .. math:: \text{max_step_expansion} > 1.0 1013 1014 .. member:: TrustRegionStrategyType Solver::Options::trust_region_strategy_type 1015 1016 Default: ``LEVENBERG_MARQUARDT`` 1017 1018 The trust region step computation algorithm used by 1019 Ceres. Currently ``LEVENBERG_MARQUARDT`` and ``DOGLEG`` are the two 1020 valid choices. See :ref:`section-levenberg-marquardt` and 1021 :ref:`section-dogleg` for more details. 1022 1023 .. member:: DoglegType Solver::Options::dogleg_type 1024 1025 Default: ``TRADITIONAL_DOGLEG`` 1026 1027 Ceres supports two different dogleg strategies. 1028 ``TRADITIONAL_DOGLEG`` method by Powell and the ``SUBSPACE_DOGLEG`` 1029 method described by [ByrdSchnabel]_ . See :ref:`section-dogleg` 1030 for more details. 1031 1032 .. member:: bool Solver::Options::use_nonmonotonic_steps 1033 1034 Default: ``false`` 1035 1036 Relax the requirement that the trust-region algorithm take strictly 1037 decreasing steps. See :ref:`section-non-monotonic-steps` for more 1038 details. 1039 1040 .. member:: int Solver::Options::max_consecutive_nonmonotonic_steps 1041 1042 Default: ``5`` 1043 1044 The window size used by the step selection algorithm to accept 1045 non-monotonic steps. 1046 1047 .. member:: int Solver::Options::max_num_iterations 1048 1049 Default: ``50`` 1050 1051 Maximum number of iterations for which the solver should run. 1052 1053 .. member:: double Solver::Options::max_solver_time_in_seconds 1054 1055 Default: ``1e6`` 1056 Maximum amount of time for which the solver should run. 1057 1058 .. member:: int Solver::Options::num_threads 1059 1060 Default: ``1`` 1061 1062 Number of threads used by Ceres to evaluate the Jacobian. 1063 1064 .. member:: double Solver::Options::initial_trust_region_radius 1065 1066 Default: ``1e4`` 1067 1068 The size of the initial trust region. When the 1069 ``LEVENBERG_MARQUARDT`` strategy is used, the reciprocal of this 1070 number is the initial regularization parameter. 1071 1072 .. member:: double Solver::Options::max_trust_region_radius 1073 1074 Default: ``1e16`` 1075 1076 The trust region radius is not allowed to grow beyond this value. 1077 1078 .. member:: double Solver::Options::min_trust_region_radius 1079 1080 Default: ``1e-32`` 1081 1082 The solver terminates, when the trust region becomes smaller than 1083 this value. 1084 1085 .. member:: double Solver::Options::min_relative_decrease 1086 1087 Default: ``1e-3`` 1088 1089 Lower threshold for relative decrease before a trust-region step is 1090 accepted. 1091 1092 .. member:: double Solver::Options::min_lm_diagonal 1093 1094 Default: ``1e6`` 1095 1096 The ``LEVENBERG_MARQUARDT`` strategy, uses a diagonal matrix to 1097 regularize the the trust region step. This is the lower bound on 1098 the values of this diagonal matrix. 1099 1100 .. member:: double Solver::Options::max_lm_diagonal 1101 1102 Default: ``1e32`` 1103 1104 The ``LEVENBERG_MARQUARDT`` strategy, uses a diagonal matrix to 1105 regularize the the trust region step. 
This is the upper bound on 1106 the values of this diagonal matrix. 1107 1108 .. member:: int Solver::Options::max_num_consecutive_invalid_steps 1109 1110 Default: ``5`` 1111 1112 The step returned by a trust region strategy can sometimes be 1113 numerically invalid, usually because of conditioning 1114 issues. Instead of crashing or stopping the optimization, the 1115 optimizer can go ahead and try solving with a smaller trust 1116 region/better conditioned problem. This parameter sets the number 1117 of consecutive retries before the minimizer gives up. 1118 1119 .. member:: double Solver::Options::function_tolerance 1120 1121 Default: ``1e-6`` 1122 1123 Solver terminates if 1124 1125 .. math:: \frac{|\Delta \text{cost}|}{\text{cost}} < \text{function_tolerance} 1126 1127 where, :math:`\Delta \text{cost}` is the change in objective 1128 function value (up or down) in the current iteration of 1129 Levenberg-Marquardt. 1130 1131 .. member:: double Solver::Options::gradient_tolerance 1132 1133 Default: ``1e-10`` 1134 1135 Solver terminates if 1136 1137 .. math:: \|x - \Pi \boxplus(x, -g(x))\|_\infty < \text{gradient_tolerance} 1138 1139 where :math:`\|\cdot\|_\infty` refers to the max norm, :math:`\Pi` 1140 is projection onto the bounds constraints and :math:`\boxplus` is 1141 Plus operation for the overall local parameterization associated 1142 with the parameter vector. 1143 1144 .. member:: double Solver::Options::parameter_tolerance 1145 1146 Default: ``1e-8`` 1147 1148 Solver terminates if 1149 1150 .. math:: \|\Delta x\| < (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance} 1151 1152 where :math:`\Delta x` is the step computed by the linear solver in 1153 the current iteration of Levenberg-Marquardt. 1154 1155 .. member:: LinearSolverType Solver::Options::linear_solver_type 1156 1157 Default: ``SPARSE_NORMAL_CHOLESKY`` / ``DENSE_QR`` 1158 1159 Type of linear solver used to compute the solution to the linear 1160 least squares problem in each iteration of the Levenberg-Marquardt 1161 algorithm. If Ceres is built with support for ``SuiteSparse`` or 1162 ``CXSparse`` or ``Eigen``'s sparse Cholesky factorization, the 1163 default is ``SPARSE_NORMAL_CHOLESKY``, it is ``DENSE_QR`` 1164 otherwise. 1165 1166 .. member:: PreconditionerType Solver::Options::preconditioner_type 1167 1168 Default: ``JACOBI`` 1169 1170 The preconditioner used by the iterative linear solver. The default 1171 is the block Jacobi preconditioner. Valid values are (in increasing 1172 order of complexity) ``IDENTITY``, ``JACOBI``, ``SCHUR_JACOBI``, 1173 ``CLUSTER_JACOBI`` and ``CLUSTER_TRIDIAGONAL``. See 1174 :ref:`section-preconditioner` for more details. 1175 1176 .. member:: VisibilityClusteringType Solver::Options::visibility_clustering_type 1177 1178 Default: ``CANONICAL_VIEWS`` 1179 1180 Type of clustering algorithm to use when constructing a visibility 1181 based preconditioner. The original visibility based preconditioning 1182 paper and implementation only used the canonical views algorithm. 1183 1184 This algorithm gives high quality results but for large dense 1185 graphs can be particularly expensive. As its worst case complexity 1186 is cubic in size of the graph. 1187 1188 Another option is to use ``SINGLE_LINKAGE`` which is a simple 1189 thresholded single linkage clustering algorithm that only pays 1190 attention to tightly coupled blocks in the Schur complement. This 1191 is a fast algorithm that works well. 
1192 1193 The optimal choice of the clustering algorithm depends on the 1194 sparsity structure of the problem, but generally speaking we 1195 recommend that you try ``CANONICAL_VIEWS`` first and if it is too 1196 expensive try ``SINGLE_LINKAGE``. 1197 1198 .. member:: DenseLinearAlgebraLibrary Solver::Options::dense_linear_algebra_library_type 1199 1200 Default:``EIGEN`` 1201 1202 Ceres supports using multiple dense linear algebra libraries for 1203 dense matrix factorizations. Currently ``EIGEN`` and ``LAPACK`` are 1204 the valid choices. ``EIGEN`` is always available, ``LAPACK`` refers 1205 to the system ``BLAS + LAPACK`` library which may or may not be 1206 available. 1207 1208 This setting affects the ``DENSE_QR``, ``DENSE_NORMAL_CHOLESKY`` 1209 and ``DENSE_SCHUR`` solvers. For small to moderate sized probem 1210 ``EIGEN`` is a fine choice but for large problems, an optimized 1211 ``LAPACK + BLAS`` implementation can make a substantial difference 1212 in performance. 1213 1214 .. member:: SparseLinearAlgebraLibrary Solver::Options::sparse_linear_algebra_library_type 1215 1216 Default:``SUITE_SPARSE`` 1217 1218 Ceres supports the use of three sparse linear algebra libraries, 1219 ``SuiteSparse``, which is enabled by setting this parameter to 1220 ``SUITE_SPARSE``, ``CXSparse``, which can be selected by setting 1221 this parameter to ```CX_SPARSE`` and ``Eigen`` which is enabled by 1222 setting this parameter to ``EIGEN_SPARSE``. 1223 1224 ``SuiteSparse`` is a sophisticated and complex sparse linear 1225 algebra library and should be used in general. 1226 1227 If your needs/platforms prevent you from using ``SuiteSparse``, 1228 consider using ``CXSparse``, which is a much smaller, easier to 1229 build library. As can be expected, its performance on large 1230 problems is not comparable to that of ``SuiteSparse``. 1231 1232 Last but not the least you can use the sparse linear algebra 1233 routines in ``Eigen``. Currently the performance of this library is 1234 the poorest of the three. But this should change in the near 1235 future. 1236 1237 Another thing to consider here is that the sparse Cholesky 1238 factorization libraries in Eigen are licensed under ``LGPL`` and 1239 building Ceres with support for ``EIGEN_SPARSE`` will result in an 1240 LGPL licensed library (since the corresponding code from Eigen is 1241 compiled into the library). 1242 1243 The upside is that you do not need to build and link to an external 1244 library to use ``EIGEN_SPARSE``. 1245 1246 .. member:: int Solver::Options::num_linear_solver_threads 1247 1248 Default: ``1`` 1249 1250 Number of threads used by the linear solver. 1251 1252 .. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::linear_solver_ordering 1253 1254 Default: ``NULL`` 1255 1256 An instance of the ordering object informs the solver about the 1257 desired order in which parameter blocks should be eliminated by the 1258 linear solvers. See section~\ref{sec:ordering`` for more details. 1259 1260 If ``NULL``, the solver is free to choose an ordering that it 1261 thinks is best. 1262 1263 See :ref:`section-ordering` for more details. 1264 1265 .. member:: bool Solver::Options::use_post_ordering 1266 1267 Default: ``false`` 1268 1269 Sparse Cholesky factorization algorithms use a fill-reducing 1270 ordering to permute the columns of the Jacobian matrix. There are 1271 two ways of doing this. 1272 1273 1. Compute the Jacobian matrix in some order and then have the 1274 factorization algorithm permute the columns of the Jacobian. 
1275 1276 2. Compute the Jacobian with its columns already permuted. 1277 1278 The first option incurs a significant memory penalty. The 1279 factorization algorithm has to make a copy of the permuted Jacobian 1280 matrix, thus Ceres pre-permutes the columns of the Jacobian matrix 1281 and generally speaking, there is no performance penalty for doing 1282 so. 1283 1284 In some rare cases, it is worth using a more complicated reordering 1285 algorithm which has slightly better runtime performance at the 1286 expense of an extra copy of the Jacobian matrix. Setting 1287 ``use_postordering`` to ``true`` enables this tradeoff. 1288 1289 .. member:: bool Solver::Options::dynamic_sparsity 1290 1291 Some non-linear least squares problems are symbolically dense but 1292 numerically sparse. i.e. at any given state only a small number of 1293 Jacobian entries are non-zero, but the position and number of 1294 non-zeros is different depending on the state. For these problems 1295 it can be useful to factorize the sparse jacobian at each solver 1296 iteration instead of including all of the zero entries in a single 1297 general factorization. 1298 1299 If your problem does not have this property (or you do not know), 1300 then it is probably best to keep this false, otherwise it will 1301 likely lead to worse performance. 1302 1303 This settings affects the `SPARSE_NORMAL_CHOLESKY` solver. 1304 1305 .. member:: int Solver::Options::min_linear_solver_iterations 1306 1307 Default: ``1`` 1308 1309 Minimum number of iterations used by the linear solver. This only 1310 makes sense when the linear solver is an iterative solver, e.g., 1311 ``ITERATIVE_SCHUR`` or ``CGNR``. 1312 1313 .. member:: int Solver::Options::max_linear_solver_iterations 1314 1315 Default: ``500`` 1316 1317 Minimum number of iterations used by the linear solver. This only 1318 makes sense when the linear solver is an iterative solver, e.g., 1319 ``ITERATIVE_SCHUR`` or ``CGNR``. 1320 1321 .. member:: double Solver::Options::eta 1322 1323 Default: ``1e-1`` 1324 1325 Forcing sequence parameter. The truncated Newton solver uses this 1326 number to control the relative accuracy with which the Newton step 1327 is computed. This constant is passed to 1328 ``ConjugateGradientsSolver`` which uses it to terminate the 1329 iterations when 1330 1331 .. math:: \frac{Q_i - Q_{i-1}}{Q_i} < \frac{\eta}{i} 1332 1333 .. member:: bool Solver::Options::jacobi_scaling 1334 1335 Default: ``true`` 1336 1337 ``true`` means that the Jacobian is scaled by the norm of its 1338 columns before being passed to the linear solver. This improves the 1339 numerical conditioning of the normal equations. 1340 1341 .. member:: bool Solver::Options::use_inner_iterations 1342 1343 Default: ``false`` 1344 1345 Use a non-linear version of a simplified variable projection 1346 algorithm. Essentially this amounts to doing a further optimization 1347 on each Newton/Trust region step using a coordinate descent 1348 algorithm. For more details, see :ref:`section-inner-iterations`. 1349 1350 .. member:: double Solver::Options::inner_itearation_tolerance 1351 1352 Default: ``1e-3`` 1353 1354 Generally speaking, inner iterations make significant progress in 1355 the early stages of the solve and then their contribution drops 1356 down sharply, at which point the time spent doing inner iterations 1357 is not worth it. 

   Once the relative decrease in the objective function due to inner
   iterations drops below ``inner_iteration_tolerance``, the use of
   inner iterations in subsequent trust region minimizer iterations is
   disabled.

.. member:: shared_ptr<ParameterBlockOrdering> Solver::Options::inner_iteration_ordering

   Default: ``NULL``

   If :member:`Solver::Options::use_inner_iterations` is true, then
   the user has two choices.

   1. Let the solver heuristically decide which parameter blocks to
      optimize in each inner iteration. To do this, set
      :member:`Solver::Options::inner_iteration_ordering` to ``NULL``.

   2. Specify a collection of ordered independent sets. The lower
      numbered groups are optimized before the higher numbered groups
      during the inner optimization phase. Each group must be an
      independent set. Not all parameter blocks need to be included in
      the ordering.

   See :ref:`section-ordering` for more details.

.. member:: LoggingType Solver::Options::logging_type

   Default: ``PER_MINIMIZER_ITERATION``

.. member:: bool Solver::Options::minimizer_progress_to_stdout

   Default: ``false``

   By default the :class:`Minimizer` progress is logged to ``STDERR``
   depending on the ``vlog`` level. If this flag is set to true, and
   :member:`Solver::Options::logging_type` is not ``SILENT``, the
   logging output is sent to ``STDOUT``.

   For ``TRUST_REGION_MINIMIZER`` the progress display looks like

   .. code-block:: bash

      iter      cost      cost_change  |gradient|   |step|    tr_ratio  tr_radius  ls_iter  iter_time  total_time
         0  4.185660e+06    0.00e+00    1.09e+08   0.00e+00   0.00e+00  1.00e+04       0    7.59e-02    3.37e-01
         1  1.062590e+05    4.08e+06    8.99e+06   5.36e+02   9.82e-01  3.00e+04       1    1.65e-01    5.03e-01
         2  4.992817e+04    5.63e+04    8.32e+06   3.19e+02   6.52e-01  3.09e+04       1    1.45e-01    6.48e-01

   Here

   #. ``cost`` is the value of the objective function.
   #. ``cost_change`` is the change in the value of the objective
      function if the step computed in this iteration is accepted.
   #. ``|gradient|`` is the max norm of the gradient.
   #. ``|step|`` is the change in the parameter vector.
   #. ``tr_ratio`` is the ratio of the actual change in the objective
      function value to the change in the value of the trust region
      model.
   #. ``tr_radius`` is the size of the trust region radius.
   #. ``ls_iter`` is the number of linear solver iterations used to
      compute the trust region step. For direct/factorization based
      solvers it is always 1, for iterative solvers like
      ``ITERATIVE_SCHUR`` it is the number of iterations of the
      Conjugate Gradients algorithm.
   #. ``iter_time`` is the time taken by the current iteration.
   #. ``total_time`` is the total time taken by the minimizer.

   For ``LINE_SEARCH_MINIMIZER`` the progress display looks like

   .. code-block:: bash

      0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e:  0 it: 2.98e-02 tt: 8.50e-02
      1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e:  1 it: 4.54e-02 tt: 1.31e-01
      2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e:  1 it: 4.96e-02 tt: 1.81e-01

   Here

   #. ``f`` is the value of the objective function.
   #. ``d`` is the change in the value of the objective function if
      the step computed in this iteration is accepted.
   #. ``g`` is the max norm of the gradient.
   #. ``h`` is the change in the parameter vector.
   #. ``s`` is the optimal step length computed by the line search.
   #. ``it`` is the time taken by the current iteration.
   #. ``tt`` is the total time taken by the minimizer.

.. member:: vector<int> Solver::Options::trust_region_minimizer_iterations_to_dump

   Default: ``empty``

   List of iterations at which the trust region minimizer should dump
   the trust region problem. Useful for testing and benchmarking. If
   ``empty``, no problems are dumped.

.. member:: string Solver::Options::trust_region_problem_dump_directory

   Default: ``/tmp``

   Directory to which the problems should be written. Should be
   non-empty if
   :member:`Solver::Options::trust_region_minimizer_iterations_to_dump` is
   non-empty and
   :member:`Solver::Options::trust_region_problem_dump_format_type` is not
   ``CONSOLE``.

.. member:: DumpFormatType Solver::Options::trust_region_problem_dump_format_type

   Default: ``TEXTFILE``

   The format in which trust region problems should be logged when
   :member:`Solver::Options::trust_region_minimizer_iterations_to_dump`
   is non-empty. There are three options:

   * ``CONSOLE`` prints the linear least squares problem in a human
     readable format to ``stderr``. The Jacobian is printed as a
     dense matrix. The vectors :math:`D`, :math:`x` and :math:`f` are
     printed as dense vectors. This should only be used for small
     problems.

   * ``TEXTFILE`` writes out the linear least squares problem to the
     directory pointed to by
     :member:`Solver::Options::trust_region_problem_dump_directory` as
     text files which can be read into ``MATLAB/Octave``. The Jacobian
     is dumped as a text file containing :math:`(i,j,s)` triplets, and
     the vectors :math:`D`, :math:`x` and :math:`f` are dumped as text
     files containing a list of their values.

     A ``MATLAB/Octave`` script called ``ceres_solver_iteration_???.m``
     is also output, which can be used to parse and load the problem
     into memory.

.. member:: bool Solver::Options::check_gradients

   Default: ``false``

   Check all Jacobians computed by each residual block with finite
   differences. This is expensive since it involves computing the
   derivative by normal means (e.g. user specified, autodiff, etc),
   then also computing it using finite differences. The results are
   compared, and if they differ substantially, details are printed to
   the log.

.. member:: double Solver::Options::gradient_check_relative_precision

   Default: ``1e-8``

   Precision to check for in the gradient checker. If the relative
   difference between an element in a Jacobian exceeds this number,
   then the Jacobian for that cost term is dumped.

.. member:: double Solver::Options::numeric_derivative_relative_step_size

   Default: ``1e-6``

   Relative shift used for taking numeric derivatives. For finite
   differencing, each dimension is evaluated at slightly shifted
   values, e.g., for forward differences, the numerical derivative is

   .. math::

      \delta &= \text{numeric\_derivative\_relative\_step\_size} \\
      \Delta f &= \frac{f((1 + \delta) x) - f(x)}{\delta x}

   The finite differencing is done along each dimension.
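
   As a concrete illustration of the formula above, here is a minimal
   sketch (not Ceres code; ``MyFunction`` and ``ForwardDifference`` are
   made-up names) of a forward difference with a relative step for a
   one dimensional function:

   .. code-block:: c++

      #include <cmath>

      // Stand-in for a user supplied scalar function.
      double MyFunction(double x) { return std::sin(x) / x; }

      // Forward difference with a relative step size, mirroring the
      // formula above. Assumes x != 0; this sketch ignores that
      // corner case.
      double ForwardDifference(double x, double relative_step_size) {
        const double h = relative_step_size * x;         // delta * x
        return (MyFunction(x + h) - MyFunction(x)) / h;  // (f((1 + delta) x) - f(x)) / (delta x)
      }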

   The reason to use a relative (rather than absolute) step size is
   that this way, numeric differentiation works for functions where
   the arguments are typically large (e.g. :math:`10^9`) and when the
   values are small (e.g. :math:`10^{-5}`). It is possible to
   construct *torture cases* which break this finite difference
   heuristic, but they do not come up often in practice.

.. member:: vector<IterationCallback*> Solver::Options::callbacks

   Callbacks that are executed at the end of each iteration of the
   :class:`Minimizer`. They are executed in the order that they are
   specified in this vector. By default, parameter blocks are updated
   only at the end of the optimization, i.e., when the
   :class:`Minimizer` terminates. This behavior is controlled by
   :member:`Solver::Options::update_state_every_iteration`. If the
   user wishes to have access to the updated parameter blocks when
   their callbacks are executed, then set
   :member:`Solver::Options::update_state_every_iteration` to true.

   The solver does NOT take ownership of these pointers.

.. member:: bool Solver::Options::update_state_every_iteration

   Default: ``false``

   Normally the parameter blocks are only updated when the solver
   terminates. Setting this to true updates them in every
   iteration. This setting is useful when building an interactive
   application using Ceres and using an :class:`IterationCallback`.

:class:`ParameterBlockOrdering`
-------------------------------

.. class:: ParameterBlockOrdering

   ``ParameterBlockOrdering`` is a class for storing and manipulating
   an ordered collection of groups/sets with the following semantics:

   Group IDs are non-negative integer values. Elements are any type
   that can serve as a key in a map or an element of a set.

   An element can only belong to one group at a time. A group may
   contain an arbitrary number of elements.

   Groups are ordered by their group id.

.. function:: bool ParameterBlockOrdering::AddElementToGroup(const double* element, const int group)

   Add an element to a group. If a group with this id does not exist,
   one is created. This method can be called any number of times for
   the same element. Group ids should be non-negative numbers. The
   return value indicates if adding the element was a success.

.. function:: void ParameterBlockOrdering::Clear()

   Clear the ordering.

.. function:: bool ParameterBlockOrdering::Remove(const double* element)

   Remove the element, no matter what group it is in. If the element
   is not a member of any group, calling this method will result in a
   crash. The return value indicates if the element was actually
   removed.

.. function:: void ParameterBlockOrdering::Reverse()

   Reverse the order of the groups in place.

.. function:: int ParameterBlockOrdering::GroupId(const double* element) const

   Return the group id for the element. If the element is not a member
   of any group, return -1.

.. function:: bool ParameterBlockOrdering::IsMember(const double* element) const

   True if there is a group containing the parameter block.

.. function:: int ParameterBlockOrdering::GroupSize(const int group) const

   Return the number of elements in the group. This function always
   succeeds, i.e., implicitly there exists a group for every integer.

.. function:: int ParameterBlockOrdering::NumElements() const

   Number of elements in the ordering.

.. function:: int ParameterBlockOrdering::NumGroups() const

   Number of groups with one or more elements.
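
The following snippet illustrates these semantics; it is a hypothetical
sketch and not code taken from the Ceres sources:

.. code-block:: c++

   #include "ceres/ceres.h"

   void OrderingExample() {
     double x[3];
     double y[2];

     ceres::ParameterBlockOrdering ordering;
     ordering.AddElementToGroup(x, 0);  // x goes into group 0
     ordering.AddElementToGroup(y, 1);  // y goes into group 1

     // The queries reflect the bookkeeping described above: x belongs
     // to group 0, group 1 has one element, and a group id that was
     // never used behaves like an empty group of size 0.
     const int group_of_x = ordering.GroupId(x);      // 0
     const int size_of_one = ordering.GroupSize(1);   // 1
     const int size_of_seven = ordering.GroupSize(7); // 0
   }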

:class:`IterationCallback`
--------------------------

.. class:: IterationSummary

   :class:`IterationSummary` describes the state of the minimizer at
   the end of each iteration.

.. member:: int32 IterationSummary::iteration

   Current iteration number.

.. member:: bool IterationSummary::step_is_valid

   Step was numerically valid, i.e., all values are finite and the
   step reduces the value of the linearized model.

   **Note**: :member:`IterationSummary::step_is_valid` is `false`
   when :member:`IterationSummary::iteration` = 0.

.. member:: bool IterationSummary::step_is_nonmonotonic

   Step did not reduce the value of the objective function
   sufficiently, but it was accepted because of the relaxed
   acceptance criterion used by the non-monotonic trust region
   algorithm.

   **Note**: :member:`IterationSummary::step_is_nonmonotonic` is
   `false` when :member:`IterationSummary::iteration` = 0.

.. member:: bool IterationSummary::step_is_successful

   Whether or not the minimizer accepted this step.

   If the ordinary trust region algorithm is used, this means that the
   relative reduction in the objective function value was greater than
   :member:`Solver::Options::min_relative_decrease`. However, if the
   non-monotonic trust region algorithm is used
   (:member:`Solver::Options::use_nonmonotonic_steps` = `true`), then
   even if the relative decrease is not sufficient, the algorithm may
   accept the step and the step is declared successful.

   **Note**: :member:`IterationSummary::step_is_successful` is `false`
   when :member:`IterationSummary::iteration` = 0.

.. member:: double IterationSummary::cost

   Value of the objective function.

.. member:: double IterationSummary::cost_change

   Change in the value of the objective function in this
   iteration. This can be positive or negative.

.. member:: double IterationSummary::gradient_max_norm

   Infinity norm of the gradient vector.

.. member:: double IterationSummary::gradient_norm

   2-norm of the gradient vector.

.. member:: double IterationSummary::step_norm

   2-norm of the size of the step computed in this iteration.

.. member:: double IterationSummary::relative_decrease

   For trust region algorithms, the ratio of the actual change in cost
   and the change in the cost of the linearized approximation.

   This field is not used when a line search minimizer is used.

.. member:: double IterationSummary::trust_region_radius

   Size of the trust region at the end of the current iteration. For
   the Levenberg-Marquardt algorithm, the regularization parameter is
   1.0 / :member:`IterationSummary::trust_region_radius`.

.. member:: double IterationSummary::eta

   For the inexact step Levenberg-Marquardt algorithm, this is the
   relative accuracy with which the step is solved. This number is
   only applicable to the iterative solvers capable of solving linear
   systems inexactly. Factorization-based exact solvers always have an
   eta of 0.0.

.. member:: double IterationSummary::step_size

   Step size computed by the line search algorithm.

   This field is not used when a trust region minimizer is used.

.. member:: int IterationSummary::line_search_function_evaluations

   Number of function evaluations used by the line search algorithm.

   This field is not used when a trust region minimizer is used.

.. member:: int IterationSummary::linear_solver_iterations

   Number of iterations taken by the linear solver to solve for the
   trust region step.

   Currently this field is not used when a line search minimizer is
   used.

.. member:: double IterationSummary::iteration_time_in_seconds

   Time (in seconds) spent inside the minimizer loop in the current
   iteration.

.. member:: double IterationSummary::step_solver_time_in_seconds

   Time (in seconds) spent inside the trust region step solver.

.. member:: double IterationSummary::cumulative_time_in_seconds

   Time (in seconds) since the user called Solve().


.. class:: IterationCallback

   Interface for specifying callbacks that are executed at the end of
   each iteration of the minimizer.

   .. code-block:: c++

      class IterationCallback {
       public:
        virtual ~IterationCallback() {}
        virtual CallbackReturnType operator()(const IterationSummary& summary) = 0;
      };

   The solver uses the return value of ``operator()`` to decide whether
   to continue solving or to terminate. The user can return one of
   three values.

   #. ``SOLVER_ABORT`` indicates that the callback detected an abnormal
      situation. The solver returns without updating the parameter
      blocks (unless ``Solver::Options::update_state_every_iteration``
      is set true). Solver returns with
      ``Solver::Summary::termination_type`` set to ``USER_FAILURE``.

   #. ``SOLVER_TERMINATE_SUCCESSFULLY`` indicates that there is no need
      to optimize anymore (some user specified termination criterion
      has been met). Solver returns with
      ``Solver::Summary::termination_type`` set to ``USER_SUCCESS``.

   #. ``SOLVER_CONTINUE`` indicates that the solver should continue
      optimizing.

   For example, the following :class:`IterationCallback` is used
   internally by Ceres to log the progress of the optimization.

   .. code-block:: c++

      class LoggingCallback : public IterationCallback {
       public:
        explicit LoggingCallback(bool log_to_stdout)
            : log_to_stdout_(log_to_stdout) {}

        ~LoggingCallback() {}

        CallbackReturnType operator()(const IterationSummary& summary) {
          const char* kReportRowFormat =
              "% 4d: f:% 8e d:% 3.2e g:% 3.2e h:% 3.2e "
              "rho:% 3.2e mu:% 3.2e eta:% 3.2e li:% 3d";
          string output = StringPrintf(kReportRowFormat,
                                       summary.iteration,
                                       summary.cost,
                                       summary.cost_change,
                                       summary.gradient_max_norm,
                                       summary.step_norm,
                                       summary.relative_decrease,
                                       summary.trust_region_radius,
                                       summary.eta,
                                       summary.linear_solver_iterations);
          if (log_to_stdout_) {
            cout << output << endl;
          } else {
            VLOG(1) << output;
          }
          return SOLVER_CONTINUE;
        }

       private:
        const bool log_to_stdout_;
      };
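
   As another illustration, the following is a hypothetical sketch (not
   code from Ceres) of a callback that implements a custom stopping
   rule by returning ``SOLVER_TERMINATE_SUCCESSFULLY`` once the cost
   drops below a user chosen threshold:

   .. code-block:: c++

      class StopWhenCostBelowCallback : public IterationCallback {
       public:
        explicit StopWhenCostBelowCallback(double cost_threshold)
            : cost_threshold_(cost_threshold) {}

        virtual ~StopWhenCostBelowCallback() {}

        virtual CallbackReturnType operator()(const IterationSummary& summary) {
          // Stop as soon as the objective function value is good enough.
          if (summary.cost < cost_threshold_) {
            return SOLVER_TERMINATE_SUCCESSFULLY;
          }
          return SOLVER_CONTINUE;
        }

       private:
        const double cost_threshold_;
      };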

:class:`CRSMatrix`
------------------

.. class:: CRSMatrix

   A compressed row sparse matrix used primarily for communicating the
   Jacobian matrix to the user.

.. member:: int CRSMatrix::num_rows

   Number of rows.

.. member:: int CRSMatrix::num_cols

   Number of columns.

.. member:: vector<int> CRSMatrix::rows

   :member:`CRSMatrix::rows` is a :member:`CRSMatrix::num_rows` + 1
   sized array that points into the :member:`CRSMatrix::cols` and
   :member:`CRSMatrix::values` array.

.. member:: vector<int> CRSMatrix::cols

   :member:`CRSMatrix::cols` contains as many entries as there are
   non-zeros in the matrix.

   For each row ``i``, ``cols[rows[i]]`` ... ``cols[rows[i + 1] - 1]``
   are the indices of the non-zero columns of row ``i``.

.. member:: vector<double> CRSMatrix::values

   :member:`CRSMatrix::values` contains as many entries as there are
   non-zeros in the matrix.

   For each row ``i``,
   ``values[rows[i]]`` ... ``values[rows[i + 1] - 1]`` are the values
   of the non-zero columns of row ``i``.

   e.g., consider the 3x4 sparse matrix

   .. code-block:: c++

      0 10  0  4
      0  2 -3  2
      1  2  0  0

   The three arrays will be:

   .. code-block:: c++

                 -row0-  ---row1---  -row2-
      rows   = [ 0,      2,          5,     7]
      cols   = [ 1,  3,  1,  2,  3,  0,  1]
      values = [10,  4,  2, -3,  2,  1,  2]


:class:`Solver::Summary`
------------------------

.. class:: Solver::Summary

   Summary of the various stages of the solver after termination.

.. function:: string Solver::Summary::BriefReport() const

   A brief one line description of the state of the solver after
   termination.

.. function:: string Solver::Summary::FullReport() const

   A full multiline description of the state of the solver after
   termination.

.. function:: bool Solver::Summary::IsSolutionUsable() const

   Whether the solution returned by the optimization algorithm can be
   relied on to be numerically sane. This will be the case if
   `Solver::Summary::termination_type` is set to `CONVERGENCE`,
   `USER_SUCCESS` or `NO_CONVERGENCE`, i.e., the solver converged by
   meeting one of the convergence tolerances, the user indicated that
   it had converged, or it ran to the maximum number of iterations or
   time.

.. member:: MinimizerType Solver::Summary::minimizer_type

   Type of minimization algorithm used.

.. member:: TerminationType Solver::Summary::termination_type

   The cause of the minimizer terminating.

.. member:: string Solver::Summary::message

   Reason why the solver terminated.

.. member:: double Solver::Summary::initial_cost

   Cost of the problem (value of the objective function) before the
   optimization.

.. member:: double Solver::Summary::final_cost

   Cost of the problem (value of the objective function) after the
   optimization.

.. member:: double Solver::Summary::fixed_cost

   The part of the total cost that comes from residual blocks that
   were held fixed by the preprocessor because all the parameter
   blocks that they depend on were fixed.

.. member:: vector<IterationSummary> Solver::Summary::iterations

   :class:`IterationSummary` for each minimizer iteration in order.

.. member:: int Solver::Summary::num_successful_steps

   Number of minimizer iterations in which the step was
   accepted. Unless :member:`Solver::Options::use_nonmonotonic_steps`
   is `true` this is also the number of steps in which the objective
   function value/cost went down.

.. member:: int Solver::Summary::num_unsuccessful_steps

   Number of minimizer iterations in which the step was rejected
   either because it did not reduce the cost enough or the step was
   not numerically valid.

.. member:: int Solver::Summary::num_inner_iteration_steps

   Number of times inner iterations were performed.

.. member:: double Solver::Summary::preprocessor_time_in_seconds

   Time (in seconds) spent in the preprocessor.

.. member:: double Solver::Summary::minimizer_time_in_seconds

   Time (in seconds) spent in the Minimizer.

.. member:: double Solver::Summary::postprocessor_time_in_seconds

   Time (in seconds) spent in the post processor.

.. member:: double Solver::Summary::total_time_in_seconds

   Time (in seconds) spent in the solver.

.. member:: double Solver::Summary::linear_solver_time_in_seconds

   Time (in seconds) spent in the linear solver computing the trust
   region step.

.. member:: double Solver::Summary::residual_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the residual vector.

.. member:: double Solver::Summary::jacobian_evaluation_time_in_seconds

   Time (in seconds) spent evaluating the Jacobian matrix.

.. member:: double Solver::Summary::inner_iteration_time_in_seconds

   Time (in seconds) spent doing inner iterations.

.. member:: int Solver::Summary::num_parameter_blocks

   Number of parameter blocks in the problem.

.. member:: int Solver::Summary::num_parameters

   Number of parameters in the problem.

.. member:: int Solver::Summary::num_effective_parameters

   Dimension of the tangent space of the problem (or the number of
   columns in the Jacobian for the problem). This is different from
   :member:`Solver::Summary::num_parameters` if a parameter block is
   associated with a :class:`LocalParameterization`.

.. member:: int Solver::Summary::num_residual_blocks

   Number of residual blocks in the problem.

.. member:: int Solver::Summary::num_residuals

   Number of residuals in the problem.

.. member:: int Solver::Summary::num_parameter_blocks_reduced

   Number of parameter blocks in the problem after the inactive and
   constant parameter blocks have been removed. A parameter block is
   inactive if no residual block refers to it.

.. member:: int Solver::Summary::num_parameters_reduced

   Number of parameters in the reduced problem.

.. member:: int Solver::Summary::num_effective_parameters_reduced

   Dimension of the tangent space of the reduced problem (or the
   number of columns in the Jacobian for the reduced problem). This is
   different from :member:`Solver::Summary::num_parameters_reduced` if
   a parameter block in the reduced problem is associated with a
   :class:`LocalParameterization`.

.. member:: int Solver::Summary::num_residual_blocks_reduced

   Number of residual blocks in the reduced problem.

.. member:: int Solver::Summary::num_residuals_reduced

   Number of residuals in the reduced problem.

.. member:: int Solver::Summary::num_threads_given

   Number of threads specified by the user for Jacobian and residual
   evaluation.

.. member:: int Solver::Summary::num_threads_used

   Number of threads actually used by the solver for Jacobian and
   residual evaluation. This number is not equal to
   :member:`Solver::Summary::num_threads_given` if `OpenMP` is not
   available.

.. member:: int Solver::Summary::num_linear_solver_threads_given

   Number of threads specified by the user for solving the trust
   region problem.

.. member:: int Solver::Summary::num_linear_solver_threads_used

   Number of threads actually used by the solver for solving the trust
   region problem. This number is not equal to
   :member:`Solver::Summary::num_linear_solver_threads_given` if
   `OpenMP` is not available.

.. member:: LinearSolverType Solver::Summary::linear_solver_type_given

   Type of the linear solver requested by the user.

.. member:: LinearSolverType Solver::Summary::linear_solver_type_used

   Type of the linear solver actually used. This may be different from
   :member:`Solver::Summary::linear_solver_type_given` if Ceres
   determines that the problem structure is not compatible with the
   linear solver requested or if the linear solver requested by the
   user is not available, e.g., the user requested
   `SPARSE_NORMAL_CHOLESKY` but no sparse linear algebra library was
   available.

.. member:: vector<int> Solver::Summary::linear_solver_ordering_given

   Size of the elimination groups given by the user as hints to the
   linear solver.

.. member:: vector<int> Solver::Summary::linear_solver_ordering_used

   Size of the parameter groups used by the solver when ordering the
   columns of the Jacobian. This may be different from
   :member:`Solver::Summary::linear_solver_ordering_given` if the user
   left :member:`Solver::Summary::linear_solver_ordering_given` blank
   and asked for an automatic ordering, or if the problem contains
   some constant or inactive parameter blocks.

.. member:: bool Solver::Summary::inner_iterations_given

   `True` if the user asked for inner iterations to be used as part of
   the optimization.

.. member:: bool Solver::Summary::inner_iterations_used

   `True` if the user asked for inner iterations to be used as part of
   the optimization and the problem structure was such that they were
   actually performed. For example, in a problem with just one
   parameter block, inner iterations are not performed.

.. member:: vector<int> Solver::Summary::inner_iteration_ordering_given

   Size of the parameter groups given by the user for performing inner
   iterations.

.. member:: vector<int> Solver::Summary::inner_iteration_ordering_used

   Size of the parameter groups used by the solver for performing
   inner iterations. This may be different from
   :member:`Solver::Summary::inner_iteration_ordering_given` if the
   user left :member:`Solver::Summary::inner_iteration_ordering_given`
   blank and asked for an automatic ordering, or if the problem
   contains some constant or inactive parameter blocks.

.. member:: PreconditionerType Solver::Summary::preconditioner_type

   Type of preconditioner used for solving the trust region step. Only
   meaningful when an iterative linear solver is used.

.. member:: VisibilityClusteringType Solver::Summary::visibility_clustering_type

   Type of clustering algorithm used for visibility based
   preconditioning. Only meaningful when the
   :member:`Solver::Summary::preconditioner_type` is
   ``CLUSTER_JACOBI`` or ``CLUSTER_TRIDIAGONAL``.

.. member:: TrustRegionStrategyType Solver::Summary::trust_region_strategy_type

   Type of trust region strategy.

.. member:: DoglegType Solver::Summary::dogleg_type

   Type of dogleg strategy used for solving the trust region problem.

.. member:: DenseLinearAlgebraLibraryType Solver::Summary::dense_linear_algebra_library_type

   Type of the dense linear algebra library used.

.. member:: SparseLinearAlgebraLibraryType Solver::Summary::sparse_linear_algebra_library_type

   Type of the sparse linear algebra library used.

.. member:: LineSearchDirectionType Solver::Summary::line_search_direction_type

   Type of line search direction used.

.. member:: LineSearchType Solver::Summary::line_search_type

   Type of the line search algorithm used.

.. member:: LineSearchInterpolationType Solver::Summary::line_search_interpolation_type

   When performing line search, the degree of the polynomial used to
   approximate the objective function.

.. member:: NonlinearConjugateGradientType Solver::Summary::nonlinear_conjugate_gradient_type

   If the line search direction is `NONLINEAR_CONJUGATE_GRADIENT`,
   then this indicates the particular variant of non-linear conjugate
   gradient used.

.. member:: int Solver::Summary::max_lbfgs_rank

   If the type of the line search direction is `LBFGS`, then this
   indicates the rank of the Hessian approximation.

Covariance Estimation
=====================

Background
----------

One way to assess the quality of the solution returned by a
non-linear least squares solve is to analyze the covariance of the
solution.

Let us consider the non-linear regression problem

.. math:: y = f(x) + N(0, I)

i.e., the observation :math:`y` is a random non-linear function of the
independent variable :math:`x` with mean :math:`f(x)` and identity
covariance. Then the maximum likelihood estimate of :math:`x` given
observations :math:`y` is the solution to the non-linear least squares
problem:

.. math:: x^* = \arg \min_x \|f(x)\|^2

And the covariance of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{-1}

Here :math:`J(x^*)` is the Jacobian of :math:`f` at :math:`x^*`. The
above formula assumes that :math:`J(x^*)` has full column rank.

If :math:`J(x^*)` is rank deficient, then the covariance matrix
:math:`C(x^*)` is also rank deficient and is given by the Moore-Penrose
pseudo inverse.

.. math:: C(x^*) = \left(J'(x^*)J(x^*)\right)^{\dagger}

Note that in the above, we assumed that the covariance matrix for
:math:`y` was identity. This is an important assumption. If this is
not the case and we have

.. math:: y = f(x) + N(0, S)

where :math:`S` is a positive semi-definite matrix denoting the
covariance of :math:`y`, then the maximum likelihood problem to be
solved is

.. math:: x^* = \arg \min_x f'(x) S^{-1} f(x)

and the corresponding covariance estimate of :math:`x^*` is given by

.. math:: C(x^*) = \left(J'(x^*) S^{-1} J(x^*)\right)^{-1}

So, if it is the case that the observations being fitted to have a
covariance matrix not equal to identity, then it is the user's
responsibility that the corresponding cost functions are correctly
scaled, e.g. in the above case the cost function for this problem
should evaluate :math:`S^{-1/2} f(x)` instead of just :math:`f(x)`,
where :math:`S^{-1/2}` is the inverse square root of the covariance
matrix :math:`S`.

Gauge Invariance
----------------

In structure from motion (3D reconstruction) problems, the
reconstruction is ambiguous up to a similarity transform. This is
known as a *Gauge Ambiguity*. Handling Gauges correctly requires the
use of SVD or custom inversion algorithms. For small problems the
user can use the dense algorithm. For more details see the work of
Kanatani & Morris [KanataniMorris]_.


:class:`Covariance`
-------------------

:class:`Covariance` allows the user to evaluate the covariance for a
non-linear least squares problem and provides random access to its
blocks. The computation assumes that the cost functions compute
residuals such that their covariance is identity.

Since the computation of the covariance matrix requires computing the
inverse of a potentially large matrix, this can involve a rather large
amount of time and memory. However, it is usually the case that the
user is only interested in a small part of the covariance matrix,
quite often just the block diagonal. :class:`Covariance` allows the
user to specify the parts of the covariance matrix that she is
interested in and then uses this information to only compute and
store those parts of the covariance matrix.

Rank of the Jacobian
--------------------

As we noted above, if the Jacobian is rank deficient, then the inverse
of :math:`J'J` is not defined and instead a pseudo inverse needs to be
computed.

The rank deficiency in :math:`J` can be *structural*, i.e., columns
which are always known to be zero, or *numerical*, i.e., dependent on
the exact values in the Jacobian.

Structural rank deficiency occurs when the problem contains parameter
blocks that are constant. This class correctly handles structural rank
deficiency like that.

Numerical rank deficiency, where the rank of the matrix cannot be
predicted by its sparsity structure and requires looking at its
numerical values, is more complicated. Here again there are two
cases.

a. The rank deficiency arises from overparameterization. e.g., a
   four dimensional quaternion used to parameterize :math:`SO(3)`,
   which is a three dimensional manifold. In cases like this, the
   user should use an appropriate
   :class:`LocalParameterization`. Not only will this lead to better
   numerical behaviour of the Solver, it will also expose the rank
   deficiency to the :class:`Covariance` object so that it can
   handle it correctly.

b. More general numerical rank deficiency in the Jacobian requires
   the computation of the so called Singular Value Decomposition
   (SVD) of :math:`J'J`. We do not know how to do this for large
   sparse matrices efficiently. For small and moderate sized
   problems this is done using dense linear algebra.


:class:`Covariance::Options`
----------------------------

.. class:: Covariance::Options

.. member:: int Covariance::Options::num_threads

   Default: ``1``

   Number of threads to be used for evaluating the Jacobian and
   estimation of covariance.

.. member:: CovarianceAlgorithmType Covariance::Options::algorithm_type

   Default: ``SUITE_SPARSE_QR`` if ``SuiteSparseQR`` is installed and
   ``EIGEN_SPARSE_QR`` otherwise.

   Ceres supports three different algorithms for covariance
   estimation, which represent different tradeoffs in speed, accuracy
   and reliability.

   1. ``DENSE_SVD`` uses ``Eigen``'s ``JacobiSVD`` to perform the
      computations. It computes the singular value decomposition

      .. math:: U S V^\top = J

      and then uses it to compute the pseudo inverse of :math:`J'J` as

      .. math:: (J'J)^{\dagger} = V S^{\dagger} V^\top

      It is an accurate but slow method and should only be used for
      small to moderate sized problems. It can handle full-rank as
      well as rank deficient Jacobians.

   2. ``EIGEN_SPARSE_QR`` uses the sparse QR factorization algorithm
      in ``Eigen`` to compute the decomposition

      .. math::

         QR &= J\\
         \left(J^\top J\right)^{-1} &= \left(R^\top R\right)^{-1}

      It is a moderately fast algorithm for sparse matrices.

   3. ``SUITE_SPARSE_QR`` uses the sparse QR factorization algorithm
      in ``SuiteSparse``. It uses dense linear algebra and is multi
      threaded, so for large sparse matrices it is significantly
      faster than ``EIGEN_SPARSE_QR``.

   Neither ``EIGEN_SPARSE_QR`` nor ``SUITE_SPARSE_QR`` are capable of
   computing the covariance if the Jacobian is rank deficient.

.. member:: double Covariance::Options::min_reciprocal_condition_number

   Default: :math:`10^{-14}`

   If the Jacobian matrix is near singular, then inverting :math:`J'J`
   will result in unreliable results, e.g., if

   .. math::

      J = \begin{bmatrix}
            1.0 & 1.0 \\
            1.0 & 1.0000001
          \end{bmatrix}

   which is essentially a rank deficient matrix, we have

   .. math::

      (J'J)^{-1} = \begin{bmatrix}
                      2.0471e+14 & -2.0471e+14 \\
                     -2.0471e+14 &  2.0471e+14
                   \end{bmatrix}

   This is not a useful result. Therefore, by default
   :func:`Covariance::Compute` will return ``false`` if a rank
   deficient Jacobian is encountered. How rank deficiency is detected
   depends on the algorithm being used.

   1. ``DENSE_SVD``

      .. math:: \frac{\sigma_{\text{min}}}{\sigma_{\text{max}}} < \sqrt{\text{min_reciprocal_condition_number}}

      where :math:`\sigma_{\text{min}}` and
      :math:`\sigma_{\text{max}}` are the minimum and maximum
      singular values of :math:`J` respectively.

   2. ``EIGEN_SPARSE_QR`` and ``SUITE_SPARSE_QR``

      .. math:: \operatorname{rank}(J) < \operatorname{num\_col}(J)

      Here :math:`\operatorname{rank}(J)` is the estimate of the
      rank of :math:`J` returned by the sparse QR factorization
      algorithm. It is a fairly reliable indication of rank
      deficiency.

.. member:: int Covariance::Options::null_space_rank

   When using ``DENSE_SVD``, the user has more control in dealing
   with singular and near singular covariance matrices.

   As mentioned above, when the covariance matrix is near singular,
   instead of computing the inverse of :math:`J'J`, the Moore-Penrose
   pseudoinverse of :math:`J'J` should be computed.

   If :math:`J'J` has the eigen decomposition :math:`(\lambda_i,
   e_i)`, where :math:`\lambda_i` is the :math:`i^\textrm{th}`
   eigenvalue and :math:`e_i` is the corresponding eigenvector, then
   the inverse of :math:`J'J` is

   .. math:: (J'J)^{-1} = \sum_i \frac{1}{\lambda_i} e_i e_i'

   and computing the pseudo inverse involves dropping terms from this
   sum that correspond to small eigenvalues.

   How terms are dropped is controlled by
   `min_reciprocal_condition_number` and `null_space_rank`.

   If `null_space_rank` is non-negative, then the smallest
   `null_space_rank` eigenvalue/eigenvectors are dropped irrespective
   of the magnitude of :math:`\lambda_i`. If the ratio of the
   smallest non-zero eigenvalue to the largest eigenvalue in the
   truncated matrix is still below `min_reciprocal_condition_number`,
   then `Covariance::Compute()` will fail and return `false`.

   Setting `null_space_rank = -1` drops all terms for which

   .. math:: \frac{\lambda_i}{\lambda_{\textrm{max}}} < \textrm{min_reciprocal_condition_number}

   This option has no effect on ``EIGEN_SPARSE_QR`` and
   ``SUITE_SPARSE_QR``.

.. member:: bool Covariance::Options::apply_loss_function

   Default: `true`

   Even though the residual blocks in the problem may contain loss
   functions, setting ``apply_loss_function`` to false will turn off
   the application of the loss function to the output of the cost
   function and in turn its effect on the covariance.

.. class:: Covariance

   :class:`Covariance::Options`, as the name implies, is used to
   control the covariance estimation algorithm. Covariance estimation
   is a complicated and numerically sensitive procedure. Please read
   the entire documentation for :class:`Covariance::Options` before
   using :class:`Covariance`.

.. function:: bool Covariance::Compute(const vector<pair<const double*, const double*> >& covariance_blocks, Problem* problem)

   Compute a part of the covariance matrix.

   The vector ``covariance_blocks`` indexes into the covariance
   matrix block-wise using pairs of parameter blocks. This allows the
   covariance estimation algorithm to only compute and store these
   blocks.

   Since the covariance matrix is symmetric, if the user passes
   ``<block1, block2>``, then ``GetCovarianceBlock`` can be called with
   ``block1``, ``block2`` as well as ``block2``, ``block1``.

   ``covariance_blocks`` cannot contain duplicates. Bad things will
   happen if they do.

   Note that the list of ``covariance_blocks`` is only used to
   determine what parts of the covariance matrix are computed. The
   full Jacobian is used to do the computation, i.e. the list does not
   have an impact on what part of the Jacobian is used for
   computation.

   The return value indicates the success or failure of the covariance
   computation. Please see the documentation for
   :class:`Covariance::Options` for more on the conditions under which
   this function returns ``false``.

.. function:: bool GetCovarianceBlock(const double* parameter_block1, const double* parameter_block2, double* covariance_block) const

   Return the block of the covariance matrix corresponding to
   ``parameter_block1`` and ``parameter_block2``.

   ``Compute`` must be called before the first call to
   ``GetCovarianceBlock``, and the pair ``<parameter_block1,
   parameter_block2>`` OR the pair ``<parameter_block2,
   parameter_block1>`` must have been present in the vector
   ``covariance_blocks`` when ``Compute`` was called. Otherwise
   ``GetCovarianceBlock`` will return false.

   ``covariance_block`` must point to a memory location that can store
   a ``parameter_block1_size x parameter_block2_size`` matrix. The
   returned covariance will be a row-major matrix.

Example Usage
-------------

.. code-block:: c++

   double x[3];
   double y[2];

   Problem problem;
   problem.AddParameterBlock(x, 3);
   problem.AddParameterBlock(y, 2);
   <Build Problem>
   <Solve Problem>

   Covariance::Options options;
   Covariance covariance(options);

   vector<pair<const double*, const double*> > covariance_blocks;
   covariance_blocks.push_back(make_pair(x, x));
   covariance_blocks.push_back(make_pair(y, y));
   covariance_blocks.push_back(make_pair(x, y));

   CHECK(covariance.Compute(covariance_blocks, &problem));

   double covariance_xx[3 * 3];
   double covariance_yy[2 * 2];
   double covariance_xy[3 * 2];
   covariance.GetCovarianceBlock(x, x, covariance_xx);
   covariance.GetCovarianceBlock(y, y, covariance_yy);
   covariance.GetCovarianceBlock(x, y, covariance_xy);
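
The blocks written by ``GetCovarianceBlock`` are stored in row-major
order. Continuing the example above, one way to work with them, assuming
``Eigen`` is available (this mapping is a sketch and not part of the
Ceres API), is:

.. code-block:: c++

   #include <cmath>
   #include "Eigen/Core"

   // Interpret the row-major buffers filled above without copying.
   Eigen::Map<const Eigen::Matrix<double, 3, 3, Eigen::RowMajor> >
       cov_xx(covariance_xx);
   Eigen::Map<const Eigen::Matrix<double, 3, 2, Eigen::RowMajor> >
       cov_xy(covariance_xy);

   // e.g., the marginal standard deviation of x[0].
   const double sigma_x0 = std::sqrt(cov_xx(0, 0));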