# SafetyNet - Performance regression detection for PDFium

[TOC]

This document explains how to use SafetyNet to detect performance regressions
in PDFium.

## Comparing performance of two versions of PDFium

safetynet_compare.py is a script that compares the performance of two
versions of PDFium. It can be used to verify whether a given change has caused,
or will cause, any positive or negative change in performance for a set of test
cases.

The supported profilers are exclusive to Linux, so for now this can only be run
on Linux.

An illustrative example is below, comparing the local code version to an older
version. A positive % change means an increase in the time/instructions needed
to run the test - a regression - while a negative % change means a decrease in
time/instructions, and therefore an improvement.

```
$ testing/tools/safetynet_compare.py ~/test_pdfs --branch-before beef5e4
================================================================================
   % Change      Time after  Test case
--------------------------------------------------------------------------------
   -0.1980%  45,703,820,326  ~/test_pdfs/PDF Reference 1-7.pdf
   -0.5678%      42,038,814  ~/test_pdfs/Page 24 - PDF Reference 1-7.pdf
   +0.2666%  10,983,158,809  ~/test_pdfs/Rival.pdf
   +0.0447%  10,413,890,748  ~/test_pdfs/dynamic.pdf
   -7.7228%      26,161,171  ~/test_pdfs/encrypted1234.pdf
   -0.2763%     102,084,398  ~/test_pdfs/ghost.pdf
   -3.7005%  10,800,642,262  ~/test_pdfs/musician.pdf
   -0.2266%  45,691,618,789  ~/test_pdfs/no_metadata.pdf
   +1.4440%  38,442,606,162  ~/test_pdfs/test7.pdf
   +0.0335%       9,286,083  ~/test_pdfs/testbulletpoint.pdf
================================================================================
Test cases run: 10
Failed to measure: 0
Regressions: 0
Improvements: 2
```
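
The % change column is the after measurement relative to the before
measurement, expressed as a percentage. As a minimal sketch (the function
below is illustrative, not the script's actual code):

```python
def format_change(before, after):
    """Format an after-vs-before measurement as a % change string.

    Positive means the change under test made things slower
    (a regression); negative means an improvement.
    """
    ratio = float(after) / before - 1.0
    return '%+.4f%%' % (ratio * 100)

# 102 units after vs 100 before is a +2% regression.
print(format_change(100, 102))  # → +2.0000%
```

Whether a given % change is counted as a regression or an improvement in the
summary depends on the significance threshold described under Other Options.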

### Usage

Run the safetynet_compare.py script in testing/tools to perform a comparison.
Pass one or more paths with test cases - each path can be either a .pdf file or
a directory containing .pdf files. Other files in those directories are
ignored.

The following comparison modes are supported:

1. Compare uncommitted changes against the clean branch:
```shell
$ testing/tools/safetynet_compare.py path/to/pdfs
```

2. Compare the current branch with another branch or commit:
```shell
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-before another_branch
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-before 1a3c5e7
```

3. Compare two other branches or commits:
```shell
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-after another_branch --branch-before yet_another_branch
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-after 1a3c5e7 --branch-before 0b2d4f6
$ testing/tools/safetynet_compare.py path/to/pdfs --branch-after another_branch --branch-before 0b2d4f6
```

4. Compare two build flag configurations:
```shell
$ gn args out/BuildConfig1
$ gn args out/BuildConfig2
$ testing/tools/safetynet_compare.py path/to/pdfs --build-dir out/BuildConfig2 --build-dir-before out/BuildConfig1
```

safetynet_compare.py takes care of checking out the appropriate branch, building
it, running the test cases, and comparing the results.

### Profilers

safetynet_compare.py uses callgrind as its profiler by default. Use --profiler
to specify another one. The supported profilers are:

#### perfstat

Only works on Linux.
Make sure perf is installed by typing in a terminal:
```shell
$ perf
```

This is a fast profiler, but it uses sampling, so it is slightly inaccurate.
Expect variations of up to 1%, which is below the cutoff to consider a
change significant.

Use this when running over large test sets to get good enough results.
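
Under the hood, a perf-based run amounts to wrapping the test harness in
`perf stat` and counting an event such as retired instructions. A sketch of
assembling such a command line (the harness path and PDF name are made-up
placeholders; `perf stat -e instructions -- <cmd>` is standard perf usage):

```python
def build_perf_stat_command(harness, pdf_path):
    # perf stat -e instructions -- <cmd> runs <cmd> and reports how many
    # instructions it retired; harness/pdf_path here are placeholders.
    return ['perf', 'stat', '-e', 'instructions', '--', harness, pdf_path]

cmd = build_perf_stat_command('./out/Release/pdfium_test', 'test.pdf')
print(' '.join(cmd))
```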

#### callgrind

Only works on Linux.
Make sure valgrind is installed:
```shell
$ valgrind
```

This is a slow but accurate profiler. Expect variations of around 100
instructions. However, it takes about 50 times longer to run than perf stat.

Use this when looking for small variations (< 1%).

One advantage is that callgrind can generate `callgrind.out` files (by passing
--output-dir to safetynet_compare.py), which contain profiling information that
can be analyzed to find the cause of a regression. KCachegrind is a good
visualizer for these files.
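
The `callgrind.out` files are plain text; in the callgrind format, an
`events:` header names the counters and a `summary:` line holds the totals.
A minimal sketch of pulling the total count out of such a dump (the dump
below is a heavily abbreviated, synthetic example):

```python
def callgrind_total(lines):
    """Return the first event total from a callgrind.out dump's summary line."""
    for line in lines:
        if line.startswith('summary:'):
            # The first field after 'summary:' is the total for the first
            # event listed in the 'events:' header (Ir = instructions read).
            return int(line.split()[1])
    return None  # no summary line found

# Synthetic, abbreviated callgrind.out contents.
dump = [
    'events: Ir',
    'fn=main',
    '0 26161171',
    'summary: 26161171',
]
print(callgrind_total(dump))  # → 26161171
```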

### Common Options

Arguments commonly passed to safetynet_compare.py.

* --profiler: described above.
* --build-dir: specifies the build config with a relative path from the
PDFium src directory to the build directory. Defaults to out/Release.
* --output-dir: where to place the profiling output files. These are
callgrind.out.[test_case] files for callgrind; perfstat does not produce them.
By default they are not written.
* --case-order: sort test case results according to this metric. Can be "after",
"before", "ratio" or "rating". If not specified, sort by path.
* --this-repo: use the repository where the script is located instead of
checking out a temporary one. This is faster and does not require downloads.
Although it restores the state of the local repo, if the script is killed or
crashes, uncommitted changes may remain stashed and you may be left on another
branch.

### Other Options

Most of the time these don't need to be used.

* --build-dir-before: if comparing different build dirs (say, to test what a
flag flip does), specify the build dir for the before branch here and the
build dir for the after branch with --build-dir.
* --interesting-section: measure only the interesting section instead of the
entire execution of the test harness. This only works in debug, since in
release the delimiters are stripped out. It also cannot be used to compare
branches that don't have the callgrind delimiters, as it would be unfair to
compare a whole run against only the interesting section of another run.
* --machine-readable: output the results as JSON, which is easier for code to
parse.
* --num-workers: how many workers to use to parallelize test case runs.
Defaults to the number of CPUs on the machine.
* --threshold-significant: highlight differences that exceed this value.
Defaults to 0.02.
* --tmp-dir: directory in which temporary repos will be cloned and downloads
will be cached, if --this-repo is not enabled. Defaults to /tmp.

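
The Regressions and Improvements counts in the comparison summary come from
applying the --threshold-significant cutoff to each test case's ratio. A
sketch of that classification (names are illustrative; 0.02 is the default
cutoff):

```python
def classify(ratio, threshold=0.02):
    # ratio is after/before - 1, e.g. -0.077228 for the -7.7228% case
    # in the example output above.
    if ratio > threshold:
        return 'regression'
    if ratio < -threshold:
        return 'improvement'
    return 'not significant'

print(classify(-0.077228))  # → improvement
print(classify(0.014440))   # → not significant (+1.444% is under 2%)
```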
## Set up a nightly job

TODO: Complete with safetynet_job.py setup and usage.