Home | History | Annotate | Download | only in binary_search_tool
      1 This document is for future maintainers of the binary search/bisection tools.
      2 
      3 Authors:
      4   * Original Tool: asharif@, llozano@, cmtice@
      5   * Updates after May 2016: cburden@
      6   * chromeos-toolchain@
      7 
      8 The following are good reference materials on how the tool works:
      9   * Ahmad's original presentation:
     10     https://goto.google.com/zxdfyi
     11 
     12   * Bisection tool update design doc:
     13     https://goto.google.com/zcwei
     14 
     15   * Bisection tool webpage:
     16     https://goto.google.com/ruwpyi
     17 
     18   * Compiler wrapper webpage:
     19     https://goto.google.com/xossn
     20 
     21 
     22 TESTING:
     23 All unit tests live under the ./test directory. However, these tests
     24 specifically test binary_search_state.py, binary_search_perforce.py, bisect.py.
     25 These unit tests will not test the specific logic for ChromeOS/Android
     26 bisection. To test the ChromeOS/Android bisectors, use the common/hash_test.sh
     27 test. This is a simple test case that just checks the hashes of files on your
     28 file system. This means you won't have to find a specific compiler error for
     29 the bisector to triage in order to test each bisector.
     30 
     31 TODO:
     32 The bisection tool (I believe) is in a fairly good state. So these are mostly
     33 wishlist items and things that could use some improvement.
     34 
     35   1. Get rid of binary_search_perforce.py. This file is mostly legacy code and
     36      the majority of it isn't even used to bisect object files. The file was
     37      originally intended to bisect CLs, and binary_search_state.py just reused
     38      the binary searching logic from it. Maybe just extract the binary searching
     39      logic from binary_search_perforce.py and put it in its own module in
     40      cros_utils?
     41 
     42   2. Cleanup unit tests in ./test. These tests are a little hacked together,
     43      and are all under one test suite. Maybe consider organizing them across
     44      multiple directories.
     45 
     46   3. Create a "checkout setup" system for bisection. Currently if you want to
     47      bisect, you have to run scripts/edit sources in this repo. Ideally these
     48      scripts would be static, and if you wanted to bisect/make changes you would
     49      "checkout" or copy all the scripts to a working directory and have a unique
     50      working directory for each bisection. Credits to Luis for this idea =)
     51 
     52   4. Make all scripts relative to each other. Currently all scripts enforce the
     53      idea that their cwd will be ./binary_search_tool/. But it would be less
     54      confusing to have each script relative to each other. There's quite a few
     55      stackoverflow topics on how to do this best, but each one has some sort of
     56      downside or flaw.
     57 
     58   5. Overall modularize code more, especially in binary_search_state.py
     59 
     60 DESIGN EXPLANATIONS:
     61 Some of the design decisions are a bit difficult to understand from just reading
     62 the code unfortunately. I will attempt to clear up the major offenders of this:
     63 
     64   1. common.py's argument dictionary:
     65      binary_search_state.py and bisect.py both have to have near identical
     66      arguments in order to support argument overriding in bisect.py. However
     67      they do have to be slightly different. Mainly, bisect.py needs to have no
     68      default values for arguments (so it can determine what's being overriden).
     69 
     70      In order to reduce huge amounts of code duplication for the argument
     71      building, we put argument building in common.py. That way both modules
     72      can reference the arguments, and they can have different configurations
     73      across both.
     74 
     75   2. Compiler wrapper:
     76      The compiler wrapper is called before all compiler calls. It exists to
     77      trick whatever build system (make, emerge, etc.) into thinking our
     78      bisection is just a normal build, when really we're doing some tricks.
     79 
     80      The biggest benefit the compiler wrapper gives is: knowing for sure which
     81      files are actually generated by the compiler during bisection setup, and
     82      potentially being able to skip compilations while triaging (speeding up the
     83      triaging process significantly).
     84 
     85   3. The weird options for the --verify, --verbose, --file_args, etc. arguments:
     86      Some of the arguments for the bisection tool have a weird set of options
     87      for the AddArgument method (nargs, const, default, StrToBool). This is so
     88      we can make argument overriding workable. These options allow the following
     89      functionality for a boolean argument (using --prune as an example):
     90        * --prune (prune set to True)
     91        * <not given> (prune set to False)
     92        * --prune=True (prune set to True)
     93        * --prune=False (prune set to False)
     94 
     95      The first two are easy to implement (action='store_true'), but the last two
     96      are why the extra weird arguments are required. Now, why would we want the
     97      last two? Imagine if the Android bisector set --prune=True as a default
     98      argument. With just the first two options above it would be impossible for
     99      the user to override prune and set it to False. So the user needs the
    100      --prune=False option. See the argparse documentation for more details.
    101 
    102   4. General binary searching logic/pruning logic:
    103      binary_search_state.py will enumerate all items into a list. The binary
    104      search will find the *first* bad item (starting with lowest index).
    105      Everything to the left of the "current" index is switched to good,
    106      everything to right of the "current" index is switched to bad. Once a bad
    107      item is found, it's put at the very end of the list.
    108 
    109      If prune is set, the tool will continuing searching until all bad items are
    110      found (instead of stopping after the first one). If the tool finds the same
    111      item twice, that means no more bad items exist. This is because the item
    112      was found, said item was put at the end of the list, and it was found
    113      again. Because the binary search logic finds the bad item with the lowest
    114      index, this means nothing in between the start of the list and the end of
    115      the list is bad (thus no more bad items remain).
    116