Home | History | Annotate | Download | only in docs
      1 # Build Performance
      2 
      3 ## Debugging Build Performance
      4 
      5 ### Tracing
      6 
      7 soong_ui has tracing built in, so that every build execution's trace can be
      8 viewed.  Just open `$OUT_DIR/build.trace.gz` in Chrome's <chrome://tracing>, or
      9 with [catapult's trace viewer][catapult trace_viewer]. The last few traces are
     10 stored in `build.trace.#.gz` (larger numbers are older). The associated logs
     11 are stored in `soong.#.log`.
     12 
     13 ![trace example](./trace_example.png)
     14 
     15 ### Soong
     16 
     17 Soong can be traced and profiled using the standard Go tools. It understands
     18 the `-cpuprofile`, `-trace`, and `-memprofile` command line arguments, but we
     19 don't currently have an easy way to enable them in the context of a full build.
     20 
     21 ### Kati
     22 
     23 In general, the slow path of reading Android.mk files isn't particularly
     24 performance sensitive, since it doesn't need to happen on every build. It is
     25 important for the fast-path (detecting whether it needs to regenerate the ninja
     26 file) to be fast however. And it shouldn't hit the slow path too often -- so
     27 don't rely on output of a `$(shell)` command that includes the current timestamp,
     28 or read a file that's going to change on every build.
     29 
     30 #### Regen check is slow
     31 
     32 In most cases, we've found that the fast-path is slow because all of the
     33 `$(shell)` commands need to be re-executed to determine if their output changed.
     34 The `$OUT_DIR/soong.log` contains statistics from the regen check:
     35 
     36 ```
     37 .../kati.go:127: *kati*: regen check time: 1.699207
     38 .../kati.go:127: *kati*: glob time (regen): 0.377193 / 33609
     39 .../kati.go:127: *kati*: shell time (regen): 1.313529 / 184
     40 .../kati.go:127: *kati*:  0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort
     41 .../kati.go:127: *kati*:  0.105 cd packages/apps/Dialer ; find -L . -type d -name "res"
     42 .../kati.go:127: *kati*:  0.035 find device vendor -maxdepth 4 -name '*_aux_variant_config.mk' -o -name '*_aux_os_config.mk' | sort
     43 .../kati.go:127: *kati*:  0.029 cd frameworks/base ; find -L  core/java graphics/java location/java media/java media/mca/effect/java media/mca/filterfw/java media/mca/filterpacks/java drm/java opengl/java sax/java telecomm/java telephony/java wifi/java lowpan/java keystore/java rs/java ../opt/telephony/src/java/android/telephony ../opt/telephony/src/java/android/telephony/gsm ../opt/net/voip/src/java/android/net/rtp ../opt/net/voip/src/java/android/net/sip   -name "*.html" -and -not -name ".*"
     44 .../kati.go:127: *kati*:  0.025 test -d device && find -L device -maxdepth 4 -path '*/marlin/BoardConfig.mk'
     45 .../kati.go:127: *kati*:  0.023 find packages/apps/Settings/tests/robotests -type f -name '*Test.java' | sed -e 's!.*\(com/google.*Test\)\.java!\1!' -e 's!.*\(com/android.*Test\)\.java!\1!' | sed 's!/!\.!g' | cat
     46 .../kati.go:127: *kati*:  0.022 test -d vendor && find -L vendor -maxdepth 4 -path '*/marlin/BoardConfig.mk'
     47 .../kati.go:127: *kati*:  0.017 cd cts/tests/tests/shortcutmanager/packages/launchermanifest ; find -L  ../src -name "*.java" -and -not -name ".*"
     48 .../kati.go:127: *kati*:  0.016 cd cts/tests/tests/shortcutmanager/packages/launchermanifest ; find -L  ../../common/src -name "*.java" -and -not -name ".*"
     49 .../kati.go:127: *kati*:  0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist
     50 .../kati.go:127: *kati*: stat time (regen): 0.250384 / 4405
     51 ```
     52 
     53 In this case, the total time spent checking was 1.69 seconds, even though the
     54 other "(regen)" numbers add up to more than that (some parts are parallelized
     55 where possible). The biggest contributor is the `$(shell)` times -- 184
     56 executions took a total of 1.31 seconds. The top 10 longest shell functions are
     57 printed.
     58 
     59 All the longest commands in this case are all variants of a call to `find`, but
     60 this is where using pure make functions instead of calling out to the shell can
     61 make a performance impact -- many calls to check if `26 > 20` can add up. We've
     62 added some basic math functions in `math.mk` to help some common use cases that
     63 used to be rather expensive when they were used too often.
     64 
     65 There are some optimizations in place for find commands -- if Kati can
     66 understand the find command, the built-in find emulator can turn some of them
     67 into glob or stat checks (falling back to calling `find` if one of those imply
     68 that the output may change). Many of the common macros produce find commands
     69 that Kati can understand, but if you're writing your own, you may want to
     70 experiment with other options if they're showing up in this list. For example,
     71 if this was significantly more expensive (either in runtime, or was called
     72 often):
     73 
     74 ```
     75 .../kati.go:127: *kati*:  0.015 cd libcore && (find luni/src/test/java -name "*.java" 2> /dev/null) | grep -v -f java_tests_blacklist
     76 ```
     77 
     78 It may be more efficient to move the grep into make, so that the `find` portion
     79 can be rewritten and cached:
     80 
     81 ```
     82 $(filter-out $(file <$(LOCAL_PATH)/java_tests_blacklist),$(call all-java-files-under,luni/src/test/java))
     83 ```
     84 
     85 Others can be simplified by just switching to an equivalent find command that
     86 Kati understands:
     87 
     88 ```
     89 .../kati.go:127: *kati*:  0.217 find device vendor -type f -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop | sort
     90 ```
     91 
     92 By adding the implicit `-a` and moving the `| sort` to Make, this can now be
     93 cached by Kati:
     94 
     95 ```
     96 $(sort $(shell find device vendor -type -f -a -name \*.pk8 -o -name verifiedboot\* -o -name \*.x509.pem -o -name oem\*.prop))
     97 ```
     98 
     99 Kati is learning about the implicit `-a` in [this change](https://github.com/google/kati/pull/132)
    100 
    101 #### Kati regens too often
    102 
    103 Kati prints out what triggered the slow path to be taken -- this can be a
    104 changed file, a changed environment variable, or different output from a
    105 `$(shell)` command:
    106 
    107 ```
    108 out/soong/Android-aosp_arm.mk was modified, regenerating...
    109 ```
    110 
    111 The state is stored in `$OUT_DIR/.kati_stamp*` files, and can be (partially)
    112 read with the `ckati_stamp_dump` tool in prebuilts/build-tools. More debugging
    113 is available when ckati is run with `--regen_debug`, but that can be a lot of
    114 data to understand.
    115 
    116 ### Ninja
    117 
    118 #### Understanding why something rebuilt
    119 
    120 Add `NINJA_ARGS="-d explain"` to your environment before a build, this will cause
    121 ninja to print out explanations on why actions were taken. Start reading from the
    122 beginning, as this much data can be hard to read:
    123 
    124 ```
    125 $ cd art
    126 $ mma
    127 $ touch runtime/jit/profile_compilation_info.h
    128 $ NINJA_ARGS="-d explain" mma
    129 ...
    130 ninja explain: output out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o older than most recent input art/runtime/jit/profile_compilation_info.h (
    131 1516683538 vs 1516685188)
    132 ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/obj/art/tools/cpp-define-generator/main.o is dirty
    133 ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-data/linux_glibc_x86_64/cpp-define-generator-data is dirty
    134 ninja explain: out/soong/host/linux-x86/bin/cpp-define-generator-data is dirty
    135 ninja explain: out/soong/.intermediates/art/tools/cpp-define-generator/cpp-define-generator-asm-support/gen/asm_support_gen.h is dirty
    136 ninja explain: out/soong/.intermediates/art/cmdline/art_cmdline_tests/android_arm_armv7-a_core_cmdline_parser_test/obj/art/cmdline/cmdline_parser_test.o is dirty
    137 ...
    138 ```
    139 
    140 In this case, art/cmdline/cmdline_parser_test.o was rebuilt because it uses
    141 asm_support_gen.h, which was generated by cpp-define-generator-data, which uses
    142 profile_compilation_info.h.
    143 
    144 You'll likely need to cross-reference this data against the build graph in the
    145 various .ninja files. The files are (mostly) human-readable, but a (slow) web
    146 interface can be used by running `NINJA_ARGS="-t browse <target>" m`.
    147 
    148 #### Builds take a long time
    149 
    150 If the long part in the trace view of a build is a relatively solid block, then
    151 the performance is probably more related to how much time the actual build
    152 commands are taking than having extra dependencies, or slowdowns in
    153 soong/kati/ninja themselves.
    154 
    155 Beyond looking at visible outliers in the trace view, we don't have any tooling
    156 to help in this area yet. It's possible to aggregate some of the raw data
    157 together, but since our builds are heavily parallelized, it's particularly easy
    158 for build commands to impact unrelated build commands. This is an area we'd
    159 like to improve -- we expect keeping track of user/system time per-action would
    160 provide more reliable data, but tracking some full-system data (memory/swap
    161 use, disk bandwidth, etc) may also be necessary.
    162 
    163 ## Known Issues
    164 
    165 ### Common
    166 
    167 #### mm
    168 
    169 Soong always loads the entire module graph, so as modules convert from Make to
    170 Soong, `mm` is becoming closer to `mma`. This produces more correct builds, but
    171 does slow down builds, as we need to verify/produce/load a larger build graph.
    172 
    173 We're exploring a few options to speed up build startup, one being [an
    174 experimental set of ninja patches][ninja parse optimization],
    175 though that's not the current path we're working towards.
    176 
    177 ### Android 8.1 (Oreo MR1)
    178 
    179 In some cases, a tree would get into a state where Soong would be run twice on
    180 every incremental build, even if there was nothing to do. This was fixed in
    181 master with [these changes][blueprint_microfactory], but they were too
    182 significant to backport at the time. And while they fix this particular issue,
    183 they appear to cause ninja to spend more time during every build loading the
    184 `.ninja_log` / `.ninja_deps` files, especially as they become larger.
    185 
    186 A workaround to get out of this state is to remove the build.ninja entry from
    187 `$OUT_DIR/.ninja_log`:
    188 
    189 ```
    190 sed -i "/\/build.ninja/d" $(get_build_var OUT_DIR)/.ninja_log
    191 ```
    192 
    193 [catapult trace_viewer]: https://github.com/catapult-project/catapult/blob/master/tracing/README.md
    194 [ninja parse optimization]: https://android-review.googlesource.com/c/platform/external/ninja/+/461005
    195 [blueprint_microfactory]: https://android-review.googlesource.com/q/topic:%22blueprint_microfactory%22+status:merged
    196