Home | History | Annotate | Download | only in kati
      1 Kati internals
      2 ==============
      3 
      4 This is an informal document about internals of kati. This document is not meant
      5 to be a comprehensive document of kati or GNU make. This explains some random
      6 topics which other programmers may be interested in.
      7 
      8 Motivation
      9 ----------
     10 
     11 The motivation of kati was to speed up Android platform build. Especially, its
     12 incremental build time was the main focus. Android platform's build system is a
     13 very unique system. It provides a DSL, (ab)using Turing-completeness of GNU
     14 make. The DSL allows developers to write build rules in a descriptive way, but
     15 the downside is it's complicated and slow.
     16 
     17 When we say a build system is slow, we consider "null build" and "full
     18 build". Null build is a build which does nothing, because all output files are
     19 already up-to-date. Full build is a build which builds everything, because there
     20 were nothing which have been already built. Actual builds in daily development
     21 are somewhere between null build and full build. Most benchmarks below were done
     22 for null build.
     23 
     24 For Android with my fairly beefy workstation, null build took ~100 secs with GNU
     25 make. This means you needed to wait ~100 secs to see if there's a compile error
     26 when you changed a single C file. To be fair, things were not that bad. There
     27 are tools called mm/mmm. They allow developers to build an individual module. As
     28 they ignore dependencies between modules, they are fast. However, you need to be
     29 somewhat experienced to use them properly. You should know which modules will be
     30 affected by your change. It would be nicer if you can just type "make" whenever
     31 you change something.
     32 
     33 This is why we started this project. We decided to create a GNU make clone from
     34 scratch, but there were some other options. One option was to replace all
     35 Android.mk by files with a better format. There is actually a longer-term
     36 project for this. Kati was planned to be a short-term project. Another option
     37 was to hack GNU make instead of developing a clone. We didn't take this option
     38 because we thought the source code of GNU make is somewhat complicated due to
     39 historical reason. It's written in old-style C, has a lot of ifdefs for some
     40 unknown architectures, etc.
     41 
     42 Currently, kati's main mode is --ninja mode. Instead of executing build commands
     43 by itself, kati generates build.ninja file and
     44 [ninja](https://github.com/martine/ninja) actually runs commands. There were
     45 some back-and-forths before kati became the current form. Some experiments
     46 succeeded and some others failed. We even changed the language for kati. At
     47 first, we wrote kati in Go. We naively expected we can get enough performance
     48 with Go. I guessed at least one of the following statements are true: 1. GNU
     49 make is not very optimized for computation heavy Makefiles, 2. Go is fast for
     50 our purpose, or 3. we can come up with some optimization tricks for Android's
     51 build system. As for 3, some of such optimization succeeded but it's performance
     52 gain didn't cancel the slowness of Go.
     53 
     54 Go's performance would be somewhat interesting topic. I didn't study the
     55 performance difference in detail, but it seemed both our use of Go and Go
     56 language itself were making the Go version of kati slower. As for our fault, I
     57 think Go version has more unnecessary string allocations than C++ version
     58 has. As for Go itself, it seemed GC was the main show-stopper. For example,
     59 Android's build system defines about one million make variables, and buffers for
     60 them will be never freed. IIRC, this kind of allocation pattern isn't good for
     61 non-generational GC.
     62 
     63 Go version and test cases were written by ukai and me, and C++ rewrite was done
     64 mostly by me. The rest of this document is mostly about the C++ version.
     65 
     66 Overall architecture
     67 --------------------
     68 
     69 Kati consists of the following components:
     70 
     71 * Parser
     72 * Evaluator
     73 * Dependency builder
     74 * Executor
     75 * Ninja generator
     76 
     77 A Makefile has some statements which consist of zero or more expressions. There
     78 are two parsers and two evaluators - one for statements and the other for
     79 expressions.
     80 
     81 Most of users of GNU make may not care about the evaluator much. However, GNU
     82 make's evaluator is very powerful and is Turing-complete. For Android's null
     83 build, most time is spent in this phase. Other tasks, such as building
     84 dependency graphs and calling stat function for build targets, are not the
     85 bottleneck. This would be a very Android specific characteristics. Android's
     86 build system uses a lot of GNU make black magics.
     87 
     88 The evaluator outputs a list of build rules and a variable table. The dependency
     89 builder creates a dependency graph from the list of build rules. Note this step
     90 doesn't use the variable table.
     91 
     92 Then either executor or ninja generator will be used. Either way, kati runs its
     93 evaluator again for command lines. The variable table is used again for this
     94 step.
     95 
     96 We'll look at each components closely. GNU make is a somewhat different language
     97 from modern languages. Let's see.
     98 
     99 Parser for statements
    100 ---------------------
    101 
    102 I'm not 100% sure, but I think GNU make parses and evaluates Makefiles
    103 simultaneously, but kati has two phases for parsing and evaluation. The reason
    104 of this design is for performance. For Android build, kati (or GNU make) needs
    105 to read ~3k files ~50k times. The file which is read most often is read ~5k
    106 times. It's waste of time to parse such files again and again. Kati can re-use
    107 parsed results when it needs to evaluate a Makefile second time. If we stop
    108 caching the parsed results, kati will be two times slower for Android's
    109 build. Caching parsed statements is done in *file_cache.cc*.
    110 
    111 The statement parser is defined in *parser.cc*. In kati, there are four kinds of
    112 statements:
    113 
    114 * Rules
    115 * Assignments
    116 * Commands
    117 * Make directives
    118 
    119 Data structures for them are defined in *stmt.h*. Here are examples of these
    120 statements:
    121 
    122     VAR := yay!      # An assignment
    123     all:             # A rule
    124     	echo $(VAR)  # A command
    125     include xxx.mk   # A make directive (include)
    126 
    127 In addition to include directive, there are ifeq/ifneq/ifdef/ifndef directives
    128 and export/unexport directives. Also, kati internally uses "parse error
    129 statement". As GNU make doesn't show parse errors in branches which are not
    130 taken, we need to delay parse errors to evaluation time.
    131 
    132 ### Context dependent parser
    133 
    134 A tricky point of parsing make statements is that the parsing depends on the
    135 context of the evaluation. See the following Makefile chunk for example:
    136 
    137     $(VAR)
    138     	X=hoge echo $${X}
    139 
    140 You cannot tell whether the second line is a command or an assignment until
    141 *$(VAR)* is evaluated. If *$(VAR)* is a rule statement, the second line is a
    142 command and otherwise it's an assignment. If the previous line is
    143 
    144     VAR := target:
    145 
    146 the second line will turn out to be a command.
    147 
    148 For some reason, GNU make expands expressions before it decides the type of
    149 a statement only for rules. Storing assignments or directives in a variable
    150 won't work as assignments or directives. For example
    151 
    152     ASSIGN := A=B
    153     $(ASSIGN):
    154 
    155 doesn't assign "*B:*" to *A*, but defines a build rule whose target is *A=B*.
    156 
    157 Anyway, as a line starts with a tab character can be either a command statement
    158 or other statements depending on the evaluation result of the previous line,
    159 sometimes kati's parser cannot tell the statement type of a line. In this case,
    160 kati's parser speculatively creates a command statement object, keeping the
    161 original line. If it turns out the line is actually not a command statement,
    162 the evaluator re-runs the parser.
    163 
    164 ### Line concatenations and comments
    165 
    166 In most programming languages, line concatenations by a backslash character and
    167 comments are handled at a very early stage of a language
    168 implementation. However, GNU make changes the behavior for them depending on
    169 parse/eval context. For example, the following Makefile outputs "has space" and
    170 "hasnospace":
    171 
    172     VAR := has\
    173     space
    174     all:
    175     	echo $(VAR)
    176     	echo has\
    177     nospace
    178 
    179 GNU make usually inserts a whitespace between lines, but for command lines it
    180 doesn't. As we've seen in the previous subsection, sometimes kati cannot tell
    181 a line is a command statement or not. This means we should handle them after
    182 evaluating statements. Similar discussion applies for comments. GNU make usually
    183 trims characters after '#', but it does nothing for '#' in command lines.
    184 
    185 We have a bunch of comment/backslash related testcases in the testcase directory
    186 of kati's repository.
    187 
    188 Parser for expressions
    189 ----------------------
    190 
    191 A statement may have one or more expressions. The number of expressions in a
    192 statement depends on the statement's type. For example,
    193 
    194     A := $(X)
    195 
    196 This is an assignment statement, which has two expressions - *A* and
    197 *$(X)*. Types of expressions and their parser are defined in *expr.cc*. Like
    198 other programming languages, an expression is a tree of expressions. The type of
    199 a leaf expression is either literal, variable reference,
    200 [substitution references](http://www.gnu.org/software/make/manual/make.html#Substitution-Refs),
    201 or make functions.
    202 
    203 As written, backslashes and comments change their behavior depending on the
    204 context. Kati handles them in this phase. *ParseExprOpt* is the enum for the
    205 contexts.
    206 
    207 As a nature of old systems, GNU make is very permissive. For some reason, it
    208 allows some kind of unmatched pairs of parentheses. For example, GNU make
    209 doesn't think *$($(foo)* is an error - this is a reference to variable
    210 *$(foo*. If you have some experiences with parsers, you may wonder how one can
    211 implement a parser which allows such expressions. It seems GNU make
    212 intentionally allows this:
    213 
    214 http://git.savannah.gnu.org/cgit/make.git/tree/expand.c#n285
    215 
    216 No one won't use this feature intentionally. However, as GNU make allows this,
    217 some Makefiles have unmatched parentheses, so kati shouldn't raise an error for
    218 them, unfortunately.
    219 
    220 GNU make has a bunch of functions. Most users would use only simple ones such as
    221 *$(wildcard ...)* and *$(subst ...)*. There are also more complex functions such
    222 as *$(if ...)* and *$(call ...)*, which make GNU make Turing-complete. Make
    223 functions are defined in *func.cc*. Though *func.cc* is not short, the
    224 implementation is fairly simple. There is only one weirdness I remember around
    225 functions. GNU make slightly changes its parsing for *$(if ...)*, *$(and ...)*,
    226 and *$(or ...)*. See *trim_space* and *trim_right_space_1st* in *func.h* and how
    227 they are used in *expr.cc*.
    228 
    229 Evaluator for statements
    230 ------------------------
    231 
    232 Evaluator for statements are defined in *eval.cc*. As written, there are four
    233 kinds of statements:
    234 
    235 * Rules
    236 * Assignments
    237 * Commands
    238 * Make directives
    239 
    240 There is nothing tricky around commands and make directives. A rule statement
    241 have some forms and should be parsed after evaluating expression by the third
    242 parser. This will be discussed in the next section.
    243 
    244 Assignments in GNU make is tricky a bit. There are two kinds of variables in GNU
    245 make - simple variables and recursive variables. See the following code snippet:
    246 
    247     A = $(info world!)   # recursive
    248     B := $(info Hello,)  # simple
    249     $(A)
    250     $(B)
    251 
    252 This code outputs "Hello," and "world!", in this order. The evaluation of
    253 a recursive variable is delayed until the variable is referenced. So the first
    254 line, which is an assignment of a recursive variable, outputs nothing. The
    255 content of the variable *$(A)* will be *$(info world!)* after the first
    256 line. The assignment in the second line uses *:=* which means this is a simple
    257 variable assignment. For simple variables, the right hand side is evaluated
    258 immediately. So "Hello," will be output and the value of *$(B)* will be an empty
    259 string ($(info ...) returns an empty string). Then, "world!" will be shown when
    260 the third line is evaluated as *$(A)* is evaluated, and lastly the forth line
    261 does nothing, as *$(B)* is an empty string.
    262 
    263 There are two more kinds of assignments (i.e., *+=* and *?=*). These assignments
    264 keep the type of the original variable. Evaluation of them will be done
    265 immediately only when the left hand side of the assignment is already defined
    266 and is a simple variable.
    267 
    268 Parser for rules
    269 ----------------
    270 
    271 After evaluating a rule statement, kati needs to parse the evaluated result. A
    272 rule statement can actually be the following four things:
    273 
    274 * A rule
    275 * A [target specific variable](http://www.gnu.org/software/make/manual/make.html#Target_002dspecific)
    276 * An empty line
    277 * An error (there're non-whitespace characters without a colon)
    278 
    279 Parsing them is mostly done in *rule.cc*.
    280 
    281 ### Rules
    282 
    283 A rule is something like *all: hello.exe*. You should be familiar with it. There
    284 are several kinds of rules such as pattern rules, double colon rules, and order
    285 only dependencies, but they don't complicate the rule parser.
    286 
    287 A feature which complicates the parser is semicolon. You can write the first
    288 build command on the same line as the rule. For example,
    289 
    290     target:
    291     	echo hi!
    292 
    293 and
    294 
    295     target: ; echo hi!
    296 
    297 have the same meaning. This is tricky because kati shouldn't evaluate expressions
    298 in a command until the command is actually invoked. As a semicolon can appear as
    299 the result of expression evaluation, there are some corner cases. A tricky
    300 example:
    301 
    302     all: $(info foo) ; $(info bar)
    303     $(info baz)
    304 
    305 should output *foo*, *baz*, and then *bar*, in this order, but
    306 
    307     VAR := all: $(info foo) ; $(info bar)
    308     $(VAR)
    309     $(info baz)
    310 
    311 outputs *foo*, *bar*, and then *baz*.
    312 
    313 Again, for the command line after a semicolon, kati should also change how
    314 backslashes and comments are handled.
    315 
    316     target: has\
    317     space ; echo no\
    318     space
    319 
    320 The above example says *target* depends on two targets, *has* and *space*, and
    321 to build *target*, *echo nospace* should be executed.
    322 
    323 ### Target specific variables
    324 
    325 You may not familiar with target specific variables. This feature allows you to
    326 define variable which can be referenced only from commands in a specified
    327 target. See the following code:
    328 
    329     VAR := X
    330     target1: VAR := Y
    331     target1:
    332     	echo $(VAR)
    333     target2:
    334     	echo $(VAR)
    335 
    336 In this example, *target1* shows *Y* and *target2* shows *X*. I think this
    337 feature is somewhat similar to namespaces in other programming languages. If a
    338 target specific variable is specified for a non-leaf target, the variable will
    339 be used even in build commands of prerequisite targets.
    340 
    341 In general, I like GNU make, but this is the only GNU make's feature I don't
    342 like. See the following Makefile:
    343 
    344     hello: CFLAGS := -g
    345     hello: hello.o
    346     	gcc $(CFLAGS) $< -o $@
    347     hello.o: hello.c
    348     	gcc $(CFLAGS) -c $< -o $@
    349 
    350 If you run make for the target *hello*, *CFLAGS* is applied for both commands:
    351 
    352     $ make hello
    353     gcc -g -c hello.c -o hello.o
    354     gcc -g hello.o -o hello
    355 
    356 However, *CFLAGS* for *hello* won't be used when you build only *hello.o*:
    357 
    358     $ make hello.o
    359     gcc  -c hello.c -o hello.o
    360 
    361 Things could be even worse when two targets with different target specific
    362 variables depend on a same target. The build result will be inconsistent. I
    363 think there is no valid usage of this feature for non-leaf targets.
    364 
    365 Let's go back to the parsing. Like for semicolons, we need to delay the
    366 evaluation of the right hand side of the assignment for recursive variables. Its
    367 implementation is very similar to the one for semicolons, but the combination of
    368 the assignment and the semicolon makes parsing a bit trickier. An example:
    369 
    370     target1: ;X=Y echo $(X)  # A rule with a command
    371     target2: X=;Y echo $(X)  # A target specific variable
    372 
    373 Evaluator for expressions
    374 -------------------------
    375 
    376 Evaluation of expressions is done in *expr.cc*, *func.cc*, and
    377 *command.cc*. The amount of code for this step is fairly large especially
    378 because of the number of GNU make functions. However, their implementations are
    379 fairly straightforward.
    380 
    381 One tricky function is $(wildcard ...). It seems GNU make is doing some kind of
    382 optimization only for this function and $(wildcard ...) in commands seem to be
    383 evaluated before the evaluation phase for commands. Both C++ kati and Go kati
    384 are different from GNU make's behavior in different ways, but it seems this
    385 incompatibility is OK for Android build.
    386 
    387 There is an important optimization done for Android. Android's build system has
    388 a lot of $(shell find ...) calls to create a list of all .java/.mk files under a
    389 directory, and they are slow. For this, kati has a builtin emulator of GNU
    390 find. The find emulator traverses the directory tree and creates an in-memory
    391 directory tree. Then the find emulator returns results of find commands using
    392 the cached tree. For my environment, the find command emulator makes kati ~1.6x
    393 faster for AOSP.
    394 
    395 The implementations of some IO-related functions in commands are tricky in the
    396 ninja generation mode. This will be described later.
    397 
    398 Dependency builder
    399 ------------------
    400 
    401 Now we get a list of rules and a variable table. *dep.cc* builds a dependency
    402 graph using the list of rules. I think this step is what GNU make is supposed to
    403 do for normal users.
    404 
    405 This step is fairly complex like other components but there's nothing
    406 strange. There are three types of rules in GNU make:
    407 
    408 * explicit rule
    409 * implicit rule
    410 * suffix rule
    411 
    412 The following code shows the three types:
    413 
    414     all: foo.o
    415     foo.o:
    416     	echo explicit
    417     %.o:
    418     	echo implicit
    419     .c.o:
    420     	echo suffix
    421 
    422 In the above example, all of these three rules match the target *foo.o*. GNU
    423 make prioritizes explicit rules first. When there's no explicit rule for a
    424 target, it uses an implicit rule with longer pattern string. Suffix rules are
    425 used only when there are no explicit/implicit rules.
    426 
    427 Android has more than one thousand implicit rules and there are ten thousands of
    428 targets. It's too slow to do matching for them with a naive O(NM)
    429 algorithm. Kati uses a trie to speed up this step.
    430 
    431 Multiple rules without commands should be merged into the rule with a
    432 command. For example:
    433 
    434     foo.o: foo.h
    435     %.o: %.c
    436     	$(CC) -c $< -o $@
    437 
    438 *foo.o* depends not only on *foo.c*, but also on *foo.h*.
    439 
    440 Executor
    441 --------
    442 
    443 C++ kati's executor is fairly simple. This is defined in *exec.cc*. This is
    444 useful only for testing because this lacks some important features for a build
    445 system (e.g., parallel build).
    446 
    447 Expressions in commands are evaluated at this stage. When they are evaluated,
    448 target specific variables and some special variables (e.g., $< and $@) should be
    449 considered. *command.cc* is handling them. This file is used by both the
    450 executor and the ninja generator.
    451 
    452 Evaluation at this stage is tricky when both *+=* and target specific variables
    453 are involved. Here is an example code:
    454 
    455     all: test1 test2 test3 test4
    456     
    457     A:=X
    458     B=X
    459     X:=foo
    460     
    461     test1: A+=$(X)
    462     test1:
    463     	@echo $(A)  # X bar
    464     
    465     test2: B+=$(X)
    466     test2:
    467     	@echo $(B)  # X bar
    468     
    469     test3: A:=
    470     test3: A+=$(X)
    471     test3:
    472     	@echo $(A)  # foo
    473     
    474     test4: B=
    475     test4: B+=$(X)
    476     test4:
    477     	@echo $(B)  # bar
    478     
    479     X:=bar
    480 
    481 *$(A)* in *test3* is a simple variable. Though *$(A)* in the global scope is
    482 simple, *$(A)* in *test1* is a recursive variable. This means types of global
    483 variables don't affect types of target specific variables. However, The result
    484 of *test1* ("X bar") shows the value of a target specific variable is
    485 concatenated to the value of a global variable.
    486 
    487 Ninja generator
    488 ---------------
    489 
    490 *ninja.cc* generates a ninja file using the results of other components. This
    491 step is actually fairly complicated because kati needs to map GNU make's
    492 features to ninja's.
    493 
    494 A build rule in GNU make may have multiple commands, while ninja's has always a
    495 single command. To mitigate this, the ninja generator translates multiple
    496 commands into something like *(cmd1) && (cmd2) && ...*. Kati should also escape
    497 some special characters for ninja and shell.
    498 
    499 The tougher thing is $(shell ...) in commands. Current kati's implementation
    500 translates it into shell's $(...). This works for many cases. But this approach
    501 won't work when the result of $(shell ...) is passed to another make
    502 function. For example
    503 
    504     all:
    505     	echo $(if $(shell echo),FAIL,PASS)
    506 
    507 should output PASS, because the result of $(shell echo) is an empty string. GNU
    508 make and kati's executor mode output PASS correctly. However, kati's ninja
    509 generator emits a ninja file which shows FAIL.
    510 
    511 I wrote a few experimental patches for this issue, but they didn't
    512 work well. The current kati's implementation has an Android specific workaround
    513 for this. See *HasNoIoInShellScript* in *func.cc* for detail.
    514 
    515 Ninja regeneration
    516 ------------------
    517 
    518 C++ kati has --regen flag. If this flag is specified, kati checks if anything
    519 in your environment was changed after the previous run. If kati thinks it doesn't
    520 need to regenerate the ninja file, it finishes quickly. For Android, running
    521 kati takes ~30 secs at the first run but the second run takes only ~1 sec.
    522 
    523 Kati thinks it needs to regenerate the ninja file when one of the followings is
    524 changed:
    525 
    526 * The command line flags passed to kati
    527 * A timestamp of a Makefile used to generate the previous ninja file
    528 * An environment variable used while evaluating Makefiles
    529 * A result of $(wildcard ...)
    530 * A result of $(shell ...)
    531 
    532 Quickly doing the last check is not trivial. It takes ~18 secs to run all
    533 $(shell ...) in Android's build system due to the slowness of $(shell find
    534 ...). So, for find commands executed by kati's find emulator, kati stores the
    535 timestamps of traversed directories with the find command itself. For each find
    536 commands, kati checks the timestamps of them. If they are not changed, kati
    537 skips re-running the find command.
    538 
    539 Kati doesn't run $(shell date ...) and $(shell echo ...) during this check. The
    540 former always changes so there's no sense to re-run them. Android uses the
    541 latter to create a file and the result of them are empty strings. We don't want
    542 to update these files to get empty strings.
    543 
    544 TODO
    545 ----
    546 
    547 A big TODO is sub-makes invoked by $(MAKE). I wrote some experimental patches
    548 but nothing is ready to be used as of writing.
    549