1 Kati internals 2 ============== 3 4 This is an informal document about internals of kati. This document is not meant 5 to be a comprehensive document of kati or GNU make. This explains some random 6 topics which other programmers may be interested in. 7 8 Motivation 9 ---------- 10 11 The motivation of kati was to speed up Android platform build. Especially, its 12 incremental build time was the main focus. Android platform's build system is a 13 very unique system. It provides a DSL, (ab)using Turing-completeness of GNU 14 make. The DSL allows developers to write build rules in a descriptive way, but 15 the downside is it's complicated and slow. 16 17 When we say a build system is slow, we consider "null build" and "full 18 build". Null build is a build which does nothing, because all output files are 19 already up-to-date. Full build is a build which builds everything, because there 20 were nothing which have been already built. Actual builds in daily development 21 are somewhere between null build and full build. Most benchmarks below were done 22 for null build. 23 24 For Android with my fairly beefy workstation, null build took ~100 secs with GNU 25 make. This means you needed to wait ~100 secs to see if there's a compile error 26 when you changed a single C file. To be fair, things were not that bad. There 27 are tools called mm/mmm. They allow developers to build an individual module. As 28 they ignore dependencies between modules, they are fast. However, you need to be 29 somewhat experienced to use them properly. You should know which modules will be 30 affected by your change. It would be nicer if you can just type "make" whenever 31 you change something. 32 33 This is why we started this project. We decided to create a GNU make clone from 34 scratch, but there were some other options. One option was to replace all 35 Android.mk by files with a better format. There is actually a longer-term 36 project for this. Kati was planned to be a short-term project. Another option 37 was to hack GNU make instead of developing a clone. We didn't take this option 38 because we thought the source code of GNU make is somewhat complicated due to 39 historical reason. It's written in old-style C, has a lot of ifdefs for some 40 unknown architectures, etc. 41 42 Currently, kati's main mode is --ninja mode. Instead of executing build commands 43 by itself, kati generates build.ninja file and 44 [ninja](https://github.com/martine/ninja) actually runs commands. There were 45 some back-and-forths before kati became the current form. Some experiments 46 succeeded and some others failed. We even changed the language for kati. At 47 first, we wrote kati in Go. We naively expected we can get enough performance 48 with Go. I guessed at least one of the following statements are true: 1. GNU 49 make is not very optimized for computation heavy Makefiles, 2. Go is fast for 50 our purpose, or 3. we can come up with some optimization tricks for Android's 51 build system. As for 3, some of such optimization succeeded but it's performance 52 gain didn't cancel the slowness of Go. 53 54 Go's performance would be somewhat interesting topic. I didn't study the 55 performance difference in detail, but it seemed both our use of Go and Go 56 language itself were making the Go version of kati slower. As for our fault, I 57 think Go version has more unnecessary string allocations than C++ version 58 has. As for Go itself, it seemed GC was the main show-stopper. For example, 59 Android's build system defines about one million make variables, and buffers for 60 them will be never freed. IIRC, this kind of allocation pattern isn't good for 61 non-generational GC. 62 63 Go version and test cases were written by ukai and me, and C++ rewrite was done 64 mostly by me. The rest of this document is mostly about the C++ version. 65 66 Overall architecture 67 -------------------- 68 69 Kati consists of the following components: 70 71 * Parser 72 * Evaluator 73 * Dependency builder 74 * Executor 75 * Ninja generator 76 77 A Makefile has some statements which consist of zero or more expressions. There 78 are two parsers and two evaluators - one for statements and the other for 79 expressions. 80 81 Most of users of GNU make may not care about the evaluator much. However, GNU 82 make's evaluator is very powerful and is Turing-complete. For Android's null 83 build, most time is spent in this phase. Other tasks, such as building 84 dependency graphs and calling stat function for build targets, are not the 85 bottleneck. This would be a very Android specific characteristics. Android's 86 build system uses a lot of GNU make black magics. 87 88 The evaluator outputs a list of build rules and a variable table. The dependency 89 builder creates a dependency graph from the list of build rules. Note this step 90 doesn't use the variable table. 91 92 Then either executor or ninja generator will be used. Either way, kati runs its 93 evaluator again for command lines. The variable table is used again for this 94 step. 95 96 We'll look at each components closely. GNU make is a somewhat different language 97 from modern languages. Let's see. 98 99 Parser for statements 100 --------------------- 101 102 I'm not 100% sure, but I think GNU make parses and evaluates Makefiles 103 simultaneously, but kati has two phases for parsing and evaluation. The reason 104 of this design is for performance. For Android build, kati (or GNU make) needs 105 to read ~3k files ~50k times. The file which is read most often is read ~5k 106 times. It's waste of time to parse such files again and again. Kati can re-use 107 parsed results when it needs to evaluate a Makefile second time. If we stop 108 caching the parsed results, kati will be two times slower for Android's 109 build. Caching parsed statements is done in *file_cache.cc*. 110 111 The statement parser is defined in *parser.cc*. In kati, there are four kinds of 112 statements: 113 114 * Rules 115 * Assignments 116 * Commands 117 * Make directives 118 119 Data structures for them are defined in *stmt.h*. Here are examples of these 120 statements: 121 122 VAR := yay! # An assignment 123 all: # A rule 124 echo $(VAR) # A command 125 include xxx.mk # A make directive (include) 126 127 In addition to include directive, there are ifeq/ifneq/ifdef/ifndef directives 128 and export/unexport directives. Also, kati internally uses "parse error 129 statement". As GNU make doesn't show parse errors in branches which are not 130 taken, we need to delay parse errors to evaluation time. 131 132 ### Context dependent parser 133 134 A tricky point of parsing make statements is that the parsing depends on the 135 context of the evaluation. See the following Makefile chunk for example: 136 137 $(VAR) 138 X=hoge echo $${X} 139 140 You cannot tell whether the second line is a command or an assignment until 141 *$(VAR)* is evaluated. If *$(VAR)* is a rule statement, the second line is a 142 command and otherwise it's an assignment. If the previous line is 143 144 VAR := target: 145 146 the second line will turn out to be a command. 147 148 For some reason, GNU make expands expressions before it decides the type of 149 a statement only for rules. Storing assignments or directives in a variable 150 won't work as assignments or directives. For example 151 152 ASSIGN := A=B 153 $(ASSIGN): 154 155 doesn't assign "*B:*" to *A*, but defines a build rule whose target is *A=B*. 156 157 Anyway, as a line starts with a tab character can be either a command statement 158 or other statements depending on the evaluation result of the previous line, 159 sometimes kati's parser cannot tell the statement type of a line. In this case, 160 kati's parser speculatively creates a command statement object, keeping the 161 original line. If it turns out the line is actually not a command statement, 162 the evaluator re-runs the parser. 163 164 ### Line concatenations and comments 165 166 In most programming languages, line concatenations by a backslash character and 167 comments are handled at a very early stage of a language 168 implementation. However, GNU make changes the behavior for them depending on 169 parse/eval context. For example, the following Makefile outputs "has space" and 170 "hasnospace": 171 172 VAR := has\ 173 space 174 all: 175 echo $(VAR) 176 echo has\ 177 nospace 178 179 GNU make usually inserts a whitespace between lines, but for command lines it 180 doesn't. As we've seen in the previous subsection, sometimes kati cannot tell 181 a line is a command statement or not. This means we should handle them after 182 evaluating statements. Similar discussion applies for comments. GNU make usually 183 trims characters after '#', but it does nothing for '#' in command lines. 184 185 We have a bunch of comment/backslash related testcases in the testcase directory 186 of kati's repository. 187 188 Parser for expressions 189 ---------------------- 190 191 A statement may have one or more expressions. The number of expressions in a 192 statement depends on the statement's type. For example, 193 194 A := $(X) 195 196 This is an assignment statement, which has two expressions - *A* and 197 *$(X)*. Types of expressions and their parser are defined in *expr.cc*. Like 198 other programming languages, an expression is a tree of expressions. The type of 199 a leaf expression is either literal, variable reference, 200 [substitution references](http://www.gnu.org/software/make/manual/make.html#Substitution-Refs), 201 or make functions. 202 203 As written, backslashes and comments change their behavior depending on the 204 context. Kati handles them in this phase. *ParseExprOpt* is the enum for the 205 contexts. 206 207 As a nature of old systems, GNU make is very permissive. For some reason, it 208 allows some kind of unmatched pairs of parentheses. For example, GNU make 209 doesn't think *$($(foo)* is an error - this is a reference to variable 210 *$(foo*. If you have some experiences with parsers, you may wonder how one can 211 implement a parser which allows such expressions. It seems GNU make 212 intentionally allows this: 213 214 http://git.savannah.gnu.org/cgit/make.git/tree/expand.c#n285 215 216 No one won't use this feature intentionally. However, as GNU make allows this, 217 some Makefiles have unmatched parentheses, so kati shouldn't raise an error for 218 them, unfortunately. 219 220 GNU make has a bunch of functions. Most users would use only simple ones such as 221 *$(wildcard ...)* and *$(subst ...)*. There are also more complex functions such 222 as *$(if ...)* and *$(call ...)*, which make GNU make Turing-complete. Make 223 functions are defined in *func.cc*. Though *func.cc* is not short, the 224 implementation is fairly simple. There is only one weirdness I remember around 225 functions. GNU make slightly changes its parsing for *$(if ...)*, *$(and ...)*, 226 and *$(or ...)*. See *trim_space* and *trim_right_space_1st* in *func.h* and how 227 they are used in *expr.cc*. 228 229 Evaluator for statements 230 ------------------------ 231 232 Evaluator for statements are defined in *eval.cc*. As written, there are four 233 kinds of statements: 234 235 * Rules 236 * Assignments 237 * Commands 238 * Make directives 239 240 There is nothing tricky around commands and make directives. A rule statement 241 have some forms and should be parsed after evaluating expression by the third 242 parser. This will be discussed in the next section. 243 244 Assignments in GNU make is tricky a bit. There are two kinds of variables in GNU 245 make - simple variables and recursive variables. See the following code snippet: 246 247 A = $(info world!) # recursive 248 B := $(info Hello,) # simple 249 $(A) 250 $(B) 251 252 This code outputs "Hello," and "world!", in this order. The evaluation of 253 a recursive variable is delayed until the variable is referenced. So the first 254 line, which is an assignment of a recursive variable, outputs nothing. The 255 content of the variable *$(A)* will be *$(info world!)* after the first 256 line. The assignment in the second line uses *:=* which means this is a simple 257 variable assignment. For simple variables, the right hand side is evaluated 258 immediately. So "Hello," will be output and the value of *$(B)* will be an empty 259 string ($(info ...) returns an empty string). Then, "world!" will be shown when 260 the third line is evaluated as *$(A)* is evaluated, and lastly the forth line 261 does nothing, as *$(B)* is an empty string. 262 263 There are two more kinds of assignments (i.e., *+=* and *?=*). These assignments 264 keep the type of the original variable. Evaluation of them will be done 265 immediately only when the left hand side of the assignment is already defined 266 and is a simple variable. 267 268 Parser for rules 269 ---------------- 270 271 After evaluating a rule statement, kati needs to parse the evaluated result. A 272 rule statement can actually be the following four things: 273 274 * A rule 275 * A [target specific variable](http://www.gnu.org/software/make/manual/make.html#Target_002dspecific) 276 * An empty line 277 * An error (there're non-whitespace characters without a colon) 278 279 Parsing them is mostly done in *rule.cc*. 280 281 ### Rules 282 283 A rule is something like *all: hello.exe*. You should be familiar with it. There 284 are several kinds of rules such as pattern rules, double colon rules, and order 285 only dependencies, but they don't complicate the rule parser. 286 287 A feature which complicates the parser is semicolon. You can write the first 288 build command on the same line as the rule. For example, 289 290 target: 291 echo hi! 292 293 and 294 295 target: ; echo hi! 296 297 have the same meaning. This is tricky because kati shouldn't evaluate expressions 298 in a command until the command is actually invoked. As a semicolon can appear as 299 the result of expression evaluation, there are some corner cases. A tricky 300 example: 301 302 all: $(info foo) ; $(info bar) 303 $(info baz) 304 305 should output *foo*, *baz*, and then *bar*, in this order, but 306 307 VAR := all: $(info foo) ; $(info bar) 308 $(VAR) 309 $(info baz) 310 311 outputs *foo*, *bar*, and then *baz*. 312 313 Again, for the command line after a semicolon, kati should also change how 314 backslashes and comments are handled. 315 316 target: has\ 317 space ; echo no\ 318 space 319 320 The above example says *target* depends on two targets, *has* and *space*, and 321 to build *target*, *echo nospace* should be executed. 322 323 ### Target specific variables 324 325 You may not familiar with target specific variables. This feature allows you to 326 define variable which can be referenced only from commands in a specified 327 target. See the following code: 328 329 VAR := X 330 target1: VAR := Y 331 target1: 332 echo $(VAR) 333 target2: 334 echo $(VAR) 335 336 In this example, *target1* shows *Y* and *target2* shows *X*. I think this 337 feature is somewhat similar to namespaces in other programming languages. If a 338 target specific variable is specified for a non-leaf target, the variable will 339 be used even in build commands of prerequisite targets. 340 341 In general, I like GNU make, but this is the only GNU make's feature I don't 342 like. See the following Makefile: 343 344 hello: CFLAGS := -g 345 hello: hello.o 346 gcc $(CFLAGS) $< -o $@ 347 hello.o: hello.c 348 gcc $(CFLAGS) -c $< -o $@ 349 350 If you run make for the target *hello*, *CFLAGS* is applied for both commands: 351 352 $ make hello 353 gcc -g -c hello.c -o hello.o 354 gcc -g hello.o -o hello 355 356 However, *CFLAGS* for *hello* won't be used when you build only *hello.o*: 357 358 $ make hello.o 359 gcc -c hello.c -o hello.o 360 361 Things could be even worse when two targets with different target specific 362 variables depend on a same target. The build result will be inconsistent. I 363 think there is no valid usage of this feature for non-leaf targets. 364 365 Let's go back to the parsing. Like for semicolons, we need to delay the 366 evaluation of the right hand side of the assignment for recursive variables. Its 367 implementation is very similar to the one for semicolons, but the combination of 368 the assignment and the semicolon makes parsing a bit trickier. An example: 369 370 target1: ;X=Y echo $(X) # A rule with a command 371 target2: X=;Y echo $(X) # A target specific variable 372 373 Evaluator for expressions 374 ------------------------- 375 376 Evaluation of expressions is done in *expr.cc*, *func.cc*, and 377 *command.cc*. The amount of code for this step is fairly large especially 378 because of the number of GNU make functions. However, their implementations are 379 fairly straightforward. 380 381 One tricky function is $(wildcard ...). It seems GNU make is doing some kind of 382 optimization only for this function and $(wildcard ...) in commands seem to be 383 evaluated before the evaluation phase for commands. Both C++ kati and Go kati 384 are different from GNU make's behavior in different ways, but it seems this 385 incompatibility is OK for Android build. 386 387 There is an important optimization done for Android. Android's build system has 388 a lot of $(shell find ...) calls to create a list of all .java/.mk files under a 389 directory, and they are slow. For this, kati has a builtin emulator of GNU 390 find. The find emulator traverses the directory tree and creates an in-memory 391 directory tree. Then the find emulator returns results of find commands using 392 the cached tree. For my environment, the find command emulator makes kati ~1.6x 393 faster for AOSP. 394 395 The implementations of some IO-related functions in commands are tricky in the 396 ninja generation mode. This will be described later. 397 398 Dependency builder 399 ------------------ 400 401 Now we get a list of rules and a variable table. *dep.cc* builds a dependency 402 graph using the list of rules. I think this step is what GNU make is supposed to 403 do for normal users. 404 405 This step is fairly complex like other components but there's nothing 406 strange. There are three types of rules in GNU make: 407 408 * explicit rule 409 * implicit rule 410 * suffix rule 411 412 The following code shows the three types: 413 414 all: foo.o 415 foo.o: 416 echo explicit 417 %.o: 418 echo implicit 419 .c.o: 420 echo suffix 421 422 In the above example, all of these three rules match the target *foo.o*. GNU 423 make prioritizes explicit rules first. When there's no explicit rule for a 424 target, it uses an implicit rule with longer pattern string. Suffix rules are 425 used only when there are no explicit/implicit rules. 426 427 Android has more than one thousand implicit rules and there are ten thousands of 428 targets. It's too slow to do matching for them with a naive O(NM) 429 algorithm. Kati uses a trie to speed up this step. 430 431 Multiple rules without commands should be merged into the rule with a 432 command. For example: 433 434 foo.o: foo.h 435 %.o: %.c 436 $(CC) -c $< -o $@ 437 438 *foo.o* depends not only on *foo.c*, but also on *foo.h*. 439 440 Executor 441 -------- 442 443 C++ kati's executor is fairly simple. This is defined in *exec.cc*. This is 444 useful only for testing because this lacks some important features for a build 445 system (e.g., parallel build). 446 447 Expressions in commands are evaluated at this stage. When they are evaluated, 448 target specific variables and some special variables (e.g., $< and $@) should be 449 considered. *command.cc* is handling them. This file is used by both the 450 executor and the ninja generator. 451 452 Evaluation at this stage is tricky when both *+=* and target specific variables 453 are involved. Here is an example code: 454 455 all: test1 test2 test3 test4 456 457 A:=X 458 B=X 459 X:=foo 460 461 test1: A+=$(X) 462 test1: 463 @echo $(A) # X bar 464 465 test2: B+=$(X) 466 test2: 467 @echo $(B) # X bar 468 469 test3: A:= 470 test3: A+=$(X) 471 test3: 472 @echo $(A) # foo 473 474 test4: B= 475 test4: B+=$(X) 476 test4: 477 @echo $(B) # bar 478 479 X:=bar 480 481 *$(A)* in *test3* is a simple variable. Though *$(A)* in the global scope is 482 simple, *$(A)* in *test1* is a recursive variable. This means types of global 483 variables don't affect types of target specific variables. However, The result 484 of *test1* ("X bar") shows the value of a target specific variable is 485 concatenated to the value of a global variable. 486 487 Ninja generator 488 --------------- 489 490 *ninja.cc* generates a ninja file using the results of other components. This 491 step is actually fairly complicated because kati needs to map GNU make's 492 features to ninja's. 493 494 A build rule in GNU make may have multiple commands, while ninja's has always a 495 single command. To mitigate this, the ninja generator translates multiple 496 commands into something like *(cmd1) && (cmd2) && ...*. Kati should also escape 497 some special characters for ninja and shell. 498 499 The tougher thing is $(shell ...) in commands. Current kati's implementation 500 translates it into shell's $(...). This works for many cases. But this approach 501 won't work when the result of $(shell ...) is passed to another make 502 function. For example 503 504 all: 505 echo $(if $(shell echo),FAIL,PASS) 506 507 should output PASS, because the result of $(shell echo) is an empty string. GNU 508 make and kati's executor mode output PASS correctly. However, kati's ninja 509 generator emits a ninja file which shows FAIL. 510 511 I wrote a few experimental patches for this issue, but they didn't 512 work well. The current kati's implementation has an Android specific workaround 513 for this. See *HasNoIoInShellScript* in *func.cc* for detail. 514 515 Ninja regeneration 516 ------------------ 517 518 C++ kati has --regen flag. If this flag is specified, kati checks if anything 519 in your environment was changed after the previous run. If kati thinks it doesn't 520 need to regenerate the ninja file, it finishes quickly. For Android, running 521 kati takes ~30 secs at the first run but the second run takes only ~1 sec. 522 523 Kati thinks it needs to regenerate the ninja file when one of the followings is 524 changed: 525 526 * The command line flags passed to kati 527 * A timestamp of a Makefile used to generate the previous ninja file 528 * An environment variable used while evaluating Makefiles 529 * A result of $(wildcard ...) 530 * A result of $(shell ...) 531 532 Quickly doing the last check is not trivial. It takes ~18 secs to run all 533 $(shell ...) in Android's build system due to the slowness of $(shell find 534 ...). So, for find commands executed by kati's find emulator, kati stores the 535 timestamps of traversed directories with the find command itself. For each find 536 commands, kati checks the timestamps of them. If they are not changed, kati 537 skips re-running the find command. 538 539 Kati doesn't run $(shell date ...) and $(shell echo ...) during this check. The 540 former always changes so there's no sense to re-run them. Android uses the 541 latter to create a file and the result of them are empty strings. We don't want 542 to update these files to get empty strings. 543 544 TODO 545 ---- 546 547 A big TODO is sub-makes invoked by $(MAKE). I wrote some experimental patches 548 but nothing is ready to be used as of writing. 549