<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>

<chapter id="cl-format" xreflabel="Callgrind Format Specification">
<title>Callgrind Format Specification</title>

<para>This chapter describes the Callgrind Profile Format, Version 1.</para>

<para>A synonymous name is "Calltree Profile Format". Both names refer to
the same format, since Callgrind was previously named Calltree.</para>

<para>The format description is meant to help users understand the
file contents; more importantly, it is given so that authors of measurement or
visualization tools can write and read this format.</para>

<sect1 id="cl-format.overview" xreflabel="Overview">
<title>Overview</title>

<para>The profile data format is ASCII based.
It is written by Callgrind, and it is upwards compatible
with the format used by Cachegrind (i.e. Cachegrind uses a subset). It can
be read by callgrind_annotate and KCachegrind.</para>

<para>This chapter gives an overview of format features and examples.
For detailed syntax, look at the format reference.</para>

<sect2 id="cl-format.overview.basics" xreflabel="Basic Structure">
<title>Basic Structure</title>

<para>Each file has a header part consisting of an arbitrary number of lines
of the format "key: value". After the header, lines specifying profile costs
follow. Comments are allowed everywhere, on their own lines starting with '#'.
The header lines with keys "positions" and "events" define
the meaning of cost lines in the second part of the file: the value of
"positions" is a list of subpositions, and the value of "events" is a list
of event type names.
Cost lines consist of subpositions followed by 64-bit
counters for the events, in the order specified by the "positions" and "events"
header lines.</para>

<para>The "events" header line is always required, in contrast to the optional
"positions" line, which defaults to "line", i.e. a line number of some
source file. In addition, the second part of the file contains position
specifications of the form "spec=name". "spec" can be e.g. "fn" for a
function name or "fl" for a file name. Cost lines always relate to
the function/file specifications given directly before them.</para>

</sect2>

<sect2 id="cl-format.overview.example1" xreflabel="Simple Example">
<title>Simple Example</title>

<para>The event names in the following example are quite arbitrary, and are not
related to event names used by Callgrind. In particular, cycle counts matching
real processors will probably never be generated by any Valgrind tool, as these
tools are bound to simulations of simple machine models for acceptable slowdown.
However, any profiling tool could use the format described in this chapter.</para>

<para>
<screen>events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12</screen></para>

<para>The above example gives profile information for event types "Cycles",
"Instructions", and "Flops". Thus, cost lines give the number of CPU cycles
passed by, the number of executed instructions, and the number of floating point
operations executed while running code corresponding to some source
position. As there is no line specifying the value of "positions", it defaults
to "line", which means that the first number of a cost line is always a line
number.</para>

<para>Thus, the first cost line specifies that in line 15 of source file
<filename>file.f</filename> there is code belonging to function
<function>main</function>.
While running, 90 CPU cycles passed by, and 2 of
the 14 instructions executed were floating point operations. Similarly, the
next line specifies that there were 12 instructions executed in the context
of function <function>main</function> which can be related to line 16 in
file <filename>file.f</filename>, taking 20 CPU cycles. If a cost line
specifies fewer event counts than given in the "events" line, the rest are
assumed to be zero. That is, no floating point instruction was executed
for line 16.</para>

<para>Note that regular cost lines always give the self (also called exclusive)
cost of code at a given position. If you specify multiple cost lines for the
same position, these will be summed up. On the other hand, in the example above
there is no specification of how many times function
<function>main</function> actually was
called: profile data only contains sums.</para>

</sect2>


<sect2 id="cl-format.overview.associations" xreflabel="Associations">
<title>Associations</title>

<para>The most important extension to the original format of Cachegrind is the
ability to specify call relationships among functions. More generally, you
specify associations among positions. For this, the second part of the
file can also contain association specifications. These look similar to
position specifications, but consist of two lines. For calls, the format
looks like
<screen>
 calls=(Call Count) (Target position)
 (Source position) (Inclusive cost of call)
</screen></para>

<para>The destination only specifies subpositions such as a line number.
Therefore,
to be able to specify a call to another function in another source file, you
have to precede the above lines with a "cfn=" specification for the name of the
called function, and optionally a "cfi=" specification if the function is in
another source file ("cfl=" is an alternative specification for "cfi=" for
historical reasons, and both should be supported by format readers).
The second line looks like a regular cost line, with the difference
that the inclusive cost spent inside the function call has to be specified.</para>

<para>Other associations are, for example, (conditional) jumps. See the
reference below for details.</para>

</sect2>


<sect2 id="cl-format.overview.example2" xreflabel="Extended Example">
<title>Extended Example</title>

<para>The following example shows 3 functions, <function>main</function>,
<function>func1</function>, and <function>func2</function>. Function
<function>main</function> calls <function>func1</function> once and
<function>func2</function> 3 times. <function>func1</function> calls
<function>func2</function> 2 times.
<screen>events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfi=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfi=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700</screen></para>

<para>One can see that in <function>main</function> only code from line 16
is executed, which is also where the other functions are called.
The inclusive cost of
<function>main</function> is 820, which is the sum of the self cost 20 and the
costs spent in the calls: 400 for the single call to <function>func1</function>
and 400 as the sum for the three calls to <function>func2</function>.</para>

<para>Function <function>func1</function> is located in
<filename>file1.c</filename>, the same file as <function>main</function>.
Therefore, a "cfi=" specification for the call to <function>func1</function>
is not needed. The function <function>func1</function> only consists of code
at line 51 of <filename>file1.c</filename>, where <function>func2</function>
is called.</para>

</sect2>


<sect2 id="cl-format.overview.compression1" xreflabel="Name Compression">
<title>Name Compression</title>

<para>With the introduction of association specifications like calls, it
becomes necessary to specify the same function or file name multiple times. As
absolute filenames or symbol names in C++ can be quite long, it is advantageous
to be able to specify integer IDs for position specifications.
Here, the term "position" corresponds to a file name (source or object file)
or function name.</para>

<para>To support name compression, a position specification can be not only of
the format "spec=name", but also "spec=(ID) name" to specify a mapping of an
integer ID to a name, and "spec=(ID)" to reference a previously defined ID
mapping. There is a separate ID mapping for each position specification,
i.e.
you can use ID 1 for both a file name and a symbol name.</para>

<para>With name compression, the example from the Extended Example section
looks like this:
<screen>events: Instructions

fl=(1) file1.c
fn=(1) main
16 20
cfn=(2) func1
calls=1 50
16 400
cfi=(2) file2.c
cfn=(3) func2
calls=3 20
16 400

fn=(2)
51 100
cfi=(2)
cfn=(3)
calls=2 20
51 300

fl=(2)
fn=(3)
20 700</screen></para>

<para>As position specifications carry no information themselves, but only change
the meaning of subsequent cost lines or associations, they can appear
anywhere in the file without any negative consequence. In particular, you can
define name compression mappings directly after the header, and before any cost
lines. Thus, the above example can also be written as
<screen>events: Instructions

# define file ID mapping
fl=(1) file1.c
fl=(2) file2.c
# define function ID mapping
fn=(1) main
fn=(2) func1
fn=(3) func2

fl=(1)
fn=(1)
16 20
...</screen></para>

</sect2>


<sect2 id="cl-format.overview.compression2" xreflabel="Subposition Compression">
<title>Subposition Compression</title>

<para>If a Callgrind data file should hold costs for each assembler instruction
of a program, you specify subposition "instr" in the "positions:" header line,
and each cost line has to include the address of some instruction. Addresses
are allowed to have a size of 64 bits to support 64-bit architectures. Thus,
repeating similar, long addresses for almost every line in the data file can
enlarge the file size quite significantly, which
motivates subposition compression: instead of every cost line starting with
a 16 character long address, one is allowed to specify relative addresses.
This relative specification is not only allowed for instruction addresses, but
also for line numbers; both addresses and line numbers are called "subpositions".</para>

<para>A relative subposition is always based on the corresponding subposition
of the last cost line, and starts with a "+" to specify a positive difference,
a "-" to specify a negative difference, or consists of "*" to specify the same
subposition. Because absolute subpositions are always positive (i.e. never
prefixed by "-"), any relative specification is unambiguous; additionally,
absolute and relative subposition specifications can be mixed freely.
Consider the following example (subpositions can always be specified
as hexadecimal numbers, beginning with "0x"):
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
0x80001237 90 5
0x80001238 91 6</screen></para>

<para>With subposition compression, this looks like
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
+3 * 5
+1 +1 6</screen></para>

<para>Remark: For assembler annotation to work, instruction addresses have to
be corrected to correspond to addresses found in the original binary. That is,
for relocatable shared objects, often a load offset has to be subtracted.</para>

</sect2>


<sect2 id="cl-format.overview.misc" xreflabel="Miscellaneous">
<title>Miscellaneous</title>

<sect3 id="cl-format.overview.misc.summary" xreflabel="Cost Summary Information">
<title>Cost Summary Information</title>

<para>For the visualization to be able to show cost percentages, the sum of the
cost of the full run has to be known. Usually, it is assumed that this is the
sum of all cost lines in a file. But sometimes, this is not correct. Thus, you
can specify a "summary:" line in the header giving the full cost for the
profile run.
An import filter may use this to show a progress bar
while loading a large data file.</para>

</sect3>

<sect3 id="cl-format.overview.misc.events" xreflabel="Long Names for Event Types and inherited Types">
<title>Long Names for Event Types and inherited Types</title>

<para>Event types for cost lines are specified in the "events:" line with an
abbreviated name. For visualization, it makes sense to be able to specify some
longer, more descriptive name. For an event type "Ir" which means "Instruction
Fetches", this can be specified with the header line
<screen>event: Ir : Instruction Fetches
events: Ir Dr</screen></para>

<para>In this example, "Dr" itself has no long name associated. The order of
"event:" lines and the "events:" line is of no importance. Additionally,
inherited event types can be introduced for which no raw data is available, but
which are calculated from given types. Continuing the last example, you could add
<screen>event: Sum = Ir + Dr</screen>
to specify an additional event type "Sum", which is calculated by adding costs
for "Ir" and "Dr".</para>

</sect3>

</sect2>

</sect1>

<sect1 id="cl-format.reference" xreflabel="Reference">
<title>Reference</title>

<sect2 id="cl-format.reference.grammar" xreflabel="Grammar">
<title>Grammar</title>

<para>
<screen>ProfileDataFile := FormatVersion? Creator? PartData*</screen>
<screen>FormatVersion := "version: 1\n"</screen>
<screen>Creator := "creator:" NoNewLineChar* "\n"</screen>
<screen>PartData := (HeaderLine "\n")+ (BodyLine "\n")+</screen>
<screen>HeaderLine := (empty line)
  | ('#' NoNewLineChar*)
  | PartDetail
  | Description
  | EventSpecification
  | CostLineDef</screen>
<screen>PartDetail := TargetCommand | TargetID</screen>
<screen>TargetCommand := "cmd:" Space* NoNewLineChar*</screen>
<screen>TargetID := ("pid"|"thread"|"part") ":" Space* Number</screen>
<screen>Description := "desc:" Space* Name Space* ":" NoNewLineChar*</screen>
<screen>EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?</screen>
<screen>InheritedDef := "=" InheritedExpr</screen>
<screen>InheritedExpr := Name
  | Number Space* ("*" Space*)? Name
  | InheritedExpr Space* "+" Space* InheritedExpr</screen>
<screen>LongNameDef := ":" NoNewLineChar*</screen>
<screen>CostLineDef := "events:" Space* Name (Space+ Name)*
  | "positions:" "instr"? (Space+ "line")?</screen>
<screen>BodyLine := (empty line)
  | ('#' NoNewLineChar*)
  | CostLine
  | PositionSpec
  | CallSpec
  | UncondJumpSpec
  | CondJumpSpec</screen>
<screen>CostLine := SubPositionList Costs?</screen>
<screen>SubPositionList := (SubPosition+ Space+)+</screen>
<screen>SubPosition := Number | "+" Number | "-" Number | "*"</screen>
<screen>Costs := (Number Space+)+</screen>
<screen>PositionSpec := Position "=" Space* PositionName</screen>
<screen>Position := CostPosition | CalledPosition</screen>
<screen>CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"</screen>
<screen>CalledPosition := "cob" | "cfi" | "cfl" | "cfn"</screen>
<screen>PositionName := ( "(" Number ")" )? (Space* NoNewLineChar* )?</screen>
<screen>CallSpec := CallLine "\n" CostLine</screen>
<screen>CallLine := "calls=" Space* Number Space+ SubPositionList</screen>
<screen>UncondJumpSpec := "jump=" Space* Number Space+ SubPositionList</screen>
<screen>CondJumpSpec := "jcnd=" Space* Number Space+ Number Space+ SubPositionList</screen>
<screen>Space := " " | "\t"</screen>
<screen>Number := HexNumber | (Digit)+</screen>
<screen>Digit := "0" | ... | "9"</screen>
<screen>HexNumber := "0x" (Digit | HexChar)+</screen>
<screen>HexChar := "a" | ... | "f" | "A" | ... | "F"</screen>
<screen>Name := Alpha (Digit | Alpha)*</screen>
<screen>Alpha := "a" | ... | "z" | "A" | ... | "Z"</screen>
<screen>NoNewLineChar := all characters without "\n"</screen>
</para>

<para>A profile data file ("ProfileDataFile") starts with basic information
such as the version and creator information, and then has a list of parts, where
each part has its own header and body. Parts are typically different threads
and/or time spans/phases within a profiled application run.</para>

<para>Note that callgrind_annotate currently only supports profile data files with
one part. Callgrind may produce multiple parts for one profile run, but defaults
to one output file for each part.</para>

</sect2>

<sect2 id="cl-format.reference.header" xreflabel="Description of Header Lines">
<title>Description of Header Lines</title>

<para>Basic information in the first lines of a profile data file:</para>

<itemizedlist>
  <listitem>
    <para><computeroutput>version: number</computeroutput> [Callgrind]</para>
    <para>This is used to distinguish future profile data formats. A
    major version of 0 or 1 is supposed to be upwards compatible with
    Cachegrind's format. It is optional; if it does not appear, version 1
    is assumed.
    Otherwise, this has to be the first header line.</para>
  </listitem>

  <listitem>
    <para><computeroutput>creator: string</computeroutput> [Callgrind]</para>
    <para>This is an arbitrary string to denote the creator of this file.
    Optional.</para>
  </listitem>

</itemizedlist>

<para>The header for each part has an arbitrary number of lines of the format
"key: value". Possible <emphasis>key</emphasis> values for the header are:</para>

<itemizedlist>

  <listitem>
    <para><computeroutput>pid: process id</computeroutput> [Callgrind]</para>
    <para>Optional. This specifies the process ID of the supervised application
    for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cmd: program name + args</computeroutput> [Cachegrind]</para>
    <para>Optional. This specifies the full command line of the supervised
    application for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>part: number</computeroutput> [Callgrind]</para>
    <para>Optional. This specifies a sequentially incremented number for each dump
    generated, starting at 1.</para>
  </listitem>

  <listitem>
    <para><computeroutput>desc: type: value</computeroutput> [Cachegrind]</para>
    <para>This specifies various information for this dump. For some
    types, the semantics are defined, but any description type is allowed.
    Unknown types should be ignored.</para>
    <para>There are the types "I1 cache", "D1 cache", "LL cache", which
    specify parameters used for the cache simulator. These are the only
    types originally used by Cachegrind. Additionally, Callgrind uses
    the following types: "Timerange" gives a rough range of the basic
    block counter, for which the cost of this dump was collected.
    Type "Trigger" states the reason why this trace was generated.
    E.g.
    program termination or a forced interactive dump.</para>
  </listitem>

  <listitem>
    <para><computeroutput>positions: [instr] [line]</computeroutput> [Callgrind]</para>
    <para>For cost lines, this defines the semantics of the first numbers.
    Any combination of "instr", "bb" and "line" is allowed, but has to be
    in this order, which corresponds to the position numbers at the start of
    the cost lines later in the file.</para>
    <para>If "instr" is specified, the position is the address of an
    instruction whose execution raised the events given later on the
    line. This address is relative to the offset of the binary/shared
    library file, so that relocation info does not have to be specified.
    For "line", the position is the line number of a source file, which is
    responsible for the events raised. Note that the mapping of "instr"
    and "line" positions is given by the debugging line information
    produced by the compiler.</para>
    <para>This header line is optional, defaulting to "positions:
    line" if not specified.</para>
  </listitem>

  <listitem>
    <para><computeroutput>events: event type abbreviations</computeroutput> [Cachegrind]</para>
    <para>A list of short names of the event types logged in cost
    lines in this part of the profile data file. Arbitrary short
    names are allowed. The order given specifies the required order
    in cost lines. Thus, the first event type is the second or third
    number in a cost line, depending on the value of "positions".
    Required to appear exactly once in each header part.</para>
  </listitem>

  <listitem>
    <para><computeroutput>summary: costs</computeroutput> [Callgrind]</para>
    <para>Optional. This header line specifies a summary cost, which should be
    equal to or larger than the total over all self costs.
    It may be larger as
    the cost lines may not represent all cost of the program run.</para>
  </listitem>

  <listitem>
    <para><computeroutput>totals: costs</computeroutput> [Cachegrind]</para>
    <para>Optional. Should appear at the end of the file (although
    looking like a header line). Must give the total of all cost lines,
    to allow for a consistency check.</para>
  </listitem>

</itemizedlist>

</sect2>

<sect2 id="cl-format.reference.body" xreflabel="Description of Body Lines">
<title>Description of Body Lines</title>

<para>The regular body line is a cost line consisting of one or two
position numbers (depending on the "positions:" header line, see above)
and an array of cost numbers. A position number is either a
line number in a source file or an instruction address within binary
code, with source/binary file names specified as position names (see
below). The cost numbers get mapped to event types in the same order
as specified in the "events:" header line. If fewer numbers than event
types are given, the costs default to zero for the remaining event
types.</para>

<para>Further, there exist lines
<computeroutput>spec=position name</computeroutput>. A position name
is an arbitrary string. If it starts with "(" and a
digit, it's a string in compressed format. Otherwise it's the real
position string. This allows for file and symbol names as position
strings, as these never start with "(" + <emphasis>digit</emphasis>.
The compressed format is either "(" <emphasis>number</emphasis> ")"
<emphasis>space</emphasis> <emphasis>position</emphasis> or only
"(" <emphasis>number</emphasis> ")".
The first relates
<emphasis>position</emphasis> to <emphasis>number</emphasis> in the
context of the given format specification from this line to the end of
the file; it makes the (<emphasis>number</emphasis>) an alias for
<emphasis>position</emphasis>. Compressed format is always
optional.</para>

<para>Position specifications allowed:</para>
<itemizedlist>

  <listitem>
    <para><computeroutput>ob=</computeroutput> [Callgrind]</para>
    <para>The ELF object in which the cost of the following cost lines
    occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fl=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fi=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fe=</computeroutput> [Cachegrind]</para>
    <para>The source file including the code which is responsible for
    the cost of the following cost lines. "fi="/"fe=" is used when the source
    file changes inside of a function, i.e.
    for inlined code.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fn=</computeroutput> [Cachegrind]</para>
    <para>The name of the function in which the cost of the following cost
    lines occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cob=</computeroutput> [Callgrind]</para>
    <para>The ELF object of the target of the next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfi=</computeroutput> [Callgrind]</para>
    <para>The source file including the code of the target of the
    next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfl=</computeroutput> [Callgrind]</para>
    <para>Alternative spelling for the <computeroutput>cfi=</computeroutput>
    specification (for historical reasons).</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfn=</computeroutput> [Callgrind]</para>
    <para>The name of the target function of the next call cost
    lines.</para>
  </listitem>

</itemizedlist>

<para>The last type of body line provides specific costs that are not just
related to one position, as regular cost lines are. It starts with specific
strings similar to position name specifications.</para>

<itemizedlist>

  <listitem>
    <para><computeroutput>calls=count target-position</computeroutput> [Callgrind]</para>
    <para>Call executed "count" times to "target-position".
    After a "calls=" line there MUST be a cost line.
    This provides the source position
    of the call and the cost spent in the called function in total.</para>
  </listitem>

  <listitem>
    <para><computeroutput>jump=count target-position</computeroutput> [Callgrind]</para>
    <para>Unconditional jump, executed "count" times, to "target-position".</para>
  </listitem>

  <listitem>
    <para><computeroutput>jcnd=exe-count jump-count target-position</computeroutput> [Callgrind]</para>
    <para>Conditional jump, executed "exe-count" times with "jump-count" jumps
    happening (rest is fall-through) to "target-position".</para>
  </listitem>

</itemizedlist>

</sect2>

</sect1>

</chapter>