<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>


<chapter id="ms-manual" xreflabel="Massif: a heap profiler">
  <title>Massif: a heap profiler</title>

<para>To use this tool, you must specify
<option>--tool=massif</option> on the Valgrind
command line.</para>

<sect1 id="ms-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>Massif is a heap profiler. It measures how much heap memory your
program uses. This includes both the useful space, and the extra bytes
allocated for book-keeping and alignment purposes. It can also
measure the size of your program's stack(s), although it does not do so by
default.</para>

<para>Heap profiling can help you reduce the amount of memory your program
uses. On modern machines with virtual memory, this provides the following
benefits:</para>

<itemizedlist>
  <listitem><para>It can speed up your program -- a smaller
    program will interact better with your machine's caches and
    avoid paging.</para></listitem>

  <listitem><para>If your program uses lots of memory, it will
    reduce the chance that it exhausts your machine's swap
    space.</para></listitem>
</itemizedlist>

<para>Also, there are certain space leaks that aren't detected by
traditional leak-checkers, such as Memcheck's. That's because
the memory isn't ever actually lost -- a pointer remains to it --
but it's not in use. Programs that have leaks like this can
unnecessarily increase the amount of memory they are using over
time.
Massif can help identify these leaks.</para>

<para>Importantly, Massif tells you not only how much heap memory your
program is using, but also gives very detailed information that indicates
which parts of your program are responsible for allocating the heap memory.
</para>

</sect1>


<sect1 id="ms-manual.using" xreflabel="Using Massif and ms_print">
<title>Using Massif and ms_print</title>

<para>First off, as for the other Valgrind tools, you should compile with
debugging info (the <option>-g</option> option). It shouldn't
matter much what optimisation level you compile your program with, as this
is unlikely to affect the heap memory usage.</para>

<para>Then, you need to run Massif itself to gather the profiling
information, and then run ms_print to present it in a readable way.</para>


<sect2 id="ms-manual.anexample" xreflabel="An Example">
<title>An Example Program</title>

<para>An example will make things clear. Consider the following C program
(annotated with line numbers) which allocates a number of different blocks
on the heap.</para>

<screen><![CDATA[
1      #include <stdlib.h>
2
3      void g(void)
4      {
5         malloc(4000);
6      }
7
8      void f(void)
9      {
10        malloc(2000);
11        g();
12     }
13
14     int main(void)
15     {
16        int i;
17        int* a[10];
18
19        for (i = 0; i < 10; i++) {
20           a[i] = malloc(1000);
21        }
22
23        f();
24
25        g();
26
27        for (i = 0; i < 10; i++) {
28           free(a[i]);
29        }
30
31        return 0;
32     }
]]></screen>

</sect2>


<sect2 id="ms-manual.running-massif" xreflabel="Running Massif">
<title>Running Massif</title>

<para>To gather heap profiling information about the program
<computeroutput>prog</computeroutput>, type:</para>
<screen><![CDATA[
valgrind --tool=massif prog
]]></screen>

<para>The program will execute (slowly).
Upon completion, no summary
statistics are printed to Valgrind's commentary; all of Massif's profiling
data is written to a file. By default, this file is called
<filename>massif.out.&lt;pid&gt;</filename>, where
<filename>&lt;pid&gt;</filename> is the process ID, although this filename
can be changed with the <option>--massif-out-file</option> option.</para>

</sect2>


<sect2 id="ms-manual.running-ms_print" xreflabel="Running ms_print">
<title>Running ms_print</title>

<para>To see the information gathered by Massif in an easy-to-read form, use
ms_print. If the output file's name is
<filename>massif.out.12345</filename>, type:</para>
<screen><![CDATA[
ms_print massif.out.12345]]></screen>

<para>ms_print will produce (a) a graph showing the memory consumption over
the program's execution, and (b) detailed information about the responsible
allocation sites at various points in the program, including the point of
peak memory allocation.
The use of a separate script for presenting the
results is deliberate: it separates the data gathering from its
presentation, and means that new methods of presenting the data can be added in
the future.</para>

</sect2>


<sect2 id="ms-manual.theoutputpreamble" xreflabel="The Output Preamble">
<title>The Output Preamble</title>

<para>After running this program under Massif, the first part of ms_print's
output contains a preamble which just states how the program, Massif and
ms_print were each invoked:</para>

<screen><![CDATA[
--------------------------------------------------------------------------------
Command:            example
Massif arguments:   (none)
ms_print arguments: massif.out.12797
--------------------------------------------------------------------------------
]]></screen>

</sect2>


<sect2 id="ms-manual.theoutputgraph" xreflabel="The Output Graph">
<title>The Output Graph</title>

<para>The next part is the graph that shows how memory consumption changed
as the program executed:</para>

<screen><![CDATA[
    KB
19.63^                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                      :#
     |                                                                      :#
     |                                                                      :#
   0 +----------------------------------------------------------------------->ki
     0                                                                   113.4

Number of snapshots: 25
 Detailed snapshots: [9, 14 (peak), 24]
]]></screen>

<para>Why is most of the graph empty, with only a couple of bars at the very
end? By default, Massif uses "instructions executed" as the unit of time.
For very short-run programs such as the example, most of the executed
instructions involve the loading and dynamic linking of the program. The
execution of <computeroutput>main</computeroutput> (and thus the heap
allocations) occurs only at the very end.
For a short-running program like
this, we can use the <option>--time-unit=B</option> option
to specify that we want the time unit to instead be the number of bytes
allocated/deallocated on the heap and stack(s).</para>

<para>If we re-run the program under Massif with this option, and then
re-run ms_print, we get this more useful graph:</para>

<screen><![CDATA[
19.63^                                               ###
     |                                               #
     |                                               #  ::
     |                                               #  : :::
     |                                      :::::::::#  : :  ::
     |                                      :        #  : :  : ::
     |                                      :        #  : :  : : :::
     |                                      :        #  : :  : : :  ::
     |                            :::::::::::        #  : :  : : :  : :::
     |                            :         :        #  : :  : : :  : :  ::
     |                        :::::         :        #  : :  : : :  : :  : ::
     |                     @@@:   :         :        #  : :  : : :  : :  : : @
     |                   ::@  :   :         :        #  : :  : : :  : :  : : @
     |                :::: @  :   :         :        #  : :  : : :  : :  : : @
     |              :::  : @  :   :         :        #  : :  : : :  : :  : : @
     |            ::: :  : @  :   :         :        #  : :  : : :  : :  : : @
     |         :::: : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |       :::  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |    :::: :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |  :::  : :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
   0 +----------------------------------------------------------------------->KB
     0                                                                   29.48

Number of snapshots: 25
 Detailed snapshots: [9, 14 (peak), 24]
]]></screen>

<para>The size of the graph can be changed with ms_print's
<option>--x</option> and <option>--y</option> options. Each vertical bar
represents a snapshot, i.e. a measurement of the memory usage at a certain
point in time. If the next snapshot is more than one column away, a
horizontal line of characters is drawn from the top of the snapshot to just
before the next snapshot column. The text at the bottom shows that 25
snapshots were taken for this program, which is one per heap
allocation/deallocation, plus a couple of extras. Massif starts by taking
snapshots for every heap allocation/deallocation, but as a program runs for
longer, it takes snapshots less frequently.
It also discards older
snapshots as the program goes on; when it reaches the maximum number of
snapshots (100 by default, although changeable with the
<option>--max-snapshots</option> option) half of them are
deleted. This means that a reasonable number of snapshots are always
maintained.</para>

<para>Most snapshots are <emphasis>normal</emphasis>, and only basic
information is recorded for them. Normal snapshots are represented in the
graph by bars consisting of ':' characters.</para>

<para>Some snapshots are <emphasis>detailed</emphasis>. Information about
where allocations happened is recorded for these snapshots, as we will see
shortly. Detailed snapshots are represented in the graph by bars consisting
of '@' characters. The text at the bottom shows that 3 detailed
snapshots were taken for this program (snapshots 9, 14 and 24). By default,
every 10th snapshot is detailed, although this can be changed via the
<option>--detailed-freq</option> option.</para>

<para>Finally, there is at most one <emphasis>peak</emphasis> snapshot. The
peak snapshot is a detailed snapshot, and records the point where memory
consumption was greatest. The peak snapshot is represented in the graph by
a bar consisting of '#' characters. The text at the bottom shows
that snapshot 14 was the peak.</para>

<para>Massif's determination of when the peak occurred can be wrong, for
two reasons.</para>

<itemizedlist>
  <listitem><para>Peak snapshots are only ever taken after a deallocation
  happens. This avoids lots of unnecessary peak snapshot recordings
  (imagine what happens if your program allocates a lot of heap blocks in
  succession, hitting a new peak every time). But it means that if your
  program never deallocates any blocks, no peak will be recorded.
  It also
  means that if your program does deallocate blocks but later allocates to a
  higher peak without subsequently deallocating, the reported peak will be
  too low.
  </para>
  </listitem>

  <listitem><para>Even with this behaviour, recording the peak accurately
  is slow. So by default Massif records a peak whose size is within 1% of
  the size of the true peak. This inaccuracy in the peak measurement can be
  changed with the <option>--peak-inaccuracy</option> option.</para>
  </listitem>
</itemizedlist>

<para>The following graph is from an execution of Konqueror, the KDE web
browser. It shows what graphs for larger programs look like.</para>
<screen><![CDATA[
    MB
3.952^                                                                    #
     |                                                                   @#:
     |                                                                 :@@#:
     |                                                            @@::::@@#:
     |                                                            @ :: :@@#::
     |                                                          @@@ :: :@@#::
     |                                                       @@:@@@ :: :@@#::
     |                                                    :::@ :@@@ :: :@@#::
     |                                                    : :@ :@@@ :: :@@#::
     |                                                  :@: :@ :@@@ :: :@@#::
     |                                                @@:@: :@ :@@@ :: :@@#:::
     |                                    : ::      ::@@:@: :@ :@@@ :: :@@#:::
     |                        :@@:    ::::: ::::@@@:::@@:@: :@ :@@@ :: :@@#:::
     |                     ::::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
     |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
     |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
     |                    @: ::@@:::::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
     |                ::@@@: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
     |             :::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
     |           @@:::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
   0 +----------------------------------------------------------------------->Mi
     0                                                                   626.4

Number of snapshots: 63
 Detailed snapshots: [3, 4, 10, 11, 15, 16, 29, 33, 34, 36, 39, 41,
                      42, 43, 44, 49, 50, 51, 53, 55, 56, 57 (peak)]
]]></screen>

<para>Note that the larger size units are KB, MB, GB, etc. As is typical
for memory measurements, these are based on a multiplier of 1024, rather
than the standard SI multiplier of 1000.
Strictly speaking, they should be
written KiB, MiB, GiB, etc.</para>

</sect2>


<sect2 id="ms-manual.thesnapshotdetails" xreflabel="The Snapshot Details">
<title>The Snapshot Details</title>

<para>Returning to our example, the graph is followed by the detailed
information for each snapshot. The first nine snapshots are normal, so only
a small amount of information is recorded for each one:</para>
<screen><![CDATA[
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
  0              0                0                0             0            0
  1          1,008            1,008            1,000             8            0
  2          2,016            2,016            2,000            16            0
  3          3,024            3,024            3,000            24            0
  4          4,032            4,032            4,000            32            0
  5          5,040            5,040            5,000            40            0
  6          6,048            6,048            6,000            48            0
  7          7,056            7,056            7,000            56            0
  8          8,064            8,064            8,000            64            0
]]></screen>

<para>Each normal snapshot records several things.</para>

<itemizedlist>
  <listitem><para>Its number.</para></listitem>

  <listitem><para>The time it was taken. In this case, the time unit is
  bytes, due to the use of
  <option>--time-unit=B</option>.</para></listitem>

  <listitem><para>The total memory consumption at that point.</para></listitem>

  <listitem><para>The number of useful heap bytes allocated at that point.
  This reflects the number of bytes asked for by the
  program.</para></listitem>

  <listitem><para>The number of extra heap bytes allocated at that point.
  This reflects the number of bytes allocated in excess of what the program
  asked for. There are two sources of extra heap bytes.</para>

  <para>First, every heap block has administrative bytes associated with it.
  The exact number of administrative bytes depends on the details of the
  allocator.
  By default Massif assumes 8 bytes per block, as can be seen
  from the example, but this number can be changed via the
  <option>--heap-admin</option> option.</para>

  <para>Second, allocators often round up the number of bytes asked for to a
  multiple of some larger number, usually 8 or 16. This is required to ensure
  that elements within the block are suitably aligned. If N bytes are asked
  for, Massif rounds N up to the nearest multiple of the value specified by the
  <option><xref linkend="opt.alignment"/></option> option.
  </para></listitem>

  <listitem><para>The size of the stack(s). By default, stack profiling is
  off as it slows Massif down greatly. Therefore, the stack column is zero
  in the example. Stack profiling can be turned on with the
  <option>--stacks=yes</option> option.

  </para></listitem>
</itemizedlist>

<para>The next snapshot is detailed. As well as the basic counts, it gives
an allocation tree which indicates exactly which pieces of code were
responsible for allocating heap memory:</para>

<screen><![CDATA[
  9          9,072            9,072            9,000            72            0
99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.21% (9,000B) 0x804841A: main (example.c:20)
]]></screen>

<para>The allocation tree can be read from the top down. The first line
indicates all heap allocation functions such as <function>malloc</function>
and C++ <function>new</function>. All heap allocations go through these
functions, and so all 9,000 useful bytes (which is 99.21% of all allocated
bytes) go through them. But how were <function>malloc</function> and
<function>new</function> called? At this point, every allocation so far has
been due to line 20 inside <function>main</function>, hence the second line
in the tree.
The
<option>-&gt;</option> indicates that <function>main</function> (line 20) called
<function>malloc</function>.</para>

<para>Let's see what the subsequent output shows happened next:</para>

<screen><![CDATA[
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 10         10,080           10,080           10,000            80            0
 11         12,088           12,088           12,000            88            0
 12         16,096           16,096           16,000            96            0
 13         20,104           20,104           20,000           104            0
 14         20,104           20,104           20,000           104            0
99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->49.74% (10,000B) 0x804841A: main (example.c:20)
|
->39.79% (8,000B) 0x80483C2: g (example.c:5)
| ->19.90% (4,000B) 0x80483E2: f (example.c:11)
| | ->19.90% (4,000B) 0x8048431: main (example.c:23)
| |
| ->19.90% (4,000B) 0x8048436: main (example.c:25)
|
->09.95% (2,000B) 0x80483DA: f (example.c:10)
  ->09.95% (2,000B) 0x8048431: main (example.c:23)
]]></screen>

<para>The first four snapshots are similar to the previous ones. But then
the global allocation peak is reached, and a detailed snapshot (number 14)
is taken. Its allocation tree shows that 20,000B of useful heap memory has
been allocated, and the lines and arrows indicate that this is from three
different code locations: line 20, which is responsible for 10,000B
(49.74%); line 5, which is responsible for 8,000B (39.79%); and line 10,
which is responsible for 2,000B (9.95%).</para>

<para>We can then drill down further in the allocation tree.
For example,
of the 8,000B asked for by line 5, half of it was due to a call from line
11, and half was due to a call from line 25.</para>

<para>In short, Massif collates the stack trace of every single allocation
point in the program into a single tree, which gives a complete picture at
a particular point in time of how and why all heap memory was
allocated.</para>

<para>Note that the tree entries correspond not to functions, but to
individual code locations. For example, if function <function>A</function>
calls <function>malloc</function>, and function <function>B</function> calls
<function>A</function> twice, once on line 10 and once on line 11, then
the two calls will result in two distinct stack traces in the tree. In
contrast, if <function>B</function> calls <function>A</function> repeatedly
from line 15 (e.g. due to a loop), then each of those calls will be
represented by the same stack trace in the tree.</para>

<para>Note also that each tree entry with children in the example satisfies an
invariant: the entry's size is equal to the sum of its children's sizes.
For example, the first entry has size 20,000B, and its children have sizes
10,000B, 8,000B, and 2,000B. In general, this invariant almost always
holds. However, in rare circumstances stack traces can be malformed, in
which case a stack trace can be a sub-trace of another stack trace. This
means that some entries in the tree may not satisfy the invariant -- the
entry's size will be greater than the sum of its children's sizes. This is
not a big problem, but could make the results confusing. Massif can
sometimes detect when this happens; if it does, it issues a warning:</para>

<screen><![CDATA[
Warning: Malformed stack trace detected. In Massif's output,
         the size of an entry's child entries may not sum up
         to the entry's size as they normally do.
]]></screen>

<para>However, Massif does not detect and warn about every such occurrence.
Fortunately, malformed stack traces are rare in practice.</para>

<para>Returning now to ms_print's output, the final part is similar:</para>

<screen><![CDATA[
--------------------------------------------------------------------------------
  n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 15         21,112           19,096           19,000            96            0
 16         22,120           18,088           18,000            88            0
 17         23,128           17,080           17,000            80            0
 18         24,136           16,072           16,000            72            0
 19         25,144           15,064           15,000            64            0
 20         26,152           14,056           14,000            56            0
 21         27,160           13,048           13,000            48            0
 22         28,168           12,040           12,000            40            0
 23         29,176           11,032           11,000            32            0
 24         30,184           10,024           10,000            24            0
99.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->79.81% (8,000B) 0x80483C2: g (example.c:5)
| ->39.90% (4,000B) 0x80483E2: f (example.c:11)
| | ->39.90% (4,000B) 0x8048431: main (example.c:23)
| |
| ->39.90% (4,000B) 0x8048436: main (example.c:25)
|
->19.95% (2,000B) 0x80483DA: f (example.c:10)
| ->19.95% (2,000B) 0x8048431: main (example.c:23)
|
->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
]]></screen>

<para>The final detailed snapshot shows how the heap looked at termination.
The 00.00% entry represents the code locations for which memory was
allocated and then freed (line 20 in this case, the memory for which was
freed on line 28). However, no code location details are given for this
entry; by default, Massif only records the details for code locations
responsible for more than 1% of useful memory bytes, and ms_print likewise
only prints the details for code locations responsible for more than 1%.
The entries that do not meet this threshold are aggregated.
This avoids
filling up the output with large numbers of unimportant entries. The
thresholds can be changed with the
<option>--threshold</option> option that both Massif and
ms_print support.</para>

</sect2>


<sect2 id="ms-manual.forkingprograms" xreflabel="Forking Programs">
<title>Forking Programs</title>
<para>If your program forks, the child will inherit all the profiling data that
has been gathered for the parent.</para>

<para>If the output file format string (controlled by
<option>--massif-out-file</option>) does not contain <option>%p</option>, then
the outputs from the parent and child will be intermingled in a single output
file, which will almost certainly make it unreadable by ms_print.</para>
</sect2>


<sect2 id="ms-manual.not-measured"
       xreflabel="Measuring All Memory in a Process">
<title>Measuring All Memory in a Process</title>
<para>
It is worth emphasising that by default Massif measures only heap memory, i.e.
memory allocated with
<function>malloc</function>,
<function>calloc</function>,
<function>realloc</function>,
<function>memalign</function>,
<function>new</function>,
<function>new[]</function>,
and a few other, similar functions. (And it can optionally measure stack
memory, of course.) This means it does <emphasis>not</emphasis> directly
measure memory allocated with lower-level system calls such as
<function>mmap</function>,
<function>mremap</function>, and
<function>brk</function>.
</para>

<para>
Heap allocation functions such as <function>malloc</function> are built on
top of these system calls. For example, when needed, an allocator will
typically call <function>mmap</function> to allocate a large chunk of
memory, and then hand over pieces of that memory chunk to the client program
in response to calls to <function>malloc</function> et al.
Massif directly
measures only these higher-level <function>malloc</function> et al calls,
not the lower-level system calls.
</para>

<para>
Furthermore, a client program may use these lower-level system calls
directly to allocate memory. By default, Massif does not measure these. Nor
does it measure the size of code, data and BSS segments. Therefore, the
numbers reported by Massif may be significantly smaller than those reported by
tools such as <filename>top</filename> that measure a program's total size in
memory.
</para>

<para>
However, if you wish to measure <emphasis>all</emphasis> the memory used by
your program, you can use the <option>--pages-as-heap=yes</option> option.
When this option is enabled, Massif's normal heap block profiling is replaced
by lower-level page profiling. Every page allocated via
<function>mmap</function> and similar system calls is treated as a distinct
block. This means that code, data and BSS segments are all measured, as they
are just memory pages. Even the stack is measured, since it is ultimately
allocated (and extended when necessary) via <function>mmap</function>; for
this reason <option>--stacks=yes</option> is not allowed in conjunction with
<option>--pages-as-heap=yes</option>.
</para>

<para>
After <option>--pages-as-heap=yes</option> is used, ms_print's output is
mostly unchanged. One difference is that the start of each detailed snapshot
says:
</para>

<screen><![CDATA[
(page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
]]></screen>

<para>instead of the usual:</para>

<screen><![CDATA[
(heap allocation functions) malloc/new/new[], --alloc-fns, etc.
]]></screen>

<para>
The stack traces in the output may be more difficult to read, and interpreting
them may require some detailed understanding of the lower levels of a program
like the memory allocators. But for some programs having the full information
about memory usage can be very useful.
</para>

</sect2>


<sect2 id="ms-manual.acting" xreflabel="Action on Massif's Information">
<title>Acting on Massif's Information</title>
<para>Massif's information is generally fairly easy to act upon. The
obvious place to start looking is the peak snapshot.</para>

<para>It can also be useful to look at the overall shape of the graph, to
see if memory usage climbs and falls as you expect; spikes in the graph
might be worth investigating.</para>

<para>The detailed snapshots can get quite large. It is worth viewing them
in a very wide window. It's also a good idea to view them with a text
editor. That makes it easy to scroll up and down while keeping the cursor
in a particular column, which makes following the allocation chains easier.
</para>

</sect2>

</sect1>


<sect1 id="ms-manual.options" xreflabel="Massif Command-line Options">
<title>Massif Command-line Options</title>

<para>Massif-specific command-line options are:</para>

<!-- start of xi:include in the manpage -->
<variablelist id="ms.opts.list">

  <varlistentry id="opt.heap" xreflabel="--heap">
    <term>
      <option><![CDATA[--heap=<yes|no> [default: yes] ]]></option>
    </term>
    <listitem>
      <para>Specifies whether heap profiling should be done.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.heap-admin" xreflabel="--heap-admin">
    <term>
      <option><![CDATA[--heap-admin=<size> [default: 8] ]]></option>
    </term>
    <listitem>
      <para>If heap profiling is enabled, gives the number of administrative
      bytes per block to use. This should be an estimate of the average,
      since it may vary. For example, the allocator used by
      glibc on Linux requires somewhere between 4 and
      15 bytes per block, depending on various factors. That allocator also
      requires admin space for freed blocks, but Massif cannot
      account for this.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.stacks" xreflabel="--stacks">
    <term>
      <option><![CDATA[--stacks=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Specifies whether stack profiling should be done. This option
      slows Massif down greatly, and so is off by default. Note that Massif
      assumes that the main stack has size zero at start-up. This is not
      true, but doing otherwise accurately is difficult.
      Furthermore,
      starting at zero better indicates the size of the part of the main
      stack that a user program actually has control over.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.pages-as-heap" xreflabel="--pages-as-heap">
    <term>
      <option><![CDATA[--pages-as-heap=<yes|no> [default: no] ]]></option>
    </term>
    <listitem>
      <para>Tells Massif to profile memory at the page level rather
      than at the malloc'd block level. See above for details.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.depth" xreflabel="--depth">
    <term>
      <option><![CDATA[--depth=<number> [default: 30] ]]></option>
    </term>
    <listitem>
      <para>Maximum depth of the allocation trees recorded for detailed
      snapshots. Increasing it will make Massif run somewhat more slowly,
      use more memory, and produce bigger output files.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.alloc-fn" xreflabel="--alloc-fn">
    <term>
      <option><![CDATA[--alloc-fn=<name> ]]></option>
    </term>
    <listitem>
      <para>Functions specified with this option will be treated as though
      they were heap allocation functions such as
      <function>malloc</function>. This is useful for functions that are
      wrappers to <function>malloc</function> or <function>new</function>,
      which can fill up the allocation trees with uninteresting information.
      This option can be specified multiple times on the command line, to
      name multiple functions.</para>

      <para>Note that the named function will only be treated this way if it is
      the top entry in a stack trace, or just below another function treated
      this way. For example, if you have a function
      <function>malloc1</function> that wraps <function>malloc</function>,
      and <function>malloc2</function> that wraps
      <function>malloc1</function>, just specifying
      <option>--alloc-fn=malloc2</option> will have no effect.
      You need to
      specify <option>--alloc-fn=malloc1</option> as well. This is a little
      inconvenient, but the reason is that checking for allocation functions
      is slow, and it saves a lot of time if Massif can stop looking through
      the stack trace entries as soon as it finds one that doesn't match
      rather than having to continue through all the entries.</para>

      <para>Note that C++ names are demangled. Note also that overloaded
      C++ names must be written in full. Single quotes may be necessary to
      prevent the shell from breaking them up. For example:
<screen><![CDATA[
--alloc-fn='operator new(unsigned, std::nothrow_t const&)'
]]></screen>
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.ignore-fn" xreflabel="--ignore-fn">
    <term>
      <option><![CDATA[--ignore-fn=<name> ]]></option>
    </term>
    <listitem>
      <para>Any direct heap allocation (i.e. a call to
      <function>malloc</function>, <function>new</function>, etc, or a call
      to a function named by an <option>--alloc-fn</option>
      option) that occurs in a function specified by this option will be
      ignored. This is mostly useful for testing purposes. This option can
      be specified multiple times on the command line, to name multiple
      functions.
      </para>

      <para>Any <function>realloc</function> of an ignored block will
      also be ignored, even if the <function>realloc</function> call does
      not occur in an ignored function. This avoids the possibility of
      negative heap sizes if ignored blocks are shrunk with
      <function>realloc</function>.
      </para>

      <para>The rules for writing C++ function names are the same as
      for <option>--alloc-fn</option> above.
      </para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.threshold" xreflabel="--threshold">
    <term>
      <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
    </term>
    <listitem>
      <para>The significance threshold for heap allocations, as a
      percentage of total memory size. Allocation tree entries that account
      for less than this will be aggregated. Note that this should be
      specified in tandem with ms_print's option of the same name.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.peak-inaccuracy" xreflabel="--peak-inaccuracy">
    <term>
      <option><![CDATA[--peak-inaccuracy=<m.n> [default: 1.0] ]]></option>
    </term>
    <listitem>
      <para>Massif does not necessarily record the actual global memory
      allocation peak; by default it records a peak only when the global
      memory allocation size exceeds the previous peak by at least 1.0%.
      This is because there can be many local allocation peaks along the way,
      and doing a detailed snapshot for every one would be expensive and
      wasteful, as all but one of them will be later discarded. This
      inaccuracy can be changed (even to 0.0%) via this option, but Massif
      will run drastically slower as the number approaches zero.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.time-unit" xreflabel="--time-unit">
    <term>
      <option><![CDATA[--time-unit=<i|ms|B> [default: i] ]]></option>
    </term>
    <listitem>
      <para>The time unit used for the profiling. There are three
      possibilities: instructions executed (i), which is good for most
      cases; real (wallclock) time (ms, i.e.
      milliseconds), which is
      sometimes useful; and bytes allocated/deallocated on the heap and/or
      stack (B), which is useful for very short-run programs, and for
      testing purposes, because it is the most reproducible across different
      machines.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.detailed-freq" xreflabel="--detailed-freq">
    <term>
      <option><![CDATA[--detailed-freq=<n> [default: 10] ]]></option>
    </term>
    <listitem>
      <para>Frequency of detailed snapshots. With
      <option>--detailed-freq=1</option>, every snapshot is
      detailed.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.max-snapshots" xreflabel="--max-snapshots">
    <term>
      <option><![CDATA[--max-snapshots=<n> [default: 100] ]]></option>
    </term>
    <listitem>
      <para>The maximum number of snapshots recorded. If set to N, for all
      programs except very short-running ones, the final number of snapshots
      will be between N/2 and N.</para>
    </listitem>
  </varlistentry>

  <varlistentry id="opt.massif-out-file" xreflabel="--massif-out-file">
    <term>
      <option><![CDATA[--massif-out-file=<file> [default: massif.out.%p] ]]></option>
    </term>
    <listitem>
      <para>Write the profile data to <computeroutput>file</computeroutput>
      rather than to the default output file,
      <computeroutput>massif.out.&lt;pid&gt;</computeroutput>. The
      <option>%p</option> and <option>%q</option> format specifiers can be
      used to embed the process ID and/or the contents of an environment
      variable in the name, as is the case for the core option
      <option><xref linkend="opt.log-file"/></option>.
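      As a rough illustration only, assuming a program called
      <computeroutput>prog</computeroutput> and assuming that the
      <computeroutput>HOSTNAME</computeroutput> environment variable is set
      (both are placeholders chosen for this sketch), an invocation along the
      following lines would embed the host name and the process ID in the
      output file name:
<screen><![CDATA[
valgrind --tool=massif --massif-out-file=massif.%q{HOSTNAME}.%p.out prog
]]></screen>
      Naming files this way can help keep profiles apart when several
      processes, or several machines, write into the same directory.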
      </para>
    </listitem>
  </varlistentry>

</variablelist>
<!-- end of xi:include in the manpage -->

</sect1>


<sect1 id="ms-manual.clientreqs" xreflabel="Client requests">
<title>Massif Client Requests</title>

<para>Massif does not have a <filename>massif.h</filename> file, but it does
implement two of the core client requests:
<function>VALGRIND_MALLOCLIKE_BLOCK</function> and
<function>VALGRIND_FREELIKE_BLOCK</function>; they are described in
<xref linkend="manual-core-adv.clientreq"/>.
</para>

</sect1>


<sect1 id="ms-manual.ms_print-options" xreflabel="ms_print Command-line Options">
<title>ms_print Command-line Options</title>

<para>ms_print's options are:</para>

<!-- start of xi:include in the manpage -->
<variablelist id="ms_print.opts.list">

  <varlistentry>
    <term>
      <option><![CDATA[-h --help ]]></option>
    </term>
    <listitem>
      <para>Show the help message.</para>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term>
      <option><![CDATA[--version ]]></option>
    </term>
    <listitem>
      <para>Show the version number.</para>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term>
      <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
    </term>
    <listitem>
      <para>Same as Massif's <option>--threshold</option> option, but
      applied after profiling rather than during.</para>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term>
      <option><![CDATA[--x=<4..1000> [default: 72] ]]></option>
    </term>
    <listitem>
      <para>Width of the graph, in columns.</para>
    </listitem>
  </varlistentry>

  <varlistentry>
    <term>
      <option><![CDATA[--y=<4..1000> [default: 20] ]]></option>
    </term>
    <listitem>
      <para>Height of the graph, in rows.</para>
    </listitem>
  </varlistentry>

</variablelist>

</sect1>

<sect1
id="ms-manual.fileformat" xreflabel="fileformat">
<title>Massif's Output File Format</title>
<para>Massif's file format is plain text (i.e. not binary) and deliberately
easy to read for both humans and machines. Nonetheless, the exact format
is not described here. This is because the format is currently very
Massif-specific. In the future we hope to make the format more general, and
thus suitable for possible use with other tools. Once this has been done,
the format will be documented here.</para>

</sect1>

</chapter>