      1 <?xml version="1.0"?> <!-- -*- sgml -*- -->
      2 <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
      3           "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
      4 [ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
      5 
      6 
      7 <chapter id="ms-manual" xreflabel="Massif: a heap profiler">
      8   <title>Massif: a heap profiler</title>
      9 
     10 <para>To use this tool, you must specify
     11 <option>--tool=massif</option> on the Valgrind
     12 command line.</para>
     13 
     14 <sect1 id="ms-manual.overview" xreflabel="Overview">
     15 <title>Overview</title>
     16 
     17 <para>Massif is a heap profiler.  It measures how much heap memory your
     18 program uses.  This includes both the useful space, and the extra bytes
     19 allocated for book-keeping and alignment purposes.  It can also
     20 measure the size of your program's stack(s), although it does not do so by
     21 default.</para>
     22 
     23 <para>Heap profiling can help you reduce the amount of memory your program
     24 uses.  On modern machines with virtual memory, this provides the following
     25 benefits:</para>
     26 
     27 <itemizedlist>
     28   <listitem><para>It can speed up your program -- a smaller
     29     program will interact better with your machine's caches and
     30     avoid paging.</para></listitem>
     31 
     32   <listitem><para>If your program uses lots of memory, it will
     33     reduce the chance that it exhausts your machine's swap
     34     space.</para></listitem>
     35 </itemizedlist>
     36 
     37 <para>Also, there are certain space leaks that aren't detected by
     38 traditional leak-checkers, such as Memcheck's.  That's because
     39 the memory isn't ever actually lost -- a pointer remains to it --
     40 but it's not in use.  Programs that have leaks like this can
     41 unnecessarily increase the amount of memory they are using over
     42 time.  Massif can help identify these leaks.</para>
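<para>As a hedged illustration of such a leak, the following C fragment
keeps every record it is given on a global list.  The names
<function>remember</function>, <computeroutput>history</computeroutput> and
<computeroutput>history_len</computeroutput> are hypothetical and exist only
for this sketch.  Because <computeroutput>history</computeroutput> always
points at the list, a leak checker has no lost block to report, yet the heap
keeps growing for as long as the function is called.</para>

<screen><![CDATA[
#include <stdlib.h>
#include <string.h>

/* One remembered record; the list only ever grows. */
struct record {
   struct record* next;
   char*          text;
};

static struct record* history     = NULL;   /* still reachable at exit */
static size_t         history_len = 0;

/* Stores a private copy of text on the list and returns the new length
   of the list, or 0 if an allocation failed.  Nothing is ever removed. */
size_t remember(const char* text)
{
   size_t len = strlen(text) + 1;
   struct record* r = malloc(sizeof *r);
   if (r == NULL) return 0;
   r->text = malloc(len);
   if (r->text == NULL) { free(r); return 0; }
   memcpy(r->text, text, len);
   r->next = history;
   history = r;
   return ++history_len;
}
]]></screen>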
     43 
<para>Importantly, Massif tells you not only how much heap memory your
program is using, but also, in considerable detail, which parts of your
program are responsible for allocating that memory.
</para>
     48 
     49 </sect1>
     50 
     51 
     52 <sect1 id="ms-manual.using" xreflabel="Using Massif and ms_print">
     53 <title>Using Massif and ms_print</title>
     54 
     55 <para>First off, as for the other Valgrind tools, you should compile with
     56 debugging info (the <option>-g</option> option).  It shouldn't
     57 matter much what optimisation level you compile your program with, as this
     58 is unlikely to affect the heap memory usage.</para>
     59 
     60 <para>Then, you need to run Massif itself to gather the profiling
     61 information, and then run ms_print to present it in a readable way.</para>
     62 
     63 
     64 
     65 
     66 <sect2 id="ms-manual.anexample" xreflabel="An Example">
     67 <title>An Example Program</title>
     68 
     69 <para>An example will make things clear.  Consider the following C program
     70 (annotated with line numbers) which allocates a number of different blocks
     71 on the heap.</para>
     72 
<screen><![CDATA[
 1      #include <stdlib.h>
 2
 3      void g(void)
 4      {
 5         malloc(4000);
 6      }
 7
 8      void f(void)
 9      {
10         malloc(2000);
11         g();
12      }
13
14      int main(void)
15      {
16         int i;
17         int* a[10];
18
19         for (i = 0; i < 10; i++) {
20            a[i] = malloc(1000);
21         }
22
23         f();
24
25         g();
26
27         for (i = 0; i < 10; i++) {
28            free(a[i]);
29         }
30
31         return 0;
32      }
]]></screen>
    107 
    108 </sect2>
    109 
    110 
    111 <sect2 id="ms-manual.running-massif" xreflabel="Running Massif">
    112 <title>Running Massif</title>
    113 
    114 <para>To gather heap profiling information about the program
    115 <computeroutput>prog</computeroutput>, type:</para>
    116 <screen><![CDATA[
    117 valgrind --tool=massif prog
    118 ]]></screen>
    119 
    120 <para>The program will execute (slowly).  Upon completion, no summary
    121 statistics are printed to Valgrind's commentary;  all of Massif's profiling
    122 data is written to a file.  By default, this file is called
    123 <filename>massif.out.&lt;pid&gt;</filename>, where
    124 <filename>&lt;pid&gt;</filename> is the process ID, although this filename
    125 can be changed with the <option>--massif-out-file</option> option.</para>
    126 
    127 </sect2>
    128 
    129 
    130 <sect2 id="ms-manual.running-ms_print" xreflabel="Running ms_print">
    131 <title>Running ms_print</title>
    132 
    133 <para>To see the information gathered by Massif in an easy-to-read form, use
    134 ms_print.  If the output file's name is
    135 <filename>massif.out.12345</filename>, type:</para>
    136 <screen><![CDATA[
    137 ms_print massif.out.12345]]></screen>
    138 
    139 <para>ms_print will produce (a) a graph showing the memory consumption over
    140 the program's execution, and (b) detailed information about the responsible
    141 allocation sites at various points in the program, including the point of
    142 peak memory allocation.  The use of a separate script for presenting the
    143 results is deliberate:  it separates the data gathering from its
    144 presentation, and means that new methods of presenting the data can be added in
    145 the future.</para>
    146 
    147 </sect2>
    148 
    149 
    150 <sect2 id="ms-manual.theoutputpreamble" xreflabel="The Output Preamble">
    151 <title>The Output Preamble</title>
    152 
    153 <para>After running this program under Massif, the first part of ms_print's
    154 output contains a preamble which just states how the program, Massif and
    155 ms_print were each invoked:</para>
    156 
    157 <screen><![CDATA[
    158 --------------------------------------------------------------------------------
    159 Command:            example
    160 Massif arguments:   (none)
    161 ms_print arguments: massif.out.12797
    162 --------------------------------------------------------------------------------
    163 ]]></screen>
    164 
    165 </sect2>
    166 
    167 
    168 <sect2 id="ms-manual.theoutputgraph" xreflabel="The Output Graph">
    169 <title>The Output Graph</title>
    170 
    171 <para>The next part is the graph that shows how memory consumption occurred
    172 as the program executed:</para>
    173 
<screen><![CDATA[
    KB
19.63^                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                       #
     |                                                                      :#
     |                                                                      :#
     |                                                                      :#
   0 +----------------------------------------------------------------------->ki
     0                                                                   113.4


Number of snapshots: 25
 Detailed snapshots: [9, 14 (peak), 24]
]]></screen>
    202 
    203 <para>Why is most of the graph empty, with only a couple of bars at the very
    204 end?  By default, Massif uses "instructions executed" as the unit of time.
    205 For very short-run programs such as the example, most of the executed
    206 instructions involve the loading and dynamic linking of the program.  The
    207 execution of <computeroutput>main</computeroutput> (and thus the heap
allocations) occurs only at the very end.  For a short-running program like
    209 this, we can use the <option>--time-unit=B</option> option
    210 to specify that we want the time unit to instead be the number of bytes
    211 allocated/deallocated on the heap and stack(s).</para>
    212 
    213 <para>If we re-run the program under Massif with this option, and then
    214 re-run ms_print, we get this more useful graph:</para>
    215 
<screen><![CDATA[
    KB
19.63^                                               ###
     |                                               #
     |                                               #  ::
     |                                               #  : :::
     |                                      :::::::::#  : :  ::
     |                                      :        #  : :  : ::
     |                                      :        #  : :  : : :::
     |                                      :        #  : :  : : :  ::
     |                            :::::::::::        #  : :  : : :  : :::
     |                            :         :        #  : :  : : :  : :  ::
     |                        :::::         :        #  : :  : : :  : :  : ::
     |                     @@@:   :         :        #  : :  : : :  : :  : : @
     |                   ::@  :   :         :        #  : :  : : :  : :  : : @
     |                :::: @  :   :         :        #  : :  : : :  : :  : : @
     |              :::  : @  :   :         :        #  : :  : : :  : :  : : @
     |            ::: :  : @  :   :         :        #  : :  : : :  : :  : : @
     |         :::: : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |       :::  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |    :::: :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
     |  :::  : :  : : :  : @  :   :         :        #  : :  : : :  : :  : : @
   0 +----------------------------------------------------------------------->KB
     0                                                                   29.48

Number of snapshots: 25
 Detailed snapshots: [9, 14 (peak), 24]
]]></screen>
    242 
    243 <para>The size of the graph can be changed with ms_print's
    244 <option>--x</option> and <option>--y</option> options.  Each vertical bar
    245 represents a snapshot, i.e. a measurement of the memory usage at a certain
    246 point in time.  If the next snapshot is more than one column away, a
    247 horizontal line of characters is drawn from the top of the snapshot to just
before the next snapshot column.  The text at the bottom shows that 25
    249 snapshots were taken for this program, which is one per heap
    250 allocation/deallocation, plus a couple of extras.  Massif starts by taking
    251 snapshots for every heap allocation/deallocation, but as a program runs for
    252 longer, it takes snapshots less frequently.  It also discards older
    253 snapshots as the program goes on;  when it reaches the maximum number of
    254 snapshots (100 by default, although changeable with the
    255 <option>--max-snapshots</option> option) half of them are
    256 deleted.  This means that a reasonable number of snapshots are always
    257 maintained.</para>
    258 
    259 <para>Most snapshots are <emphasis>normal</emphasis>, and only basic
    260 information is recorded for them.  Normal snapshots are represented in the
    261 graph by bars consisting of ':' characters.</para>
    262 
<para>Some snapshots are <emphasis>detailed</emphasis>.  Information about
where allocations happened is recorded for these snapshots, as we will see
shortly.  Detailed snapshots are represented in the graph by bars consisting
of '@' characters.  The text at the bottom shows that 3 detailed
    267 snapshots were taken for this program (snapshots 9, 14 and 24).  By default,
    268 every 10th snapshot is detailed, although this can be changed via the
    269 <option>--detailed-freq</option> option.</para>
    270 
    271 <para>Finally, there is at most one <emphasis>peak</emphasis> snapshot.  The
    272 peak snapshot is a detailed snapshot, and records the point where memory
    273 consumption was greatest.  The peak snapshot is represented in the graph by
    274 a bar consisting of '#' characters.  The text at the bottom shows
    275 that snapshot 14 was the peak.</para>
    276 
    277 <para>Massif's determination of when the peak occurred can be wrong, for
    278 two reasons.</para>
    279 
    280 <itemizedlist>
    281   <listitem><para>Peak snapshots are only ever taken after a deallocation
    282   happens.  This avoids lots of unnecessary peak snapshot recordings
    283   (imagine what happens if your program allocates a lot of heap blocks in
    284   succession, hitting a new peak every time).  But it means that if your
    285   program never deallocates any blocks, no peak will be recorded.  It also
    286   means that if your program does deallocate blocks but later allocates to a
    287   higher peak without subsequently deallocating, the reported peak will be
    288   too low.
    289   </para>
    290   </listitem>
    291 
    292   <listitem><para>Even with this behaviour, recording the peak accurately
    293   is slow.  So by default Massif records a peak whose size is within 1% of
    294   the size of the true peak.  This inaccuracy in the peak measurement can be
    295   changed with the <option>--peak-inaccuracy</option> option.</para>
    296   </listitem>
    297 </itemizedlist>
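<para>The second point can be pictured with a small sketch.  This is a
simplified model of a tolerance check, not Massif's actual code; the function
name and its parameters are hypothetical.  It treats a new high-water mark as
worth recording only when it exceeds the last recorded peak by more than the
permitted inaccuracy.</para>

<screen><![CDATA[
#include <stddef.h>

/* Returns 1 if a new peak snapshot is worth recording, 0 otherwise.
   current_szB       : total size now, in bytes
   recorded_peak_szB : size of the last recorded peak, in bytes
   inaccuracy_pct    : tolerance in percent, e.g. 1.0 */
int worth_recording_peak(size_t current_szB, size_t recorded_peak_szB,
                         double inaccuracy_pct)
{
   double slack;
   if (current_szB <= recorded_peak_szB) return 0;   /* not a new peak */
   slack = (double)recorded_peak_szB * inaccuracy_pct / 100.0;
   return (double)(current_szB - recorded_peak_szB) > slack;
}
]]></screen>

<para>With a tolerance of 1.0 and a recorded peak of 1,000 bytes, this model
ignores a size of 1,005 bytes and records a size of 1,011 bytes.  With a
tolerance of 0.0 it records every new peak.</para>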
    298 
    299 <para>The following graph is from an execution of Konqueror, the KDE web
    300 browser.  It shows what graphs for larger programs look like.</para>
    301 <screen><![CDATA[
    302     MB
    303 3.952^                                                                    # 
    304      |                                                                   @#:
    305      |                                                                 :@@#:
    306      |                                                            @@::::@@#: 
    307      |                                                            @ :: :@@#::
    308      |                                                          @@@ :: :@@#::
    309      |                                                       @@:@@@ :: :@@#::
    310      |                                                    :::@ :@@@ :: :@@#::
    311      |                                                    : :@ :@@@ :: :@@#::
    312      |                                                  :@: :@ :@@@ :: :@@#:: 
    313      |                                                @@:@: :@ :@@@ :: :@@#:::
    314      |                           :       ::         ::@@:@: :@ :@@@ :: :@@#:::
    315      |                        :@@:    ::::: ::::@@@:::@@:@: :@ :@@@ :: :@@#:::
    316      |                     ::::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
    317      |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
    318      |                    @: ::@@:  ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
    319      |                    @: ::@@:::::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
    320      |                ::@@@: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
    321      |             :::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
    322      |           @@:::::@ @: ::@@:: ::: ::::::: @  :::@@:@: :@ :@@@ :: :@@#:::
    323    0 +----------------------------------------------------------------------->Mi
    324      0                                                                   626.4
    325 
    326 Number of snapshots: 63
    327  Detailed snapshots: [3, 4, 10, 11, 15, 16, 29, 33, 34, 36, 39, 41,
    328                       42, 43, 44, 49, 50, 51, 53, 55, 56, 57 (peak)]
    329 ]]></screen>
    330 
    331 <para>Note that the larger size units are KB, MB, GB, etc.  As is typical
    332 for memory measurements, these are based on a multiplier of 1024, rather
    333 than the standard SI multiplier of 1000.  Strictly speaking, they should be
    334 written KiB, MiB, GiB, etc.</para>
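<para>As a worked check of the 1024 multiplier, the helper below (a
hypothetical name, used only here) converts a byte count into the KB shown on
the axes.  In the small example program, the peak total of 20,104 bytes and
the final time of 30,184 bytes reported in the snapshot table correspond to
the 19.63 and 29.48 printed on the graph.</para>

<screen><![CDATA[
/* Converts a byte count to the "KB" used on the graph axes, i.e. units
   of 1024 bytes rather than 1000. */
double bytes_to_kb(unsigned long bytes)
{
   return (double)bytes / 1024.0;
}
]]></screen>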
    335 
    336 </sect2>
    337 
    338 
    339 <sect2 id="ms-manual.thesnapshotdetails" xreflabel="The Snapshot Details">
    340 <title>The Snapshot Details</title>
    341 
    342 <para>Returning to our example, the graph is followed by the detailed
    343 information for each snapshot.  The first nine snapshots are normal, so only
    344 a small amount of information is recorded for each one:</para>
    345 <screen><![CDATA[
    346 --------------------------------------------------------------------------------
    347   n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
    348 --------------------------------------------------------------------------------
    349   0              0                0                0             0            0
    350   1          1,008            1,008            1,000             8            0
    351   2          2,016            2,016            2,000            16            0
    352   3          3,024            3,024            3,000            24            0
    353   4          4,032            4,032            4,000            32            0
    354   5          5,040            5,040            5,000            40            0
    355   6          6,048            6,048            6,000            48            0
    356   7          7,056            7,056            7,000            56            0
    357   8          8,064            8,064            8,000            64            0
    358 ]]></screen>
    359 
    360 <para>Each normal snapshot records several things.</para>
    361 
    362 <itemizedlist>
    363   <listitem><para>Its number.</para></listitem>
    364 
    365   <listitem><para>The time it was taken. In this case, the time unit is
    366   bytes, due to the use of
    367   <option>--time-unit=B</option>.</para></listitem>
    368 
    369   <listitem><para>The total memory consumption at that point.</para></listitem>
    370 
    371   <listitem><para>The number of useful heap bytes allocated at that point.
    372   This reflects the number of bytes asked for by the
    373   program.</para></listitem>
    374 
    375   <listitem><para>The number of extra heap bytes allocated at that point.
    376   This reflects the number of bytes allocated in excess of what the program
    377   asked for.  There are two sources of extra heap bytes.</para>
    378   
    379   <para>First, every heap block has administrative bytes associated with it.
    380   The exact number of administrative bytes depends on the details of the
    381   allocator.  By default Massif assumes 8 bytes per block, as can be seen
    382   from the example, but this number can be changed via the
    383   <option>--heap-admin</option> option.</para>
    384 
    385   <para>Second, allocators often round up the number of bytes asked for to a
    386   larger number, usually 8 or 16.  This is required to ensure that elements
    387   within the block are suitably aligned.  If N bytes are asked for, Massif
    388   rounds N up to the nearest multiple of the value specified by the
    389   <option><xref linkend="opt.alignment"/></option> option.
    390   </para></listitem>
    391 
    392   <listitem><para>The size of the stack(s).  By default, stack profiling is
    393   off as it slows Massif down greatly.  Therefore, the stack column is zero
    394   in the example.  Stack profiling can be turned on with the
  <option>--stacks=yes</option> option.</para></listitem>
    398 </itemizedlist>
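<para>The extra-heap column can be reproduced with a little arithmetic.  The
sketch below is not Massif's code, and the function names are hypothetical.
It assumes 8 administrative bytes per block, as stated above, and an
alignment of 8; the alignment value is an assumption chosen because it
matches the table, and the default in your Valgrind version may differ.</para>

<screen><![CDATA[
#include <stddef.h>

/* Rounds n up to the nearest multiple of align (align must be > 0). */
size_t round_up(size_t n, size_t align)
{
   return (n + align - 1) / align * align;
}

/* Extra heap bytes charged to one block: the padding added by rounding
   plus the per-block administrative bytes. */
size_t extra_heap_bytes(size_t requested, size_t admin, size_t align)
{
   return (round_up(requested, align) - requested) + admin;
}
]]></screen>

<para>Under these assumptions a 1,000-byte request costs 8 extra bytes, which
matches snapshot 1, and eight such requests cost 64 extra bytes, which matches
snapshot 8.  With an alignment of 16 the same request would be rounded up to
1,008 bytes and cost 16 extra bytes.</para>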
    399 
    400 <para>The next snapshot is detailed.  As well as the basic counts, it gives
    401 an allocation tree which indicates exactly which pieces of code were
    402 responsible for allocating heap memory:</para>
    403 
    404 <screen><![CDATA[
    405   9          9,072            9,072            9,000            72            0
    406 99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
    407 ->99.21% (9,000B) 0x804841A: main (example.c:20)
    408 ]]></screen>
    409 
    410 <para>The allocation tree can be read from the top down.  The first line
    411 indicates all heap allocation functions such as <function>malloc</function>
    412 and C++ <function>new</function>.  All heap allocations go through these
    413 functions, and so all 9,000 useful bytes (which is 99.21% of all allocated
bytes) go through them.  But how were <function>malloc</function> and
<function>new</function> called?  At this point, every allocation so far has
been due to line 20 inside <function>main</function>, hence the second line in
the tree.  The <computeroutput>-></computeroutput> indicates that
<function>main</function> (line 20) called <function>malloc</function>.</para>
    419 
    420 <para>Let's see what the subsequent output shows happened next:</para>
    421 
    422 <screen><![CDATA[
    423 --------------------------------------------------------------------------------
    424   n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
    425 --------------------------------------------------------------------------------
    426  10         10,080           10,080           10,000            80            0
    427  11         12,088           12,088           12,000            88            0
    428  12         16,096           16,096           16,000            96            0
    429  13         20,104           20,104           20,000           104            0
    430  14         20,104           20,104           20,000           104            0
    431 99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
    432 ->49.74% (10,000B) 0x804841A: main (example.c:20)
    433 | 
    434 ->39.79% (8,000B) 0x80483C2: g (example.c:5)
    435 | ->19.90% (4,000B) 0x80483E2: f (example.c:11)
    436 | | ->19.90% (4,000B) 0x8048431: main (example.c:23)
    437 | |   
    438 | ->19.90% (4,000B) 0x8048436: main (example.c:25)
    439 |   
    440 ->09.95% (2,000B) 0x80483DA: f (example.c:10)
    441   ->09.95% (2,000B) 0x8048431: main (example.c:23)
    442 ]]></screen>
    443 
    444 <para>The first four snapshots are similar to the previous ones.  But then
    445 the global allocation peak is reached, and a detailed snapshot (number 14)
    446 is taken.  Its allocation tree shows that 20,000B of useful heap memory has
    447 been allocated, and the lines and arrows indicate that this is from three
    448 different code locations: line 20, which is responsible for 10,000B
    449 (49.74%);  line 5, which is responsible for 8,000B (39.79%); and line 10,
    450 which is responsible for 2,000B (9.95%).</para>
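<para>Note that these percentages are taken relative to the snapshot's total
of 20,104 bytes, not relative to the 20,000 useful bytes.  The helper below
(a hypothetical name) reproduces them; rounding to the nearest hundredth is an
assumption inferred from the printed values.</para>

<screen><![CDATA[
/* Returns part as a percentage of total, in hundredths of a percent and
   rounded to nearest, so 4974 means 49.74%.  total must be non-zero. */
unsigned long percent_hundredths(unsigned long long part,
                                 unsigned long long total)
{
   return (unsigned long)((part * 10000ULL + total / 2) / total);
}
]]></screen>

<para>For snapshot 14 this gives 4974 for line 20, 3979 for line 5, 995 for
line 10, and 9948 for the root of the tree.</para>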
    451 
    452 <para>We can then drill down further in the allocation tree.  For example,
    453 of the 8,000B asked for by line 5, half of it was due to a call from line
    454 11, and half was due to a call from line 25.</para>
    455 
    456 <para>In short, Massif collates the stack trace of every single allocation
    457 point in the program into a single tree, which gives a complete picture at
    458 a particular point in time of how and why all heap memory was
    459 allocated.</para>
    460 
    461 <para>Note that the tree entries correspond not to functions, but to
    462 individual code locations.  For example, if function <function>A</function>
    463 calls <function>malloc</function>, and function <function>B</function> calls
    464 <function>A</function> twice, once on line 10 and once on line 11, then
    465 the two calls will result in two distinct stack traces in the tree.  In
    466 contrast, if <function>B</function> calls <function>A</function> repeatedly
    467 from line 15 (e.g. due to a loop), then each of those calls will be
    468 represented by the same stack trace in the tree.</para>
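<para>A minimal sketch of this situation follows.  The block size and the
shape of <function>B</function> are assumptions made for illustration, and the
line numbers differ from those in the text.</para>

<screen><![CDATA[
#include <stdlib.h>

/* A is the only function that calls malloc directly. */
void* A(void)
{
   return malloc(100);
}

/* B reaches A from three distinct source lines. */
void B(void* out[5])
{
   int i;
   out[0] = A();              /* first call site: its own stack trace  */
   out[1] = A();              /* second call site: a second trace      */
   for (i = 2; i < 5; i++) {
      out[i] = A();           /* one call site, executed three times:
                                 all three blocks share one trace      */
   }
}
]]></screen>

<para>If this were profiled, one would expect three entries beneath the
<function>malloc</function> call in <function>A</function>: 100 bytes for each
of the first two call sites, and 300 bytes for the call site inside the
loop.</para>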
    469 
    470 <para>Note also that each tree entry with children in the example satisfies an
    471 invariant: the entry's size is equal to the sum of its children's sizes.
    472 For example, the first entry has size 20,000B, and its children have sizes
    473 10,000B, 8,000B, and 2,000B.  In general, this invariant almost always
    474 holds.  However, in rare circumstances stack traces can be malformed, in
    475 which case a stack trace can be a sub-trace of another stack trace.  This
    476 means that some entries in the tree may not satisfy the invariant -- the
    477 entry's size will be greater than the sum of its children's sizes.  This is
    478 not a big problem, but could make the results confusing.  Massif can
    479 sometimes detect when this happens;  if it does, it issues a warning:</para>
    480 
    481 <screen><![CDATA[
    482 Warning: Malformed stack trace detected.  In Massif's output,
    483          the size of an entry's child entries may not sum up
    484          to the entry's size as they normally do.
    485 ]]></screen>
    486 
    487 <para>However, Massif does not detect and warn about every such occurrence.
    488 Fortunately, malformed stack traces are rare in practice.</para>
    489 
    490 <para>Returning now to ms_print's output, the final part is similar:</para>
    491 
    492 <screen><![CDATA[
    493 --------------------------------------------------------------------------------
    494   n        time(B)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
    495 --------------------------------------------------------------------------------
    496  15         21,112           19,096           19,000            96            0
    497  16         22,120           18,088           18,000            88            0
    498  17         23,128           17,080           17,000            80            0
    499  18         24,136           16,072           16,000            72            0
    500  19         25,144           15,064           15,000            64            0
    501  20         26,152           14,056           14,000            56            0
    502  21         27,160           13,048           13,000            48            0
    503  22         28,168           12,040           12,000            40            0
    504  23         29,176           11,032           11,000            32            0
    505  24         30,184           10,024           10,000            24            0
    506 99.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
    507 ->79.81% (8,000B) 0x80483C2: g (example.c:5)
    508 | ->39.90% (4,000B) 0x80483E2: f (example.c:11)
    509 | | ->39.90% (4,000B) 0x8048431: main (example.c:23)
    510 | |   
    511 | ->39.90% (4,000B) 0x8048436: main (example.c:25)
    512 |   
    513 ->19.95% (2,000B) 0x80483DA: f (example.c:10)
    514 | ->19.95% (2,000B) 0x8048431: main (example.c:23)
    515 |   
    516 ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
    517 ]]></screen>
    518 
    519 <para>The final detailed snapshot shows how the heap looked at termination.
    520 The 00.00% entry represents the code locations for which memory was
    521 allocated and then freed (line 20 in this case, the memory for which was
    522 freed on line 28).  However, no code location details are given for this
    523 entry;  by default, Massif only records the details for code locations
    524 responsible for more than 1% of useful memory bytes, and ms_print likewise
    525 only prints the details for code locations responsible for more than 1%.
    526 The entries that do not meet this threshold are aggregated.  This avoids
    527 filling up the output with large numbers of unimportant entries.  The
    528 thresholds can be changed with the
    529 <option>--threshold</option> option that both Massif and
    530 ms_print support.</para>
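<para>The aggregation can be sketched as follows.  This is a simplified model
over a flat list of entries, not the code used by Massif or ms_print; the
function name is hypothetical, and treating an entry exactly at the threshold
as significant is an assumption.</para>

<screen><![CDATA[
#include <stddef.h>

/* Sums the sizes of all entries whose share of total_szB is below
   threshold_pct, and stores how many entries were merged in *n_merged.
   total_szB must be non-zero. */
size_t aggregate_small_entries(const size_t* sizes, size_t n,
                               size_t total_szB, double threshold_pct,
                               size_t* n_merged)
{
   size_t i, merged_szB = 0;
   *n_merged = 0;
   for (i = 0; i < n; i++) {
      double share = 100.0 * (double)sizes[i] / (double)total_szB;
      if (share < threshold_pct) {
         merged_szB += sizes[i];
         (*n_merged)++;
      }
   }
   return merged_szB;
}
]]></screen>

<para>Applied to the final snapshot, with entries of 8,000, 2,000 and 0 bytes
out of a total of 10,024 bytes and a threshold of 1.0, the sketch merges one
entry of 0 bytes, which is consistent with the last line of the tree
above.</para>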
    531 
    532 </sect2>
    533 
    534 
    535 <sect2 id="ms-manual.forkingprograms" xreflabel="Forking Programs">
    536 <title>Forking Programs</title>
    537 <para>If your program forks, the child will inherit all the profiling data that
    538 has been gathered for the parent.</para>
    539 
    540 <para>If the output file format string (controlled by
    541 <option>--massif-out-file</option>) does not contain <option>%p</option>, then
    542 the outputs from the parent and child will be intermingled in a single output
    543 file, which will almost certainly make it unreadable by ms_print.</para>
    544 </sect2>
    545 
    546 
    547 <sect2 id="ms-manual.not-measured"
    548        xreflabel="Measuring All Memory in a Process">
    549 <title>Measuring All Memory in a Process</title>
    550 <para>
    551 It is worth emphasising that by default Massif measures only heap memory, i.e.
    552 memory allocated with
    553 <function>malloc</function>,
    554 <function>calloc</function>,
    555 <function>realloc</function>,
    556 <function>memalign</function>,
    557 <function>new</function>,
    558 <function>new[]</function>,
    559 and a few other, similar functions.  (And it can optionally measure stack
    560 memory, of course.)  This means it does <emphasis>not</emphasis> directly
    561 measure memory allocated with lower-level system calls such as
    562 <function>mmap</function>,
    563 <function>mremap</function>, and
    564 <function>brk</function>.  
    565 </para>
    566 
    567 <para>
    568 Heap allocation functions such as <function>malloc</function> are built on
    569 top of these system calls.  For example, when needed, an allocator will
    570 typically call <function>mmap</function> to allocate a large chunk of
    571 memory, and then hand over pieces of that memory chunk to the client program
    572 in response to calls to <function>malloc</function> et al.  Massif directly
    573 measures only these higher-level <function>malloc</function> et al calls,
    574 not the lower-level system calls.
    575 </para>
    576 
    577 <para>
    578 Furthermore, a client program may use these lower-level system calls
    579 directly to allocate memory.  By default, Massif does not measure these.  Nor
    580 does it measure the size of code, data and BSS segments.  Therefore, the
    581 numbers reported by Massif may be significantly smaller than those reported by
    582 tools such as <filename>top</filename> that measure a program's total size in
    583 memory.
    584 </para>
    585 
    586 <para>
However, if you wish to measure <emphasis>all</emphasis> the memory used by
your program, you can use the <option>--pages-as-heap=yes</option> option.  When this
    589 option is enabled, Massif's normal heap block profiling is replaced by
    590 lower-level page profiling.  Every page allocated via
    591 <function>mmap</function> and similar system calls is treated as a distinct
    592 block.  This means that code, data and BSS segments are all measured, as they
    593 are just memory pages.  Even the stack is measured, since it is ultimately
    594 allocated (and extended when necessary) via <function>mmap</function>;  for
    595 this reason <option>--stacks=yes</option> is not allowed in conjunction with
    596 <option>--pages-as-heap=yes</option>.
    597 </para>
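<para>
To see the difference, consider the following minimal C sketch, which obtains
memory once through <function>malloc</function> and once through
<function>mmap</function> directly.  The file name, function names and sizes
are hypothetical.  By default only the <function>malloc</function>'d block
should show up in Massif's figures;  with
<option>--pages-as-heap=yes</option> the directly mapped region should be
counted as well, along with every other page the process maps.
</para>

<programlisting><![CDATA[
/* pagesdemo.c (hypothetical name).
 *
 * Build and compare two runs, for example:
 *   gcc -g -O0 -DDEMO_MAIN pagesdemo.c -o pagesdemo
 *   valgrind --tool=massif ./pagesdemo
 *   valgrind --tool=massif --pages-as-heap=yes ./pagesdemo
 */
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

#define HEAP_BYTES ((size_t)512 * 1024)
#define MAP_BYTES  ((size_t)4 * 1024 * 1024)

/* Heap allocation: measured by Massif's default heap profiling. */
void *alloc_with_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        memset(p, 1, n);
    return p;
}

/* Direct page allocation: not measured unless --pages-as-heap=yes. */
void *alloc_with_mmap(size_t n)
{
    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memset(p, 2, n);
    return p;
}

#ifdef DEMO_MAIN
int main(void)
{
    void *heap_block = alloc_with_malloc(HEAP_BYTES);
    void *mapped     = alloc_with_mmap(MAP_BYTES);
    int ok = (heap_block != NULL && mapped != NULL);

    if (mapped != NULL)
        munmap(mapped, MAP_BYTES);
    free(heap_block);
    return ok ? 0 : 1;
}
#endif
]]></programlisting>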
    598 
    599 <para>
When <option>--pages-as-heap=yes</option> is used, ms_print's output is
    601 mostly unchanged.  One difference is that the start of each detailed snapshot
    602 says:
    603 </para>
    604 
    605 <screen><![CDATA[
    606 (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
    607 ]]></screen>
    608 
<para>instead of the usual:</para>
    610 
    611 <screen><![CDATA[
    612 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
    613 ]]></screen>
    614 
    615 <para>
The stack traces in the output may be more difficult to read, and interpreting
them may require a detailed understanding of the lower levels of a program,
such as its memory allocators.  But for some programs, having the full
information about memory usage can be very useful.
    620 </para>
    621 
    622 </sect2>
    623 
    624 
<sect2 id="ms-manual.acting" xreflabel="Acting on Massif's Information">
    626 <title>Acting on Massif's Information</title>
    627 <para>Massif's information is generally fairly easy to act upon.  The
    628 obvious place to start looking is the peak snapshot.</para>
    629 
    630 <para>It can also be useful to look at the overall shape of the graph, to
    631 see if memory usage climbs and falls as you expect;  spikes in the graph
    632 might be worth investigating.</para>
    633 
    634 <para>The detailed snapshots can get quite large.  It is worth viewing them
    635 in a very wide window.   It's also a good idea to view them with a text
    636 editor.  That makes it easy to scroll up and down while keeping the cursor
    637 in a particular column, which makes following the allocation chains easier.
    638 </para>
    639 
    640 </sect2>
    641 
    642 </sect1>
    643 
    644 
    645 <sect1 id="ms-manual.options" xreflabel="Massif Command-line Options">
    646 <title>Massif Command-line Options</title>
    647 
    648 <para>Massif-specific command-line options are:</para>
    649 
    650 <!-- start of xi:include in the manpage -->
    651 <variablelist id="ms.opts.list">
    652 
    653   <varlistentry id="opt.heap" xreflabel="--heap">
    654     <term>
    655       <option><![CDATA[--heap=<yes|no> [default: yes] ]]></option>
    656     </term>
    657     <listitem>
    658       <para>Specifies whether heap profiling should be done.</para>
    659     </listitem>
    660   </varlistentry>
    661 
    662   <varlistentry id="opt.heap-admin" xreflabel="--heap-admin">
    663     <term>
    664       <option><![CDATA[--heap-admin=<size> [default: 8] ]]></option>
    665     </term>
    666     <listitem>
    667       <para>If heap profiling is enabled, gives the number of administrative
    668       bytes per block to use.  This should be an estimate of the average,
    669       since it may vary.  For example, the allocator used by
      glibc on Linux requires between 4 and
    671       15 bytes per block, depending on various factors.  That allocator also
    672       requires admin space for freed blocks, but Massif cannot
    673       account for this.</para>
    674     </listitem>
    675   </varlistentry>
    676 
    677   <varlistentry id="opt.stacks" xreflabel="--stacks">
    678     <term>
    679       <option><![CDATA[--stacks=<yes|no> [default: no] ]]></option>
    680     </term>
    681     <listitem>
    682       <para>Specifies whether stack profiling should be done.  This option
    683       slows Massif down greatly, and so is off by default.  Note that Massif
    684       assumes that the main stack has size zero at start-up.  This is not
      true, but measuring its actual start-up size accurately is difficult.  Furthermore,
    686       starting at zero better indicates the size of the part of the main
    687       stack that a user program actually has control over.</para>
    688     </listitem>
    689   </varlistentry>
    690 
    691   <varlistentry id="opt.pages-as-heap" xreflabel="--pages-as-heap">
    692     <term>
    693       <option><![CDATA[--pages-as-heap=<yes|no> [default: no] ]]></option>
    694     </term>
    695     <listitem>
      <para>Tells Massif to profile memory at the page level rather
        than at the malloc'd block level.  See
        <xref linkend="ms-manual.not-measured"/> for details.
    698       </para>
    699     </listitem>
    700   </varlistentry>
    701 
    702   <varlistentry id="opt.depth" xreflabel="--depth">
    703     <term>
    704       <option><![CDATA[--depth=<number> [default: 30] ]]></option>
    705     </term>
    706     <listitem>
    707       <para>Maximum depth of the allocation trees recorded for detailed
    708       snapshots.  Increasing it will make Massif run somewhat more slowly,
    709       use more memory, and produce bigger output files.</para>
    710     </listitem>
    711   </varlistentry>
    712 
    713   <varlistentry id="opt.alloc-fn" xreflabel="--alloc-fn">
    714     <term>
    715       <option><![CDATA[--alloc-fn=<name> ]]></option>
    716     </term>
    717     <listitem>
    718       <para>Functions specified with this option will be treated as though
    719       they were a heap allocation function such as
    720       <function>malloc</function>.  This is useful for functions that are
    721       wrappers to <function>malloc</function> or <function>new</function>,
    722       which can fill up the allocation trees with uninteresting information.
    723       This option can be specified multiple times on the command line, to
    724       name multiple functions.</para>
    725 
    726       <para>Note that the named function will only be treated this way if it is
    727       the top entry in a stack trace, or just below another function treated
    728       this way.  For example, if you have a function
    729       <function>malloc1</function> that wraps <function>malloc</function>,
    730       and <function>malloc2</function> that wraps
    731       <function>malloc1</function>, just specifying
    732       <option>--alloc-fn=malloc2</option> will have no effect.  You need to
    733       specify <option>--alloc-fn=malloc1</option> as well.  This is a little
    734       inconvenient, but the reason is that checking for allocation functions
    735       is slow, and it saves a lot of time if Massif can stop looking through
    736       the stack trace entries as soon as it finds one that doesn't match
    737       rather than having to continue through all the entries.</para>
    738 
    739       <para>Note that C++ names are demangled.  Note also that overloaded
    740       C++ names must be written in full.  Single quotes may be necessary to
    741       prevent the shell from breaking them up.  For example:
    742 <screen><![CDATA[
    743 --alloc-fn='operator new(unsigned, std::nothrow_t const&)'
    744 ]]></screen>
    745       </para>
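      <para>The <function>malloc1</function> and
      <function>malloc2</function> case can be sketched in C as follows.
      The file name and the function <function>make_buffer</function> are
      hypothetical, and the wrappers are deliberately trivial.</para>

<programlisting><![CDATA[
/* wrappers.c (hypothetical name).
 *
 * Build without optimisation so the wrappers are not inlined, then name
 * both wrappers so that Massif treats them like malloc:
 *   gcc -g -O0 -DDEMO_MAIN wrappers.c -o wrappers
 *   valgrind --tool=massif --alloc-fn=malloc1 --alloc-fn=malloc2 ./wrappers
 */
#include <stdlib.h>
#include <string.h>

/* Wrapper around malloc: aborts instead of returning NULL. */
void *malloc1(size_t n)
{
    void *p = malloc(n != 0 ? n : 1);
    if (p == NULL)
        abort();
    return p;
}

/* Wrapper around malloc1: additionally zeroes the block. */
void *malloc2(size_t n)
{
    void *p = malloc1(n);
    memset(p, 0, n);
    return p;
}

/* The caller we actually want to see in the allocation trees. */
char *make_buffer(size_t n)
{
    return malloc2(n);
}

#ifdef DEMO_MAIN
int main(void)
{
    char *buf = make_buffer(4096);
    buf[0] = 'x';
    free(buf);
    return 0;
}
#endif
]]></programlisting>

      <para>With both options given, the allocation should be attributed to
      <function>make_buffer</function> rather than to the wrappers.  Giving
      only <option>--alloc-fn=malloc2</option> would leave
      <function>malloc1</function> at the top of the stack trace, so
      <function>malloc2</function> would not be matched.</para>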
    746       </listitem>
    747   </varlistentry>
    748 
    749   <varlistentry id="opt.ignore-fn" xreflabel="--ignore-fn">
    750     <term>
    751       <option><![CDATA[--ignore-fn=<name> ]]></option>
    752     </term>
    753     <listitem>
    754       <para>Any direct heap allocation (i.e. a call to
      <function>malloc</function>, <function>new</function>, etc., or a call
    756       to a function named by an <option>--alloc-fn</option>
    757       option) that occurs in a function specified by this option will be
    758       ignored.  This is mostly useful for testing purposes.  This option can
    759       be specified multiple times on the command line, to name multiple
    760       functions.
    761       </para>
    762       
    763       <para>Any <function>realloc</function> of an ignored block will
    764       also be ignored, even if the <function>realloc</function> call does
    765       not occur in an ignored function.  This avoids the possibility of
    766       negative heap sizes if ignored blocks are shrunk with
    767       <function>realloc</function>.
    768       </para>
    769       
    770       <para>The rules for writing C++ function names are the same as
    771       for <option>--alloc-fn</option> above.
    772       </para>
    773       </listitem>
    774   </varlistentry>
    775 
    776   <varlistentry id="opt.threshold" xreflabel="--threshold">
    777     <term>
    778       <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
    779     </term>
    780     <listitem>
    781       <para>The significance threshold for heap allocations, as a
    782       percentage of total memory size.  Allocation tree entries that account
    783       for less than this will be aggregated.  Note that this should be
    784       specified in tandem with ms_print's option of the same name.</para>
    785     </listitem>
    786   </varlistentry>
    787 
    788   <varlistentry id="opt.peak-inaccuracy" xreflabel="--peak-inaccuracy">
    789     <term>
    790       <option><![CDATA[--peak-inaccuracy=<m.n> [default: 1.0] ]]></option>
    791     </term>
    792     <listitem>
    793       <para>Massif does not necessarily record the actual global memory
    794       allocation peak;  by default it records a peak only when the global
    795       memory allocation size exceeds the previous peak by at least 1.0%.
    796       This is because there can be many local allocation peaks along the way,
    797       and doing a detailed snapshot for every one would be expensive and
    798       wasteful, as all but one of them will be later discarded.  This
    799       inaccuracy can be changed (even to 0.0%) via this option, but Massif
    800       will run drastically slower as the number approaches zero.</para>
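      <para>The rule can be illustrated with a small C sketch.  This is not
      Massif's actual implementation;  the function name and its parameters
      are hypothetical, and it only restates the "exceeds the previous peak
      by at least the given percentage" test described above.</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Returns 1 if `current` exceeds `previous_peak` by at least
 * `inaccuracy_percent` percent of `previous_peak`, and 0 otherwise.
 * Illustrative only: a restatement of the documented rule. */
int exceeds_peak(size_t current, size_t previous_peak,
                 double inaccuracy_percent)
{
    if (current <= previous_peak)
        return 0;
    double margin = (double)previous_peak * inaccuracy_percent / 100.0;
    return (double)(current - previous_peak) >= margin;
}
]]></programlisting>

      <para>Under this reading, with the default of 1.0 and a previous peak
      of 1000 bytes, a total of 1005 bytes would not be recorded as a new
      peak but a total of 1010 bytes would.  With
      <option>--peak-inaccuracy=0.0</option>, any increase over the previous
      peak qualifies.</para>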
    801     </listitem>
    802   </varlistentry>
    803 
    804   <varlistentry id="opt.time-unit" xreflabel="--time-unit">
    805     <term>
    806       <option><![CDATA[--time-unit=<i|ms|B> [default: i] ]]></option>
    807     </term>
    808     <listitem>
    809       <para>The time unit used for the profiling.  There are three
    810       possibilities: instructions executed (i), which is good for most
    811       cases; real (wallclock) time (ms, i.e. milliseconds), which is
    812       sometimes useful; and bytes allocated/deallocated on the heap and/or
    813       stack (B), which is useful for very short-run programs, and for
    814       testing purposes, because it is the most reproducible across different
      machines.</para>
    </listitem>
    816   </varlistentry>
    817 
    818   <varlistentry id="opt.detailed-freq" xreflabel="--detailed-freq">
    819     <term>
    820       <option><![CDATA[--detailed-freq=<n> [default: 10] ]]></option>
    821     </term>
    822     <listitem>
    823       <para>Frequency of detailed snapshots.  With
    824       <option>--detailed-freq=1</option>, every snapshot is
    825       detailed.</para>
    826     </listitem>
    827   </varlistentry>
    828 
    829   <varlistentry id="opt.max-snapshots" xreflabel="--max-snapshots">
    830     <term>
    831       <option><![CDATA[--max-snapshots=<n> [default: 100] ]]></option>
    832     </term>
    833     <listitem>
    834       <para>The maximum number of snapshots recorded.  If set to N, for all
    835       programs except very short-running ones, the final number of snapshots
    836       will be between N/2 and N.</para>
    837     </listitem>
    838   </varlistentry>
    839 
    840   <varlistentry id="opt.massif-out-file" xreflabel="--massif-out-file">
    841     <term>
    842       <option><![CDATA[--massif-out-file=<file> [default: massif.out.%p] ]]></option>
    843     </term>
    844     <listitem>
    845       <para>Write the profile data to <computeroutput>file</computeroutput>
    846       rather than to the default output file,
    847       <computeroutput>massif.out.&lt;pid&gt;</computeroutput>.  The
    848       <option>%p</option> and <option>%q</option> format specifiers can be
    849       used to embed the process ID and/or the contents of an environment
    850       variable in the name, as is the case for the core option
    851       <option><xref linkend="opt.log-file"/></option>.
    852       </para>
    853     </listitem>
    854   </varlistentry>
    855 
    856 </variablelist>
    857 <!-- end of xi:include in the manpage -->
    858 
    859 </sect1>
    860 
    861 
    862 <sect1 id="ms-manual.clientreqs" xreflabel="Client requests">
    863 <title>Massif Client Requests</title>
    864 
    865 <para>Massif does not have a <filename>massif.h</filename> file, but it does
    866 implement two of the core client requests:
    867 <function>VALGRIND_MALLOCLIKE_BLOCK</function> and
    868 <function>VALGRIND_FREELIKE_BLOCK</function>;  they are described in 
    869 <xref linkend="manual-core-adv.clientreq"/>.
    870 </para>
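<para>A minimal sketch of how a custom allocator might use these two requests
is shown below.  The pool allocator, its names and its sizes are hypothetical.
The macro arguments follow the definitions in
<filename>valgrind/valgrind.h</filename>: address, size, redzone size and an
is-zeroed flag for <function>VALGRIND_MALLOCLIKE_BLOCK</function>, and address
and redzone size for <function>VALGRIND_FREELIKE_BLOCK</function>.  The
fallback definitions are there only so that the sketch still compiles when
Valgrind's headers are not installed.</para>

<programlisting><![CDATA[
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef VALGRIND_MALLOCLIKE_BLOCK
/* Fallback no-ops: assumed only for building without Valgrind's headers. */
#  define VALGRIND_MALLOCLIKE_BLOCK(addr, sizeB, rzB, is_zeroed) ((void)0)
#  define VALGRIND_FREELIKE_BLOCK(addr, rzB) ((void)0)
#endif

#define POOL_BYTES ((size_t)64 * 1024)

/* A trivial bump allocator over a static arena (hypothetical example). */
static _Alignas(16) unsigned char pool[POOL_BYTES];
static size_t pool_used;

/* Hand out n bytes from the pool, or NULL if they do not fit. */
void *pool_alloc(size_t n)
{
    if (n == 0 || n > POOL_BYTES)
        return NULL;
    size_t rounded = (n + 15u) & ~(size_t)15u;   /* keep 16-byte alignment */
    if (rounded > POOL_BYTES - pool_used)
        return NULL;

    void *p = &pool[pool_used];
    pool_used += rounded;

    /* Report the block as if it came from malloc: no redzone, and the
     * contents are not promised to be zeroed. */
    VALGRIND_MALLOCLIKE_BLOCK(p, n, 0, 0);
    return p;
}

/* This sketch never reuses memory; it only reports the block as freed. */
void pool_free(void *p)
{
    if (p != NULL) {
        VALGRIND_FREELIKE_BLOCK(p, 0);
    }
}
]]></programlisting>

<para>Without the two requests, memory handed out by such an allocator would
not be visible to Massif as individual blocks, because no call to
<function>malloc</function> or a similar function is involved.</para>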
    871 
    872 </sect1>
    873 
    874 
    875 <sect1 id="ms-manual.ms_print-options" xreflabel="ms_print Command-line Options">
    876 <title>ms_print Command-line Options</title>
    877 
    878 <para>ms_print's options are:</para>
    879 
    880 <!-- start of xi:include in the manpage -->
    881 <variablelist id="ms_print.opts.list">
    882 
    883   <varlistentry>
    884     <term>
    885       <option><![CDATA[-h --help ]]></option>
    886     </term>
    887     <listitem>
    888       <para>Show the help message.</para>
    889     </listitem>
    890   </varlistentry>
    891 
    892   <varlistentry>
    893     <term>
    894       <option><![CDATA[--version ]]></option>
    895     </term>
    896     <listitem>
    897       <para>Show the version number.</para>
    898     </listitem>
    899   </varlistentry>
    900 
    901   <varlistentry>
    902     <term>
    903       <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option>
    904     </term>
    905     <listitem>
    906       <para>Same as Massif's <option>--threshold</option> option, but
    907       applied after profiling rather than during.</para>
    908     </listitem>
    909   </varlistentry>
    910 
    911   <varlistentry>
    912     <term>
      <option><![CDATA[--x=<4..1000> [default: 72] ]]></option>
    914     </term>
    915     <listitem>
    916       <para>Width of the graph, in columns.</para>
    917     </listitem>
    918   </varlistentry>
    919 
    920   <varlistentry>
    921     <term>
    922       <option><![CDATA[--y=<4..1000> [default: 20] ]]></option>
    923     </term>
    924     <listitem>
    925       <para>Height of the graph, in rows.</para>
    926     </listitem>
    927   </varlistentry>
    928 
    929 </variablelist>
    930 
    931 </sect1>
    932 
    933 <sect1 id="ms-manual.fileformat" xreflabel="fileformat">
    934 <title>Massif's Output File Format</title>
    935 <para>Massif's file format is plain text (i.e. not binary) and deliberately
    936 easy to read for both humans and machines.  Nonetheless, the exact format
    937 is not described here.  This is because the format is currently very
    938 Massif-specific.  In the future we hope to make the format more general, and
    939 thus suitable for possible use with other tools.  Once this has been done,
    940 the format will be documented here.</para>
    941 
    942 </sect1>
    943 
    944 </chapter>
    945