<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>10. DHAT: a dynamic heap analysis tool</title>
<link rel="stylesheet" href="vg_basic.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.75.2">
<link rel="home" href="index.html" title="Valgrind Documentation">
<link rel="up" href="manual.html" title="Valgrind User Manual">
<link rel="prev" href="ms-manual.html" title="9.Massif: a heap profiler">
<link rel="next" href="pc-manual.html" title="11.Ptrcheck: an experimental heap, stack and global array overrun detector">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div><table class="nav" width="100%" cellspacing="3" cellpadding="3" border="0" summary="Navigation header"><tr>
<td width="22px" align="center" valign="middle"><a accesskey="p" href="ms-manual.html"><img src="images/prev.png" width="18" height="21" border="0" alt="Prev"></a></td>
<td width="25px" align="center" valign="middle"><a accesskey="u" href="manual.html"><img src="images/up.png" width="21" height="18" border="0" alt="Up"></a></td>
<td width="31px" align="center" valign="middle"><a accesskey="h" href="index.html"><img src="images/home.png" width="27" height="20" border="0" alt="Home"></a></td>
<th align="center" valign="middle">Valgrind User Manual</th>
<td width="22px" align="center" valign="middle"><a accesskey="n" href="pc-manual.html"><img src="images/next.png" width="18" height="21" border="0" alt="Next"></a></td>
</tr></table></div>
<div class="chapter" title="10.DHAT: a dynamic heap analysis tool">
<div class="titlepage"><div><div><h2 class="title">
<a name="dh-manual"></a>10. DHAT: a dynamic heap analysis tool</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="dh-manual.html#dh-manual.overview">10.1. Overview</a></span></dt>
<dt><span class="sect1"><a href="dh-manual.html#dh-manual.understanding">10.2. Understanding DHAT's output</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="dh-manual.html#id534558">10.2.1. Interpreting the max-live, tot-alloc and deaths fields</a></span></dt>
<dt><span class="sect2"><a href="dh-manual.html#id534260">10.2.2. Interpreting the acc-ratios fields</a></span></dt>
<dt><span class="sect2"><a href="dh-manual.html#id534998">10.2.3. Interpreting "Aggregated access counts by offset" data</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="dh-manual.html#dh-manual.options">10.3. DHAT Command-line Options</a></span></dt>
</dl>
</div>
<p>To use this tool, you must specify
<code class="option">--tool=exp-dhat</code> on the Valgrind
command line.</p>
<div class="sect1" title="10.1.Overview">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="dh-manual.overview"></a>10.1. Overview</h2></div></div></div>
<p>DHAT is a tool for examining how programs use their heap
allocations.</p>
<p>It tracks the allocated blocks, and inspects every memory access
to determine which block, if any, it falls within.  The following data
is collected and presented per allocation point (allocation
stack):</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>total allocation (number of bytes and
  blocks)</p></li>
<li class="listitem"><p>maximum live volume (number of bytes and
  blocks)</p></li>
<li class="listitem"><p>average block lifetime (number of instructions
   between allocation and freeing)</p></li>
<li class="listitem"><p>average number of reads and writes to each byte in
   the block ("access ratios")</p></li>
<li class="listitem"><p>for allocation points which always allocate blocks
   of a single size, and that size is 4096 bytes or less: counts
   showing how often each byte offset inside the block is
   accessed</p></li>
</ul></div>
<p>Using these statistics it is possible to identify allocation
points with the following characteristics:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>potential process-lifetime leaks: blocks allocated
   by the point just accumulate, and are freed only at the end of the
   run.</p></li>
<li class="listitem"><p>excessive turnover: points which chew through a lot
  of heap, even if it is not held onto for very long.</p></li>
<li class="listitem"><p>excessively transient: points which allocate very
 short-lived blocks.</p></li>
<li class="listitem"><p>useless or underused allocations: blocks which are
  allocated but not completely filled in, or are filled in but not
  subsequently read.</p></li>
<li class="listitem"><p>blocks with inefficient layout -- areas never
  accessed, or with hot fields scattered throughout the
  block.</p></li>
</ul></div>
<p>As with the Massif heap profiler, DHAT measures program progress
by counting instructions, and so presents all age/time related figures
as instruction counts.  This sounds a little odd at first, but it
makes runs repeatable in a way which is not possible if CPU time is
used.</p>
</div>
<div class="sect1" title="10.2.Understanding DHAT's output">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="dh-manual.understanding"></a>10.2. Understanding DHAT's output</h2></div></div></div>
<p>DHAT provides a lot of useful information on dynamic heap usage.
Most of the art of using it is in interpretation of the resulting
numbers.  That is best illustrated via a set of examples.</p>
<div class="sect2" title="10.2.1.Interpreting the max-live, tot-alloc and deaths fields">
<div class="titlepage"><div><div><h3 class="title">
<a name="id534558"></a>10.2.1. Interpreting the max-live, tot-alloc and deaths fields</h3></div></div></div>
<div class="sect3" title="10.2.1.1.A simple example"><div class="titlepage"><div><div><h4 class="title">
<a name="id534635"></a>10.2.1.1. A simple example</h4></div></div></div></div>
<pre class="screen">
   ======== SUMMARY STATISTICS ========

   guest_insns:  1,045,339,534
   [...]
   max-live:    63,490 in 984 blocks
   tot-alloc:   1,904,700 in 29,520 blocks (avg size 64.52)
   deaths:      29,520, at avg age 22,227,424
   acc-ratios:  6.37 rd, 1.14 wr  (12,141,526 b-read, 2,174,460 b-written)
      at 0x4C275B8: malloc (vg_replace_malloc.c:236)
      by 0x40350E: tcc_malloc (tinycc.c:6712)
      by 0x404580: tok_alloc_new (tinycc.c:7151)
      by 0x40870A: next_nomacro1 (tinycc.c:9305)
</pre>
<p>Over the entire run of the program, this stack (allocation
point) allocated 29,520 blocks in total, containing 1,904,700 bytes in
total.  By looking at the max-live data, we see that not many blocks
were simultaneously live, though: at the peak, there were 63,490
allocated bytes in 984 blocks.  This tells us that the program is
steadily freeing such blocks as it runs, rather than hanging on to all
of them until the end and freeing them all.</p>
<p>The deaths entry tells us that 29,520 blocks allocated by this stack
died (were freed) during the run of the program.  Since 29,520 is
also the number of blocks allocated in total, that tells us that
all allocated blocks were freed by the end of the program.</p>
<p>It also tells us that the average age at death was 22,227,424
instructions.  From the summary statistics we see that the program ran
for 1,045,339,534 instructions, and so the average age at death is
about 2% of the program's total run time.</p>
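<p>The arithmetic behind these observations is easy to reproduce.
The following Python sketch is illustrative only: the function
<code class="function">summarise</code> and its parameter names are
assumptions made for this example and are not part of DHAT.  The
figures are copied from the record above.</p>
<pre class="programlisting">
```python
def summarise(guest_insns, tot_bytes, tot_blocks, deaths, avg_age):
    """Derive the figures discussed above from one allocation-point record.

    All parameter names are hypothetical; DHAT only prints the raw numbers.
    """
    return {
        # Average block size, as shown on the tot-alloc line.
        "avg_size": tot_bytes / tot_blocks,
        # True when every allocated block was freed during the run.
        "all_freed": deaths == tot_blocks,
        # Average age at death as a percentage of the whole run.
        "age_pct": 100.0 * avg_age / guest_insns,
    }


# Figures copied from the record above.
stats = summarise(
    guest_insns=1_045_339_534,
    tot_bytes=1_904_700,
    tot_blocks=29_520,
    deaths=29_520,
    avg_age=22_227_424,
)
print(f"avg size: {stats['avg_size']:.2f} bytes")
print(f"all freed: {stats['all_freed']}")
print(f"avg age at death: {stats['age_pct']:.1f}% of the run")
```
</pre>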
<div class="sect3" title="10.2.1.2.Example of a potential process-lifetime leak"><div class="titlepage"><div><div><h4 class="title">
<a name="id560570"></a>10.2.1.2. Example of a potential process-lifetime leak</h4></div></div></div></div>
<p>This next example (from a different program than the above)
shows a potential process-lifetime leak.  A process-lifetime leak
occurs when a program keeps allocating data, but only frees the
data just before it exits.  Hence the program's heap grows constantly
in size, yet Memcheck reports no leak, because the program has
freed up everything at exit.  This is particularly a hazard for
long-running programs.</p>
<pre class="screen">
   ======== SUMMARY STATISTICS ========

   guest_insns:  418,901,537
   [...]
   max-live:    32,512 in 254 blocks
   tot-alloc:   32,512 in 254 blocks (avg size 128.00)
   deaths:      254, at avg age 300,467,389
   acc-ratios:  0.26 rd, 0.20 wr  (8,756 b-read, 6,604 b-written)
      at 0x4C275B8: malloc (vg_replace_malloc.c:236)
      by 0x4C27632: realloc (vg_replace_malloc.c:525)
      by 0x56FF41D: QtFontStyle::pixelSize(unsigned short, bool) (qfontdatabase.cpp:269)
      by 0x5700D69: loadFontConfig() (qfontdatabase_x11.cpp:1146)
</pre>
<p>There are two tell-tale signs that this might be a
process-lifetime leak.  Firstly, the max-live and tot-alloc numbers
are identical.  The only way that can happen is if these blocks are
all allocated before any of them is deallocated.</p>
<p>Secondly, the average age at death (300 million insns) is about
72% of the total program lifetime (419 million insns), hence this is
not a transient spike of allocation followed by freeing -- rather,
the blocks stay live over a large part of the entire run.  One
interpretation is, roughly, that all 254 blocks were allocated in the
first half of the run, held onto for the second half, and then freed
just before exit.</p>
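<p>The two signs can be expressed as a simple check.  This is a
minimal sketch, not something DHAT provides: the function name, the
dictionary keys and the 0.5 cut-off are all assumptions chosen for
illustration, and a positive result only marks a record as worth a
closer look.</p>
<pre class="programlisting">
```python
def looks_like_lifetime_leak(rec, guest_insns, age_threshold=0.5):
    """Apply the two tell-tale signs described above to one record.

    `rec` is a plain dict with hypothetical keys; `age_threshold` is an
    arbitrary cut-off chosen for this sketch, not a DHAT setting.
    """
    # Sign 1: peak live volume equals total allocation, so nothing was
    # freed before everything had been allocated.
    never_recycled = (rec["max_live_bytes"] == rec["tot_bytes"]
                      and rec["max_live_blocks"] == rec["tot_blocks"])
    # Sign 2: blocks stay live for a large fraction of the run.
    long_lived = rec["avg_age"] / guest_insns >= age_threshold
    return never_recycled and long_lived


# The record above (suspected process-lifetime leak).
qt_rec = {"max_live_bytes": 32_512, "max_live_blocks": 254,
          "tot_bytes": 32_512, "tot_blocks": 254, "avg_age": 300_467_389}
# The record from the simple example (blocks are steadily recycled).
tcc_rec = {"max_live_bytes": 63_490, "max_live_blocks": 984,
           "tot_bytes": 1_904_700, "tot_blocks": 29_520, "avg_age": 22_227_424}

print(looks_like_lifetime_leak(qt_rec, guest_insns=418_901_537))
print(looks_like_lifetime_leak(tcc_rec, guest_insns=1_045_339_534))
```
</pre>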
</div>
<div class="sect2" title="10.2.2.Interpreting the acc-ratios fields">
<div class="titlepage"><div><div><h3 class="title">
<a name="id534260"></a>10.2.2. Interpreting the acc-ratios fields</h3></div></div></div>
<div class="sect3" title="10.2.2.1.A fairly harmless allocation point record"><div class="titlepage"><div><div><h4 class="title">
<a name="id534267"></a>10.2.2.1. A fairly harmless allocation point record</h4></div></div></div></div>
<pre class="screen">
   max-live:    49,398 in 808 blocks
   tot-alloc:   1,481,940 in 24,240 blocks (avg size 61.13)
   deaths:      24,240, at avg age 34,611,026
   acc-ratios:  2.13 rd, 0.91 wr  (3,166,650 b-read, 1,358,820 b-written)
      at 0x4C275B8: malloc (vg_replace_malloc.c:236)
      by 0x40350E: tcc_malloc (tinycc.c:6712)
      by 0x404580: tok_alloc_new (tinycc.c:7151)
      by 0x4046C4: tok_alloc (tinycc.c:7190)
</pre>
<p>The acc-ratios field tells us that each byte in the blocks
allocated here is read an average of 2.13 times before the block is
deallocated.  Given that the blocks have an average age at death of
34,611,026 instructions, that's roughly one read of each byte every
16 million instructions.  So from that standpoint the blocks aren't
"working" very hard.</p>
<p>More interesting is the write ratio: each byte is written an
average of 0.91 times.  This tells us that some parts of the allocated
blocks are never written: at least 8% of the bytes on average, since
1,358,820 bytes were written against 1,481,940 allocated.  To completely
initialise the block would require writing each byte at least once,
and that would give a write ratio of at least 1.0.  The fact that some
block areas are evidently unused might point to data alignment holes or
other layout inefficiencies.</p>
<p>Well, at least all the blocks are freed (24,240 allocations,
24,240 deaths).</p>
<p>If all the blocks had been the same size, DHAT would also show
the access counts by block offset, so we could see where exactly these
unused areas are.  However, that isn't the case: the blocks have
varying sizes, so DHAT can't perform such an analysis.  We can see
that they must have varying sizes since the average block size, 61.13,
isn't a whole number.</p>
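<p>The ratios and the bound on unwritten bytes follow directly from
the byte counts in the record.  The sketch below uses hypothetical
helper names and works from the raw byte counts rather than the
two-decimal ratios, so its results can differ slightly from figures
read off the displayed ratios (the displayed values appear to be
truncated rather than rounded).</p>
<pre class="programlisting">
```python
def access_ratios(tot_bytes, bytes_read, bytes_written):
    """Return (read ratio, write ratio) for one allocation point."""
    return bytes_read / tot_bytes, bytes_written / tot_bytes


def min_unwritten_fraction(write_ratio):
    """Lower bound on the fraction of bytes that were never written.

    Every byte written at least once contributes at least 1 to the ratio,
    so a ratio below 1.0 leaves at least (1 - ratio) of the bytes unwritten.
    """
    return max(0.0, 1.0 - write_ratio)


# Figures copied from the record above.
rd, wr = access_ratios(1_481_940, 3_166_650, 1_358_820)
insns_per_read = 34_611_026 / rd

print(f"read ratio:  {rd:.2f}")
print(f"write ratio: {wr:.2f}")
print(f"never written: at least {100 * min_unwritten_fraction(wr):.0f}% of bytes")
print(f"one read of each byte about every {insns_per_read / 1e6:.0f} million insns")
```
</pre>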
<div class="sect3" title="10.2.2.2.A more suspicious looking example"><div class="titlepage"><div><div><h4 class="title">
<a name="id573354"></a>10.2.2.2. A more suspicious looking example</h4></div></div></div></div>
<pre class="screen">
   max-live:    180,224 in 22 blocks
   tot-alloc:   180,224 in 22 blocks (avg size 8192.00)
   deaths:      none (none of these blocks were freed)
   acc-ratios:  0.00 rd, 0.00 wr  (0 b-read, 0 b-written)
      at 0x4C275B8: malloc (vg_replace_malloc.c:236)
      by 0x40350E: tcc_malloc (tinycc.c:6712)
      by 0x40369C: __sym_malloc (tinycc.c:6787)
      by 0x403711: sym_malloc (tinycc.c:6805)
</pre>
<p>Here, both the read and write access ratios are zero.  Hence
this point is allocating blocks which are never used, neither read nor
written.  Indeed, they are also not freed ("deaths: none") and are
simply leaked.  So, here is roughly 180,000 bytes of completely useless
allocation that could be removed.</p>
<p>Re-running with Memcheck does indeed report the same leak.  What
DHAT can tell us, that Memcheck can't, is that not only are the blocks
leaked, they are also never used.</p>
<div class="sect3" title="10.2.2.3.Another suspicious example"><div class="titlepage"><div><div><h4 class="title">
<a name="id566577"></a>10.2.2.3. Another suspicious example</h4></div></div></div></div>
<p>Here's one where blocks are allocated, written to,
but never read from.  We see this immediately from the zero read
access ratio.  They do get freed, though:</p>
<pre class="screen">
   max-live:    54 in 3 blocks
   tot-alloc:   1,620 in 90 blocks (avg size 18.00)
   deaths:      90, at avg age 34,558,236
   acc-ratios:  0.00 rd, 1.11 wr  (0 b-read, 1,800 b-written)
      at 0x4C275B8: malloc (vg_replace_malloc.c:236)
      by 0x40350E: tcc_malloc (tinycc.c:6712)
      by 0x4035BD: tcc_strdup (tinycc.c:6750)
      by 0x41FEBB: tcc_add_sysinclude_path (tinycc.c:20931)
</pre>
<p>In the previous two examples, it is easy to see blocks that are
never written to, or never read from, or some combination of both.
Unfortunately, in C++ code, the situation is less clear.  That's
because an object's constructor will write to the underlying block,
and its destructor will read from it.  So the block's read and write
ratios will be non-zero even if the object, once constructed, is never
used, but only eventually destructed.</p>
<p>Really, what we want is to measure only memory accesses in
between the end of an object's construction and the start of its
destruction.  Unfortunately I do not know of a reliable way to
determine when those transitions are made.</p>
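<p>Setting the C++ caveat aside, the zero-ratio checks used in the
two examples above can be written down as a small classifier.  This is
a sketch only: the function name and the labels it returns are invented
for this example and do not correspond to anything DHAT prints.</p>
<pre class="programlisting">
```python
def classify(bytes_read, bytes_written, deaths):
    """Label an allocation point using the zero-ratio checks shown above.

    The labels are informal names invented for this sketch.
    """
    if bytes_read == 0 and bytes_written == 0:
        label = "never used"
    elif bytes_read == 0:
        label = "written but never read"
    elif bytes_written == 0:
        label = "read but never written"
    else:
        label = "read and written"
    if deaths == 0:
        # No block from this point was ever freed.
        label += ", never freed"
    return label


# Byte counts and deaths copied from the example records in this section.
print(classify(bytes_read=0, bytes_written=0, deaths=0))
print(classify(bytes_read=0, bytes_written=1_800, deaths=90))
print(classify(bytes_read=3_166_650, bytes_written=1_358_820, deaths=24_240))
```
</pre>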
</div>
<div class="sect2" title='10.2.3.Interpreting "Aggregated access counts by offset" data'>
<div class="titlepage"><div><div><h3 class="title">
<a name="id534998"></a>10.2.3. Interpreting "Aggregated access counts by offset" data</h3></div></div></div>
<p>For allocation points that always allocate blocks of the same
size, and which are 4096 bytes or smaller, DHAT counts accesses
per offset, for example:</p>
<pre class="screen">
   max-live:    317,408 in 5,668 blocks
   tot-alloc:   317,408 in 5,668 blocks (avg size 56.00)
   deaths:      5,668, at avg age 622,890,597
   acc-ratios:  1.03 rd, 1.28 wr  (327,642 b-read, 408,172 b-written)
      at 0x4C275B8: malloc (vg_replace_malloc.c:236)
      by 0x5440C16: QDesignerPropertySheetPrivate::ensureInfo (qhash.h:515)
      by 0x544350B: QDesignerPropertySheet::setVisible (qdesigner_propertysh...)
      by 0x5446232: QDesignerPropertySheet::QDesignerPropertySheet (qdesigne...)

   Aggregated access counts by offset:

   [   0]  28782 28782 28782 28782 28782 28782 28782 28782
   [   8]  20638 20638 20638 20638 0 0 0 0
   [  16]  22738 22738 22738 22738 22738 22738 22738 22738
   [  24]  6013 6013 6013 6013 6013 6013 6013 6013
   [  32]  18883 18883 18883 37422 0 0 0 0
   [  40]  5668 11915 5668 5668 11336 11336 11336 11336
   [  48]  6166 6166 6166 6166 0 0 0 0
</pre>
<p>This is fairly typical, for C++ code running on a 64-bit
platform.  Here, we have aggregated access statistics for 5668 blocks,
all of size 56 bytes.  Each byte has been accessed at least 5668
times, except for offsets 12--15, 36--39 and 52--55.  These are likely
to be alignment holes.</p>
<p>Careful interpretation of the numbers reveals useful information.
Groups of N consecutive identical numbers that begin at an N-aligned
offset, for N being 2, 4 or 8, are likely to indicate an N-byte object
in the structure at that point.  For example, the first 32 bytes of
this object are likely to have the layout</p>
<pre class="screen">
   [0 ]  64-bit type
   [8 ]  32-bit type
   [12]  32-bit alignment hole
   [16]  64-bit type
   [24]  64-bit type
</pre>
<p>As a counterexample, it's also clear that, whatever is at offset 32,
it is not a 32-bit value.  That's because the last number of the group
(37422) is not the same as the first three (18883 18883 18883).</p>
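<p>The grouping rule described above is mechanical enough to sketch
in code.  The function below is a heuristic written for this
illustration, not DHAT functionality: its name, its return format and
the "field", "zero" and "unknown" labels are all assumptions.  It tries
the largest group size first and falls back to single bytes where no
aligned run of identical counts begins.</p>
<pre class="programlisting">
```python
def guess_fields(counts):
    """Guess N-byte fields (N = 8, 4, 2) and untouched gaps from access counts.

    counts[i] is the aggregated access count for byte offset i.  Returns a
    list of (offset, size, kind) tuples, where kind is "field", "zero" or
    "unknown".  A heuristic sketch only.
    """
    fields = []
    offset = 0
    while offset != len(counts):
        for size in (8, 4, 2):
            group = counts[offset:offset + size]
            aligned = offset % size == 0
            if aligned and len(group) == size and len(set(group)) == 1:
                kind = "zero" if group[0] == 0 else "field"
                fields.append((offset, size, kind))
                offset += size
                break
        else:
            # No aligned run of identical counts starts at this offset.
            fields.append((offset, 1, "unknown"))
            offset += 1
    return fields


# The seven rows of eight counts from the record above, in offset order.
counts = ([28782] * 8 + [20638] * 4 + [0] * 4 + [22738] * 8 + [6013] * 8
          + [18883, 18883, 18883, 37422] + [0] * 4
          + [5668, 11915, 5668, 5668] + [11336] * 4
          + [6166] * 4 + [0] * 4)

for offset, size, kind in guess_fields(counts):
    print(f"[{offset:2}]  {size * 8:2}-bit {kind}")
```
</pre>
<p>For the counts above, this reproduces the guessed layout of the
first 32 bytes and does not report a 32-bit field at offset 32.  As
with the manual reading, the result is only a guess based on one
run.</p>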
<p>This example leads one to enquire (by reading the source code)
whether the zeroes at 12--15 and 52--55 are alignment holes, and
whether 48--51 is indeed a 32-bit type.  If so, it might be possible
to place what's at 48--51 at 12--15 instead, which would reduce
the object size from 56 to 48 bytes.</p>
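<p>The effect of such a reordering can be checked without touching
the real source by modelling the structure with Python's
<code class="literal">ctypes</code> module, which follows the
platform's native alignment rules.  The field names and types below
are hypothetical (in particular, the members placed at offsets 32 and
40 are placeholders), and the offsets in the comments assume a
platform where 64-bit integers are 8-byte aligned.</p>
<pre class="programlisting">
```python
import ctypes


class Original(ctypes.Structure):
    """Hypothetical 56-byte layout loosely matching the counts above."""
    _fields_ = [
        ("a", ctypes.c_uint64),      # offset 0
        ("b", ctypes.c_uint32),      # offset 8, then a 4-byte hole at 12
        ("c", ctypes.c_uint64),      # offset 16
        ("d", ctypes.c_uint64),      # offset 24
        ("e", ctypes.c_uint8 * 4),   # offset 32, then a 4-byte hole at 36
        ("f", ctypes.c_uint64),      # offset 40
        ("g", ctypes.c_uint32),      # offset 48, then 4 bytes of tail padding
    ]


class Reordered(ctypes.Structure):
    """The same fields, with g moved into the hole after b."""
    _fields_ = [
        ("a", ctypes.c_uint64),
        ("b", ctypes.c_uint32),
        ("g", ctypes.c_uint32),      # now occupies offsets 12 to 15
        ("c", ctypes.c_uint64),
        ("d", ctypes.c_uint64),
        ("e", ctypes.c_uint8 * 4),
        ("f", ctypes.c_uint64),
    ]


print("original: ", ctypes.sizeof(Original), "bytes, g at", Original.g.offset)
print("reordered:", ctypes.sizeof(Reordered), "bytes, g at", Reordered.g.offset)
```
</pre>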
<p>Bear in mind that the above inferences are all only "maybes".  That's
because they are based on dynamic data, not static analysis of the
object layout.  For example, the zeroes might not be alignment
holes, but rather just parts of the structure which were not used
at all for this particular run.  Experience shows that's unlikely
to be the case, but it could happen.</p>
</div>
</div>
<div class="sect1" title="10.3.DHAT Command-line Options">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="dh-manual.options"></a>10.3. DHAT Command-line Options</h2></div></div></div>
<p>DHAT-specific command-line options are:</p>
<div class="variablelist">
<a name="dh.opts.list"></a><dl>
<dt>
<a name="opt.show-top-n"></a><span class="term">
      <code class="option">--show-top-n=&lt;number&gt;
      [default: 10] </code>
    </span>
</dt>
<dd><p>At the end of the run, DHAT sorts the accumulated
       allocation points according to some metric, and shows the
       highest scoring entries.  <code class="varname">--show-top-n</code>
       controls how many entries are shown.  The default of 10 is
       quite small.  For realistic applications you will probably need
       to set it much higher, at least several hundred.</p></dd>
<dt>
<a name="opt.sort-by"></a><span class="term">
      <code class="option">--sort-by=&lt;string&gt; [default: max-bytes-live] </code>
    </span>
</dt>
<dd>
<p>At the end of the run, DHAT sorts the accumulated
       allocation points according to some metric, and shows the
       highest scoring entries.  <code class="varname">--sort-by</code>
       selects the metric used for sorting:</p>
<p><code class="varname">max-bytes-live   </code>    maximum live bytes [default]</p>
<p><code class="varname">tot-bytes-allocd </code>  total allocation (turnover)</p>
<p><code class="varname">max-blocks-live  </code>   maximum live blocks</p>
<p>This controls the order in which allocation points are
       displayed.  You can choose to look at allocation points with
       the highest maximum liveness, the highest total turnover, or
       the highest number of live blocks.  These give usefully
       different pictures of program behaviour.  For example, sorting
       by maximum live blocks tends to show up allocation points
       creating large numbers of small objects.</p>
</dd>
</dl>
</div>
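<p>To see how the two options interact, the following sketch ranks a
handful of records under each metric.  It only imitates the effect of
the options and is not DHAT's implementation: the record field names
and the <code class="function">top_n</code> helper are assumptions.  The
figures are taken from the example records in this chapter, which come
from different programs, so the mix is purely illustrative.</p>
<pre class="programlisting">
```python
# Maps each --sort-by value to the (hypothetical) record field it ranks by.
SORT_KEYS = {
    "max-bytes-live": "max_live_bytes",
    "tot-bytes-allocd": "tot_bytes",
    "max-blocks-live": "max_live_blocks",
}


def top_n(records, sort_by="max-bytes-live", n=10):
    """Return the n highest-scoring records for the chosen metric."""
    field = SORT_KEYS[sort_by]
    ranked = sorted(records, key=lambda rec: rec[field], reverse=True)
    return ranked[:n]


records = [
    {"name": "tok_alloc_new", "max_live_bytes": 63_490,
     "tot_bytes": 1_904_700, "max_live_blocks": 984},
    {"name": "tok_alloc", "max_live_bytes": 49_398,
     "tot_bytes": 1_481_940, "max_live_blocks": 808},
    {"name": "sym_malloc", "max_live_bytes": 180_224,
     "tot_bytes": 180_224, "max_live_blocks": 22},
    {"name": "ensureInfo", "max_live_bytes": 317_408,
     "tot_bytes": 317_408, "max_live_blocks": 5_668},
]

for metric in SORT_KEYS:
    print(metric, [rec["name"] for rec in top_n(records, metric, n=2)])
```
</pre>
<p>Each metric puts a different pair of allocation points at the top,
which is why it is often worth looking at the output under more than
one sort order.</p>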
<p>One important point to note is that each allocation stack counts
as a separate allocation point.  Because stacks by default have 12
frames, this tends to spread data out over multiple allocation points.
You may want to use the option <code class="option">--num-callers=4</code>,
or a similarly small number, to reduce the spreading.</p>
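<p>Putting the options together, a script that drives DHAT might
assemble its command line as follows.  This is a sketch under stated
assumptions: the helper name and the program path
<code class="literal">./myprog</code> are hypothetical, and only options
mentioned in this chapter are used.  The list it returns is suitable
for passing to <code class="literal">subprocess.run</code>.</p>
<pre class="programlisting">
```python
import shlex


def dhat_command(program, top_n=10, sort_by="max-bytes-live", num_callers=None):
    """Build a Valgrind command line for DHAT as an argument list.

    Only options named in this chapter are used.  `program` is the path of
    the (hypothetical) executable to analyse.
    """
    argv = [
        "valgrind",
        "--tool=exp-dhat",
        f"--show-top-n={top_n}",
        f"--sort-by={sort_by}",
    ]
    if num_callers is not None:
        # Fewer stack frames merge allocation points that differ only in
        # their outer callers.
        argv.append(f"--num-callers={num_callers}")
    argv.append(program)
    return argv


argv = dhat_command("./myprog", top_n=200, num_callers=4)
print(shlex.join(argv))
```
</pre>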
</div>
</div>
<div>
<br><table class="nav" width="100%" cellspacing="3" cellpadding="2" border="0" summary="Navigation footer">
<tr>
<td rowspan="2" width="40%" align="left">
<a accesskey="p" href="ms-manual.html">&lt;&lt; 9. Massif: a heap profiler</a></td>
<td width="20%" align="center"><a accesskey="u" href="manual.html">Up</a></td>
<td rowspan="2" width="40%" align="right"><a accesskey="n" href="pc-manual.html">11. Ptrcheck: an experimental heap, stack and global array overrun detector &gt;&gt;</a>
</td>
</tr>
<tr><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td></tr>
</table>
</div>
</body>
</html>