<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>4. Memcheck: a memory error detector</title>
<link rel="stylesheet" href="vg_basic.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.75.2">
<link rel="home" href="index.html" title="Valgrind Documentation">
<link rel="up" href="manual.html" title="Valgrind User Manual">
<link rel="prev" href="manual-core-adv.html" title="3.Using and understanding the Valgrind core: Advanced Topics">
<link rel="next" href="cg-manual.html" title="5.Cachegrind: a cache and branch-prediction profiler">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div><table class="nav" width="100%" cellspacing="3" cellpadding="3" border="0" summary="Navigation header"><tr>
<td width="22px" align="center" valign="middle"><a accesskey="p" href="manual-core-adv.html"><img src="images/prev.png" width="18" height="21" border="0" alt="Prev"></a></td>
<td width="25px" align="center" valign="middle"><a accesskey="u" href="manual.html"><img src="images/up.png" width="21" height="18" border="0" alt="Up"></a></td>
<td width="31px" align="center" valign="middle"><a accesskey="h" href="index.html"><img src="images/home.png" width="27" height="20" border="0" alt="Home"></a></td>
<th align="center" valign="middle">Valgrind User Manual</th>
<td width="22px" align="center" valign="middle"><a accesskey="n" href="cg-manual.html"><img src="images/next.png" width="18" height="21" border="0" alt="Next"></a></td>
</tr></table></div>
<div class="chapter" title="4.Memcheck: a memory error detector">
<div class="titlepage"><div><div><h2 class="title">
<a name="mc-manual"></a>4. Memcheck: a memory error detector</h2></div></div></div>
<div class="toc">
<p><b>Table of Contents</b></p>
<dl>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.overview">4.1.
Overview</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.errormsgs">4.2. Explanation of error messages from Memcheck</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.badrw">4.2.1. Illegal read / Illegal write errors</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.uninitvals">4.2.2. Use of uninitialised values</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.bad-syscall-args">4.2.3. Use of uninitialised or unaddressable values in system
calls</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.badfrees">4.2.4. Illegal frees</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.rudefn">4.2.5. When a heap block is freed with an inappropriate deallocation
function</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.overlap">4.2.6. Overlapping source and destination blocks</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.leaks">4.2.7. Memory leak detection</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.options">4.3. Memcheck Command-Line Options</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.suppfiles">4.4. Writing suppression files</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.machine">4.5. Details of Memcheck's checking machinery</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.value">4.5.1. Valid-value (V) bits</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.vaddress">4.5.2. Valid-address (A) bits</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.together">4.5.3. Putting it all together</a></span></dt>
</dl></dd>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.clientreqs">4.6.
Client Requests</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.mempools">4.7. Memory Pools: describing and working with custom allocators</a></span></dt>
<dt><span class="sect1"><a href="mc-manual.html#mc-manual.mpiwrap">4.8. Debugging MPI Parallel Programs with Valgrind</a></span></dt>
<dd><dl>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.build">4.8.1. Building and installing the wrappers</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.gettingstarted">4.8.2. Getting started</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.controlling">4.8.3. Controlling the wrapper library</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.limitations.functions">4.8.4. Functions</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.limitations.types">4.8.5. Types</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.writingwrappers">4.8.6. Writing new wrappers</a></span></dt>
<dt><span class="sect2"><a href="mc-manual.html#mc-manual.mpiwrap.whattoexpect">4.8.7. What to expect when using the wrappers</a></span></dt>
</dl></dd>
</dl>
</div>
<p>To use this tool, you may specify <code class="option">--tool=memcheck</code>
on the Valgrind command line. You don't have to, though, since Memcheck
is the default tool.</p>
<div class="sect1" title="4.1.Overview">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.overview"></a>4.1. Overview</h2></div></div></div>
<p>Memcheck is a memory error detector. It can detect the following
problems that are common in C and C++ programs.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Accessing memory you shouldn't, e.g.
overrunning and underrunning
heap blocks, overrunning the top of the stack, and accessing memory after
it has been freed.</p></li>
<li class="listitem"><p>Using undefined values, i.e. values that have not been initialised,
or that have been derived from other undefined values.</p></li>
<li class="listitem"><p>Incorrect freeing of heap memory, such as double-freeing heap
blocks, or mismatched use of
<code class="function">malloc</code>/<code class="computeroutput">new</code>/<code class="computeroutput">new[]</code>
versus
<code class="function">free</code>/<code class="computeroutput">delete</code>/<code class="computeroutput">delete[]</code>.</p></li>
<li class="listitem"><p>Overlapping <code class="computeroutput">src</code> and
<code class="computeroutput">dst</code> pointers in
<code class="computeroutput">memcpy</code> and related
functions.</p></li>
<li class="listitem"><p>Memory leaks.</p></li>
</ul></div>
<p>Problems like these can be difficult to find by other means,
often remaining undetected for long periods, then causing occasional,
difficult-to-diagnose crashes.</p>
</div>
<div class="sect1" title="4.2.Explanation of error messages from Memcheck">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.errormsgs"></a>4.2. Explanation of error messages from Memcheck</h2></div></div></div>
<p>Memcheck issues a range of error messages. This section presents a
quick summary of what error messages mean.
The precise behaviour of the
error-checking machinery is described in <a class="xref" href="mc-manual.html#mc-manual.machine" title="4.5.Details of Memcheck's checking machinery">Details of Memcheck's checking machinery</a>.</p>
<div class="sect2" title="4.2.1.Illegal read / Illegal write errors">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.badrw"></a>4.2.1. Illegal read / Illegal write errors</h3></div></div></div>
<p>For example:</p>
<pre class="programlisting">
Invalid read of size 4
   at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9)
   by 0x40B07FF4: read_png_image(QImageIO *) (kernel/qpngio.cpp:326)
   by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621)
 Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd
</pre>
<p>This happens when your program reads or writes memory at a place
which Memcheck reckons it shouldn't. In this example, the program did a
4-byte read at address 0xBFFFF0E0, somewhere within the system-supplied
library libpng.so.2.1.0.9, which was called from somewhere else in the
same library, called from line 326 of <code class="filename">qpngio.cpp</code>,
and so on.</p>
<p>Memcheck tries to establish what the illegal address might relate
to, since that's often useful. So, if it points into a block of memory
which has already been freed, you'll be informed of this, and also where
the block was freed. Likewise, if it should turn out to be just off
the end of a heap block, a common result of off-by-one errors in
array subscripting, you'll be informed of this fact, and also where the
block was allocated.
If you use the <code class="option"><a class="xref" href="manual-core.html#opt.read-var-info">--read-var-info</a></code> option, Memcheck will run more slowly
but may give a more detailed description of any illegal address.</p>
<p>In this example, Memcheck can't identify the address. Actually
the address is on the stack, but, for some reason, this is not a valid
stack address -- it is below the stack pointer and that isn't allowed.
In this particular case it's probably caused by GCC generating invalid
code, a known bug in some ancient versions of GCC.</p>
<p>Note that Memcheck only tells you that your program is about to
access memory at an illegal address. It can't stop the access from
happening. So, if your program makes an access which normally would
result in a segmentation fault, your program will still suffer the same
fate -- but you will get a message from Memcheck immediately prior to
this. In this particular example, reading junk on the stack is
non-fatal, and the program stays alive.</p>
</div>
<div class="sect2" title="4.2.2.Use of uninitialised values">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.uninitvals"></a>4.2.2. Use of uninitialised values</h3></div></div></div>
<p>For example:</p>
<pre class="programlisting">
Conditional jump or move depends on uninitialised value(s)
   at 0x402DFA94: _IO_vfprintf (_itoa.h:49)
   by 0x402E8476: _IO_printf (printf.c:36)
   by 0x8048472: main (tests/manuel1.c:8)
</pre>
<p>An uninitialised-value use error is reported when your program
uses a value which hasn't been initialised -- in other words, is
undefined. Here, the undefined value is used somewhere inside the
<code class="function">printf</code> machinery of the C library.
This error was
reported when running the following small program:</p>
<pre class="programlisting">
#include &lt;stdio.h&gt;

int main()
{
  int x;
  printf ("x = %d\n", x);
}</pre>
<p>It is important to understand that your program can copy around
junk (uninitialised) data as much as it likes. Memcheck observes this
and keeps track of the data, but does not complain. A complaint is
issued only when your program attempts to make use of uninitialised
data in a way that might affect your program's externally-visible behaviour.
In this example, <code class="varname">x</code> is uninitialised. Memcheck observes
the value being passed to <code class="function">_IO_printf</code> and thence to
<code class="function">_IO_vfprintf</code>, but makes no comment. However,
<code class="function">_IO_vfprintf</code> has to examine the value of
<code class="varname">x</code> so it can turn it into the corresponding ASCII string,
and it is at this point that Memcheck complains.</p>
<p>Sources of uninitialised data tend to be:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Local variables in procedures which have not been initialised,
as in the example above.</p></li>
<li class="listitem"><p>The contents of heap blocks (allocated with
<code class="function">malloc</code>, <code class="function">new</code>, or a similar
function) before you (or a constructor) write something there.
</p></li>
</ul></div>
<p>To see information on the sources of uninitialised data in your
program, use the <code class="option">--track-origins=yes</code> option.
This
makes Memcheck run more slowly, but can make it much easier to track down
the root causes of uninitialised value errors.</p>
</div>
<div class="sect2" title="4.2.3.Use of uninitialised or unaddressable values in system calls">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.bad-syscall-args"></a>4.2.3. Use of uninitialised or unaddressable values in system
calls</h3></div></div></div>
<p>Memcheck checks all parameters to system calls:
</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>It checks all the direct parameters themselves, whether they are
initialised.</p></li>
<li class="listitem"><p>Also, if a system call needs to read from a buffer provided by
your program, Memcheck checks that the entire buffer is addressable
and its contents are initialised.</p></li>
<li class="listitem"><p>Also, if the system call needs to write to a user-supplied
buffer, Memcheck checks that the buffer is addressable.</p></li>
</ul></div>
<p>
</p>
<p>After the system call, Memcheck updates its tracked information to
precisely reflect any changes in memory state caused by the system
call.</p>
<p>Here's an example of two system calls with invalid parameters:</p>
<pre class="programlisting">
#include &lt;stdlib.h&gt;
#include &lt;unistd.h&gt;
int main( void )
{
  char* arr  = malloc(10);
  int*  arr2 = malloc(sizeof(int));
  write( 1 /* stdout */, arr, 10 );
  exit(arr2[0]);
}
</pre>
<p>You get these complaints ...</p>
<pre class="programlisting">
  Syscall param write(buf) points to uninitialised byte(s)
     at 0x25A48723: __write_nocancel (in /lib/tls/libc-2.3.3.so)
     by 0x259AFAD3: __libc_start_main (in /lib/tls/libc-2.3.3.so)
     by 0x8048348: (within /auto/homes/njn25/grind/head4/a.out)
   Address 0x25AB8028 is 0 bytes inside a block of size 10 alloc'd
     at 0x259852B0: malloc (vg_replace_malloc.c:130)
     by 0x80483F1: main (a.c:5)

  Syscall param exit(error_code) contains uninitialised byte(s)
     at 0x25A21B44: __GI__exit (in /lib/tls/libc-2.3.3.so)
     by 0x8048426: main (a.c:8)
</pre>
<p>... because the program has (a) written uninitialised junk
from the heap block to the standard output, and (b) passed an
uninitialised value to <code class="function">exit</code>. Note that the first
error refers to the memory pointed to by
<code class="computeroutput">buf</code> (not
<code class="computeroutput">buf</code> itself), but the second error
refers directly to <code class="computeroutput">exit</code>'s argument
<code class="computeroutput">arr2[0]</code>.</p>
</div>
<div class="sect2" title="4.2.4.Illegal frees">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.badfrees"></a>4.2.4. Illegal frees</h3></div></div></div>
<p>For example:</p>
<pre class="programlisting">
Invalid free()
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
 Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd
   at 0x4004FFDF: free (vg_clientmalloc.c:577)
   by 0x80484C7: main (tests/doublefree.c:10)
</pre>
<p>Memcheck keeps track of the blocks allocated by your program
with <code class="function">malloc</code>/<code class="computeroutput">new</code>,
so it knows exactly whether the argument to
<code class="function">free</code>/<code class="computeroutput">delete</code> is
legitimate. Here, this test program has freed the same block
twice. As with the illegal read/write errors, Memcheck attempts to
make sense of the address freed. If, as here, the address is one
which has previously been freed, you will be told that -- making
duplicate frees of the same block easy to spot.
You will also get this
message if you try to free a pointer that doesn't point to the start of a
heap block.</p>
</div>
<div class="sect2" title="4.2.5.When a heap block is freed with an inappropriate deallocation function">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.rudefn"></a>4.2.5. When a heap block is freed with an inappropriate deallocation
function</h3></div></div></div>
<p>In the following example, a block allocated with
<code class="function">new[]</code> has wrongly been deallocated with
<code class="function">free</code>:</p>
<pre class="programlisting">
Mismatched free() / delete / delete []
   at 0x40043249: free (vg_clientfuncs.c:171)
   by 0x4102BB4E: QGArray::~QGArray(void) (tools/qgarray.cpp:149)
   by 0x4C261C41: PptDoc::~PptDoc(void) (include/qmemarray.h:60)
   by 0x4C261F0E: PptXml::~PptXml(void) (pptxml.cc:44)
 Address 0x4BB292A8 is 0 bytes inside a block of size 64 alloc'd
   at 0x4004318C: operator new[](unsigned int) (vg_clientfuncs.c:152)
   by 0x4C21BC15: KLaola::readSBStream(int) const (klaola.cc:314)
   by 0x4C21C155: KLaola::stream(KLaola::OLENode const *) (klaola.cc:416)
   by 0x4C21788F: OLEFilter::convert(QCString const &amp;) (olefilter.cc:272)
</pre>
<p>In <code class="literal">C++</code> it's important to deallocate memory in a
way compatible with how it was allocated.
The deal is:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>If allocated with
<code class="function">malloc</code>,
<code class="function">calloc</code>,
<code class="function">realloc</code>,
<code class="function">valloc</code> or
<code class="function">memalign</code>, you must
deallocate with <code class="function">free</code>.</p></li>
<li class="listitem"><p>If allocated with <code class="function">new</code>, you must deallocate
with <code class="function">delete</code>.</p></li>
<li class="listitem"><p>If allocated with <code class="function">new[]</code>, you must
deallocate with <code class="function">delete[]</code>.</p></li>
</ul></div>
<p>The worst thing is that on Linux apparently it doesn't matter if
you do mix these up, but the same program may then crash on a
different platform, Solaris for example. So it's best to fix it
properly. According to the KDE folks "it's amazing how many C++
programmers don't know this".</p>
<p>The reason behind the requirement is as follows. In some C++
implementations, <code class="function">delete[]</code> must be used for
objects allocated by <code class="function">new[]</code> because the compiler
stores the size of the array and the pointer-to-member to the
destructor of the array's content just before the pointer actually
returned.
<code class="function">delete</code> doesn't account for this and will get
confused, possibly corrupting the heap.</p>
</div>
<div class="sect2" title="4.2.6.Overlapping source and destination blocks">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.overlap"></a>4.2.6. Overlapping source and destination blocks</h3></div></div></div>
<p>The following C library functions copy some data from one
memory block to another (or something similar):
<code class="function">memcpy</code>,
<code class="function">strcpy</code>,
<code class="function">strncpy</code>,
<code class="function">strcat</code>,
<code class="function">strncat</code>.
The blocks pointed to by their <code class="computeroutput">src</code> and
<code class="computeroutput">dst</code> pointers aren't allowed to overlap.
The POSIX standards have wording along the lines of "If copying takes place
between objects that overlap, the behavior is undefined." Therefore,
Memcheck checks for this.
</p>
<p>For example:</p>
<pre class="programlisting">
==27492== Source and destination overlap in memcpy(0xbffff294, 0xbffff280, 21)
==27492==    at 0x40026CDC: memcpy (mc_replace_strmem.c:71)
==27492==    by 0x804865A: main (overlap.c:40)
</pre>
<p>You don't want the two blocks to overlap because one of them could
get partially overwritten by the copying.</p>
<p>You might think that Memcheck is being overly pedantic reporting
this in the case where <code class="computeroutput">dst</code> is less than
<code class="computeroutput">src</code>. For example, the obvious way to
implement <code class="function">memcpy</code> is by copying from the first
byte to the last. However, the optimisation guides of some
architectures recommend copying from the last byte down to the first.
Also, some implementations of <code class="function">memcpy</code> zero
<code class="computeroutput">dst</code> before copying, because zeroing the
destination's cache line(s) can improve performance.</p>
<p>The moral of the story is: if you want to write truly portable
code, don't make any assumptions about the language
implementation.</p>
</div>
<div class="sect2" title="4.2.7.Memory leak detection">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.leaks"></a>4.2.7. Memory leak detection</h3></div></div></div>
<p>Memcheck keeps track of all heap blocks issued in response to
calls to
<code class="function">malloc</code>/<code class="function">new</code> et al.
So when the program exits, it knows which blocks have not been freed.
</p>
<p>If <code class="option">--leak-check</code> is set appropriately, for each
remaining block, Memcheck determines if the block is reachable from pointers
within the root-set. The root-set consists of (a) general purpose registers
of all threads, and (b) initialised, aligned, pointer-sized data words in
accessible client memory, including stacks.</p>
<p>There are two ways a block can be reached. The first is with a
"start-pointer", i.e. a pointer to the start of the block. The second is with
an "interior-pointer", i.e. a pointer to the middle of the block. There are
three ways we know of that an interior-pointer can occur:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>The pointer might have originally been a start-pointer and have been
moved along deliberately (or not deliberately) by the program. In
particular, this can happen if your program uses tagged pointers, i.e.
if it uses the bottom one, two or three bits of a pointer, which are
normally always zero due to alignment, in order to store extra
information.</p></li>
<li class="listitem"><p>It might be a random junk value in memory, entirely unrelated, just
a coincidence.</p></li>
<li class="listitem"><p>It might be a pointer to an array of C++ objects (which possess
destructors) allocated with <code class="computeroutput">new[]</code>. In
this case, some compilers store a "magic cookie" containing the array
length at the start of the allocated block, and return a pointer to just
past that magic cookie, i.e. an interior-pointer.
See <a class="ulink" href="http://theory.uwinnipeg.ca/gnu/gcc/gxxint_14.html" target="_top">this
page</a> for more information.</p></li>
</ul></div>
<p>With that in mind, consider the nine possible cases described by the
following figure.</p>
<pre class="programlisting">
     Pointer chain            AAA Category    BBB Category
     -------------            ------------    ------------
(1)  RRR ------------&gt; BBB                    DR
(2)  RRR ---&gt; AAA ---&gt; BBB    DR              IR
(3)  RRR               BBB                    DL
(4)  RRR      AAA ---&gt; BBB    DL              IL
(5)  RRR ------?-----&gt; BBB                    (y)DR, (n)DL
(6)  RRR ---&gt; AAA -?-&gt; BBB    DR              (y)IR, (n)DL
(7)  RRR -?-&gt; AAA ---&gt; BBB    (y)DR, (n)DL    (y)IR, (n)IL
(8)  RRR -?-&gt; AAA -?-&gt; BBB    (y)DR, (n)DL    (y,y)IR, (n,y)IL, (_,n)DL
(9)  RRR      AAA -?-&gt; BBB    DL              (y)IL, (n)DL

Pointer chain legend:
- RRR: a root set node or DR block
- AAA, BBB: heap blocks
- ---&gt;: a start-pointer
- -?-&gt;: an interior-pointer

Category legend:
- DR: Directly reachable
- IR: Indirectly reachable
- DL: Directly lost
- IL: Indirectly lost
- (y)XY: it's XY if the interior-pointer is a real pointer
- (n)XY: it's XY if the interior-pointer is not a real pointer
- (_)XY: it's XY in either case
</pre>
<p>Every possible case can be reduced to one of the above nine.
Memcheck
merges some of these cases in its output, resulting in the following four
categories.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>"Still reachable". This covers cases 1 and 2 (for the BBB blocks)
above. A start-pointer or chain of start-pointers to the block is
found. Since the block is still pointed at, the programmer could, at
least in principle, have freed it before program exit. Because these
are very common and arguably not a problem, Memcheck won't report such
blocks individually unless <code class="option">--show-reachable=yes</code> is
specified.</p></li>
<li class="listitem"><p>"Definitely lost". This covers case 3 (for the BBB blocks) above.
This means that no pointer to the block can be found. The block is
classified as "lost", because the programmer could not possibly have
freed it at program exit, since no pointer to it exists. This is likely
a symptom of having lost the pointer at some earlier point in the
program. Such cases should be fixed by the programmer.</p></li>
<li class="listitem"><p>"Indirectly lost". This covers cases 4 and 9 (for the BBB blocks)
above. This means that the block is lost, not because there are no
pointers to it, but rather because all the blocks that point to it are
themselves lost. For example, if you have a binary tree and the root
node is lost, all its children nodes will be indirectly lost. Because
the problem will disappear if the definitely lost block that caused the
indirect leak is fixed, Memcheck won't report such blocks individually
unless <code class="option">--show-reachable=yes</code> is specified.</p></li>
<li class="listitem"><p>"Possibly lost". This covers cases 5--8 (for the BBB blocks)
above. This means that a chain of one or more pointers to the block has
been found, but at least one of the pointers is an interior-pointer.
This could just be a random value in memory that happens to point into a
block, and so you shouldn't consider this OK unless you know you have
interior-pointers.</p></li>
</ul></div>
<p>(Note: This mapping of the nine possible cases onto four categories is
not necessarily the best way that leaks could be reported; in particular,
interior-pointers are treated inconsistently. It is possible the
categorisation may be improved in the future.)</p>
<p>Furthermore, if a suppression exists for a block, it will be reported
as "suppressed" no matter which of the above four categories it belongs
to.</p>
<p>The following is an example leak summary.</p>
<pre class="programlisting">
LEAK SUMMARY:
   definitely lost: 48 bytes in 3 blocks.
   indirectly lost: 32 bytes in 2 blocks.
     possibly lost: 96 bytes in 6 blocks.
   still reachable: 64 bytes in 4 blocks.
        suppressed: 0 bytes in 0 blocks.
</pre>
<p>If <code class="option">--leak-check=full</code> is specified,
Memcheck will give details for each definitely lost or possibly lost block,
including where it was allocated. (Actually, it merges results for all
blocks that have the same category and sufficiently similar stack traces
into a single "loss record". The
<code class="option">--leak-resolution</code> option lets you control the
meaning of "sufficiently similar".) It cannot tell you when or how or why
the pointer to a leaked block was lost; you have to work that out for
yourself. In general, you should attempt to ensure your programs do not
have any definitely lost or possibly lost blocks at exit.</p>
<p>For example:</p>
<pre class="programlisting">
8 bytes in 1 blocks are definitely lost in loss record 1 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:39)

88 (8 direct, 80 indirect) bytes in 1 blocks are definitely lost in loss record 13 of 14
   at 0x........: malloc (vg_replace_malloc.c:...)
   by 0x........: mk (leak-tree.c:11)
   by 0x........: main (leak-tree.c:25)
</pre>
<p>The first message describes a simple case of a single 8 byte block
that has been definitely lost. The second case mentions another 8 byte
block that has been definitely lost; the difference is that a further 80
bytes in other blocks are indirectly lost because of this lost block.
The loss records are not presented in any notable order, so the loss record
numbers aren't particularly meaningful.</p>
<p>If you specify <code class="option">--show-reachable=yes</code>,
reachable and indirectly lost blocks will also be shown, as the following
two examples show.</p>
<pre class="programlisting">
64 bytes in 4 blocks are still reachable in loss record 2 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:74)

32 bytes in 2 blocks are indirectly lost in loss record 1 of 4
   at 0x........: malloc (vg_replace_malloc.c:177)
   by 0x........: mk (leak-cases.c:52)
   by 0x........: main (leak-cases.c:80)
</pre>
<p>Because there are different kinds of leaks with different severities, an
interesting question is this: which leaks should be counted as true "errors"
and which should not? The answer to this question affects the numbers printed
in the <code class="computeroutput">ERROR SUMMARY</code> line, and also the effect
of the <code class="option">--error-exitcode</code> option.
Memcheck uses the following
criteria:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>First, a leak is only counted as a true "error" if
<code class="option">--leak-check=full</code> is specified. In other words, an
unprinted leak is not considered a true "error". If this were not the
case, it would be possible to get a high error count but not have any
errors printed, which would be confusing.</p></li>
<li class="listitem"><p>After that, definitely lost and possibly lost blocks are counted as
true "errors". Indirectly lost and still reachable blocks are not counted
as true "errors", even if <code class="option">--show-reachable=yes</code> is
specified and they are printed; this is because such blocks don't need
direct fixing by the programmer.
</p></li>
</ul></div>
</div>
</div>
<div class="sect1" title="4.3.Memcheck Command-Line Options">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.options"></a>4.3. Memcheck Command-Line Options</h2></div></div></div>
<div class="variablelist">
<a name="mc.opts.list"></a><dl>
<dt>
<a name="opt.leak-check"></a><span class="term">
<code class="option">--leak-check=&lt;no|summary|yes|full&gt; [default: summary] </code>
</span>
</dt>
<dd><p>When enabled, search for memory leaks when the client
program finishes. If set to <code class="varname">summary</code>, it says how
many leaks occurred. If set to <code class="varname">full</code> or
<code class="varname">yes</code>, it also gives details of each individual
leak.</p></dd>
<dt>
<a name="opt.show-possibly-lost"></a><span class="term">
<code class="option">--show-possibly-lost=&lt;yes|no&gt; [default: yes] </code>
</span>
</dt>
<dd><p>When disabled, the memory leak detector will not show "possibly lost" blocks.
</p></dd>
<dt>
<a name="opt.leak-resolution"></a><span class="term">
<code class="option">--leak-resolution=&lt;low|med|high&gt; [default: high] </code>
</span>
</dt>
<dd>
<p>When doing leak checking, determines how willing
Memcheck is to consider different backtraces to
be the same for the purposes of merging multiple leaks into a single
leak report. When set to <code class="varname">low</code>, only the first
two entries need to match. When <code class="varname">med</code>, four entries
have to match. When <code class="varname">high</code>, all entries need to
match.</p>
<p>For hardcore leak debugging, you probably want to use
<code class="option">--leak-resolution=high</code> together with
<code class="option">--num-callers=40</code> or some such large number.
</p>
<p>Note that the <code class="option">--leak-resolution</code> setting
does not affect Memcheck's ability to find
leaks. It only changes how the results are presented.</p>
</dd>
<dt>
<a name="opt.show-reachable"></a><span class="term">
<code class="option">--show-reachable=&lt;yes|no&gt; [default: no] </code>
</span>
</dt>
<dd><p>When disabled, the memory leak detector only shows "definitely
lost" and "possibly lost" blocks. When enabled, the leak detector also
shows "reachable" and "indirectly lost" blocks. (In other words, it
shows all blocks, except suppressed ones, so
<code class="option">--show-all</code> would be a better name for
it.)</p></dd>
<dt>
<a name="opt.undef-value-errors"></a><span class="term">
<code class="option">--undef-value-errors=&lt;yes|no&gt; [default: yes] </code>
</span>
</dt>
<dd><p>Controls whether Memcheck reports
errors arising from uses of undefined values. Set this to
<code class="varname">no</code> if you don't want to see undefined value
errors. It also has the side effect of speeding up
Memcheck somewhat.
</p></dd>
<dt>
<a name="opt.track-origins"></a><span class="term">
<code class="option">--track-origins=&lt;yes|no&gt; [default: no] </code>
</span>
</dt>
<dd>
<p>Controls whether Memcheck tracks
the origin of uninitialised values. By default, it does not,
which means that although it can tell you that an
uninitialised value is being used in a dangerous way, it
cannot tell you where the uninitialised value came from. This
often makes it difficult to track down the root problem.
</p>
<p>When set
to <code class="varname">yes</code>, Memcheck keeps
track of the origins of all uninitialised values. Then, when
an uninitialised value error is
reported, Memcheck will try to show the
origin of the value. An origin can be one of the following
four places: a heap block, a stack allocation, a client
request, or miscellaneous other sources (eg, a call
to <code class="varname">brk</code>).
</p>
<p>For uninitialised values originating from a heap
block, Memcheck shows where the block was
allocated. For uninitialised values originating from a stack
allocation, Memcheck can tell you which
function allocated the value, but no more than that -- typically
it shows you the source location of the opening brace of the
function. So you should carefully check that all of the
function's local variables are initialised properly.
</p>
<p>Performance overhead: origin tracking is expensive. It
halves Memcheck's speed and increases
memory use by a minimum of 100MB, and possibly more.
Nevertheless it can drastically reduce the effort required to
identify the root cause of uninitialised value errors, and so
is often a programmer productivity win, despite running
more slowly.
</p>
<p>Accuracy: Memcheck tracks origins
quite accurately. To avoid very large space and time
overheads, some approximations are made.
It is possible,
although unlikely, that Memcheck will report an incorrect origin, or
not be able to identify any origin.
</p>
<p>Note that the combination
<code class="option">--track-origins=yes</code>
and <code class="option">--undef-value-errors=no</code> is
nonsensical. Memcheck checks for and
rejects this combination at startup.
</p>
</dd>
<dt>
<a name="opt.partial-loads-ok"></a><span class="term">
<code class="option">--partial-loads-ok=&lt;yes|no&gt; [default: no] </code>
</span>
</dt>
<dd>
<p>Controls how Memcheck handles word-sized,
word-aligned loads from addresses for which some bytes are
addressable and others are not. When <code class="varname">yes</code>, such
loads do not produce an address error. Instead, loaded bytes
originating from illegal addresses are marked as uninitialised, and
those corresponding to legal addresses are handled in the normal
way.</p>
<p>When <code class="varname">no</code>, loads from partially invalid
addresses are treated the same as loads from completely invalid
addresses: an illegal-address error is issued, and the resulting
bytes are marked as initialised.</p>
<p>Note that code that behaves in this way is in violation of
the ISO C/C++ standards, and should be considered broken. If
at all possible, such code should be fixed. This option should be
used only as a last resort.</p>
</dd>
<dt>
<a name="opt.freelist-vol"></a><span class="term">
<code class="option">--freelist-vol=&lt;number&gt; [default: 20000000] </code>
</span>
</dt>
<dd>
<p>When the client program releases memory using
<code class="function">free</code> (in <code class="literal">C</code>) or
<code class="computeroutput">delete</code>
(<code class="literal">C++</code>), that memory is not immediately made
available for re-allocation.
Instead, it is marked inaccessible
and placed in a queue of freed blocks. The purpose is to defer as
long as possible the point at which freed-up memory comes back
into circulation. This increases the chance that
Memcheck will be able to detect invalid
accesses to blocks for some significant period of time after they
have been freed.</p>
<p>This option specifies the maximum total size, in bytes, of the
blocks in the queue. The default value is twenty million bytes.
Increasing this increases the total amount of memory used by
Memcheck but may detect invalid uses of freed
blocks which would otherwise go undetected.</p>
</dd>
<dt>
<a name="opt.workaround-gcc296-bugs"></a><span class="term">
<code class="option">--workaround-gcc296-bugs=&lt;yes|no&gt; [default: no] </code>
</span>
</dt>
<dd>
<p>When enabled, assumes that reads and writes some small
distance below the stack pointer are due to bugs in GCC 2.96, and
does not report them. The "small distance" is 256 bytes by
default. Note that GCC 2.96 is the default compiler on some ancient
Linux distributions (RedHat 7.X) and so you may need to use this
option. Do not use it if you do not have to, as it can cause real
errors to be overlooked. A better alternative is to use a more
recent GCC in which this bug is fixed.</p>
<p>You may also need to use this option when working with
GCC 3.X or 4.X on 32-bit PowerPC Linux. This is because
GCC generates code which occasionally accesses below the
stack pointer, particularly for floating-point to/from integer
conversions.
This is in violation of the 32-bit PowerPC ELF
specification, which makes no provision for locations below the
stack pointer to be accessible.</p>
</dd>
<dt>
<a name="opt.ignore-ranges"></a><span class="term">
<code class="option">--ignore-ranges=0xPP-0xQQ[,0xRR-0xSS] </code>
</span>
</dt>
<dd><p>Any ranges listed in this option (and multiple ranges can be
specified, separated by commas) will be ignored by Memcheck's
addressability checking.</p></dd>
<dt>
<a name="opt.malloc-fill"></a><span class="term">
<code class="option">--malloc-fill=&lt;hexnumber&gt; </code>
</span>
</dt>
<dd><p>Fills blocks allocated
by <code class="computeroutput">malloc</code>,
<code class="computeroutput">new</code>, etc, but not
by <code class="computeroutput">calloc</code>, with the specified
byte. This can be useful when trying to shake out obscure
memory corruption problems. The allocated area is still
regarded by Memcheck as undefined -- this option only affects its
contents.
</p></dd>
<dt>
<a name="opt.free-fill"></a><span class="term">
<code class="option">--free-fill=&lt;hexnumber&gt; </code>
</span>
</dt>
<dd><p>Fills blocks freed
by <code class="computeroutput">free</code>,
<code class="computeroutput">delete</code>, etc, with the
specified byte value. This can be useful when trying to shake out
obscure memory corruption problems. The freed area is still
regarded by Memcheck as not valid for access -- this option only
affects its contents.
</p></dd>
</dl>
</div>
</div>
<div class="sect1" title="4.4.Writing suppression files">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.suppfiles"></a>4.4. Writing suppression files</h2></div></div></div>
<p>The basic suppression format is described in
<a class="xref" href="manual-core.html#manual-core.suppress" title="2.5.Suppressing errors">Suppressing errors</a>.</p>
<p>The suppression-type (second) line should have the form:</p>
<pre class="programlisting">
Memcheck:suppression_type</pre>
<p>The Memcheck suppression types are as follows:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="varname">Value1</code>,
<code class="varname">Value2</code>,
<code class="varname">Value4</code>,
<code class="varname">Value8</code>,
<code class="varname">Value16</code>,
meaning an uninitialised-value error when
using a value of 1, 2, 4, 8 or 16 bytes.</p></li>
<li class="listitem"><p><code class="varname">Cond</code> (or its old
name, <code class="varname">Value0</code>), meaning use
of an uninitialised CPU condition code.</p></li>
<li class="listitem"><p><code class="varname">Addr1</code>,
<code class="varname">Addr2</code>,
<code class="varname">Addr4</code>,
<code class="varname">Addr8</code>,
<code class="varname">Addr16</code>,
meaning an invalid address during a
memory access of 1, 2, 4, 8 or 16 bytes respectively.</p></li>
<li class="listitem"><p><code class="varname">Jump</code>, meaning a
jump to an unaddressable location.</p></li>
<li class="listitem"><p><code class="varname">Param</code>, meaning an
invalid system call parameter error.</p></li>
<li class="listitem"><p><code class="varname">Free</code>, meaning an
invalid or mismatching free.</p></li>
<li class="listitem"><p><code class="varname">Overlap</code>,
meaning a
<code class="computeroutput">src</code> /
<code class="computeroutput">dst</code> overlap in
<code class="function">memcpy</code> or a similar function.</p></li>
<li class="listitem"><p><code class="varname">Leak</code>, meaning
a memory leak.</p></li>
</ul></div>
<p><code class="computeroutput">Param</code> errors have an extra
information line at this point, which is the name of the offending
system call parameter. No other error kinds have this extra
line.</p>
<p>The first line of the calling context: for <code class="varname">ValueN</code>
and <code class="varname">AddrN</code> errors, it is either the name of the function
in which the error occurred, or, failing that, the full path of the
<code class="filename">.so</code> file
or executable containing the error location. For <code class="varname">Free</code> errors, it is the name
of the function doing the freeing (eg, <code class="function">free</code>,
<code class="function">__builtin_vec_delete</code>, etc). For
<code class="varname">Overlap</code> errors, it is the name of the function with the
overlapping arguments (eg,
<code class="function">memcpy</code>,
<code class="function">strcpy</code>, etc).</p>
<p>Lastly, there's the rest of the calling context.</p>
</div>
<div class="sect1" title="4.5.Details of Memcheck's checking machinery">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.machine"></a>4.5. Details of Memcheck's checking machinery</h2></div></div></div>
<p>Read this section if you want to know, in detail, exactly
what and how Memcheck is checking.</p>
<div class="sect2" title="4.5.1.Valid-value (V) bits">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.value"></a>4.5.1. Valid-value (V) bits</h3></div></div></div>
<p>It is simplest to think of Memcheck as implementing a synthetic CPU
which is identical to a real CPU, except for one crucial detail. Every
bit (literally) of data processed, stored and handled by the real CPU
has, in the synthetic CPU, an associated "valid-value" bit, which says
whether or not the accompanying bit has a legitimate value. In the
discussions which follow, this bit is referred to as the V (valid-value)
bit.</p>
<p>Each byte in the system therefore has 8 V bits which follow it
wherever it goes. For example, when the CPU loads a word-size item (4
bytes) from memory, it also loads the corresponding 32 V bits from a
bitmap which stores the V bits for the process' entire address space.
If the CPU should later write the whole or some part of that value to
memory at a different address, the relevant V bits will be stored back
in the V-bit bitmap.</p>
<p>In short, each bit in the system has (conceptually) an associated V
bit, which follows it around everywhere, even inside the CPU. Yes, all the
CPU's registers (integer, floating point, vector and condition registers)
have their own V bit vectors.
For this to work, Memcheck uses a great deal
of compression to represent the V bits compactly.</p>
<p>Copying values around does not cause Memcheck to check for, or
report on, errors. However, when a value is used in a way which might
conceivably affect your program's externally-visible behaviour,
the associated V bits are immediately checked. If any of these indicate
that the value is undefined (even partially), an error is reported.</p>
<p>Here's an (admittedly nonsensical) example:</p>
<pre class="programlisting">
int i, j;
int a[10], b[10];
for ( i = 0; i &lt; 10; i++ ) {
  j = a[i];
  b[i] = j;
}</pre>
<p>Memcheck emits no complaints about this, since it merely copies
uninitialised values from <code class="varname">a[]</code> into
<code class="varname">b[]</code>, and doesn't use them in a way which could
affect the behaviour of the program. However, if
the loop is changed to:</p>
<pre class="programlisting">
for ( i = 0; i &lt; 10; i++ ) {
  j += a[i];
}
if ( j == 77 )
  printf("hello there\n");
</pre>
<p>then Memcheck will complain, at the
<code class="computeroutput">if</code>, that the condition depends on
uninitialised values. Note that it <span class="command"><strong>doesn't</strong></span> complain
at the <code class="varname">j += a[i];</code>, since at that point the
undefinedness is not "observable". It's only when a decision has to be
made as to whether or not to do the <code class="function">printf</code> -- an
observable action of your program -- that Memcheck complains.</p>
<p>Most low level operations, such as adds, cause Memcheck to use the
V bits for the operands to calculate the V bits for the result.
Even if
the result is partially or wholly undefined, it does not
complain.</p>
<p>Checks on definedness only occur in three places: when a value is
used to generate a memory address, when a control flow decision needs to
be made, and when a system call is detected, at which point Memcheck
checks the definedness of its parameters as required.</p>
<p>If a check should detect undefinedness, an error message is
issued. The resulting value is subsequently regarded as well-defined.
To do otherwise would give long chains of error messages. In other
words, once Memcheck reports an undefined value error, it tries to
avoid reporting further errors derived from that same undefined
value.</p>
<p>This sounds overcomplicated. Why not just check all reads from
memory, and complain if an undefined value is loaded into a CPU
register? Well, that doesn't work well, because perfectly legitimate C
programs routinely copy uninitialised values around in memory, and we
don't want endless complaints about that. Here's the canonical example.
Consider a struct like this:</p>
<pre class="programlisting">
struct S { int x; char c; };
struct S s1, s2;
s1.x = 42;
s1.c = 'z';
s2 = s1;
</pre>
<p>The question to ask is: how large is <code class="varname">struct S</code>,
in bytes? An <code class="varname">int</code> is 4 bytes and a
<code class="varname">char</code> one byte, so perhaps a <code class="varname">struct
S</code> occupies 5 bytes? Wrong. All non-toy compilers we know
of will round the size of <code class="varname">struct S</code> up to a whole
number of words, in this case 8 bytes. Not doing this forces compilers
to generate truly appalling code for accessing arrays of
<code class="varname">struct S</code>'s on some architectures.</p>
<p>So <code class="varname">s1</code> occupies 8 bytes, yet only 5 of them will
be initialised.
For the assignment <code class="varname">s2 = s1</code>, GCC
generates code to copy all 8 bytes wholesale into <code class="varname">s2</code>
without regard for their meaning. If Memcheck simply checked values as
they came out of memory, it would yelp every time a structure assignment
like this happened. So the more complicated behaviour described above
is necessary. This allows GCC to copy
<code class="varname">s1</code> into <code class="varname">s2</code> any way it likes, and a
warning will only be emitted if the uninitialised values are later
used.</p>
</div>
<div class="sect2" title="4.5.2.Valid-address (A) bits">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.vaddress"></a>4.5.2. Valid-address (A) bits</h3></div></div></div>
<p>Notice that the previous subsection describes how the validity of
values is established and maintained without having to say whether the
program does or does not have the right to access any particular memory
location. We now consider the latter question.</p>
<p>As described above, every bit in memory or in the CPU has an
associated valid-value (V) bit. In addition, all bytes in memory, but
not in the CPU, have an associated valid-address (A) bit. This
indicates whether or not the program can legitimately read or write that
location. It does not give any indication of the validity of the data
at that location -- that's the job of the V bits -- only whether or not
the location may be accessed.</p>
<p>Every time your program reads or writes memory, Memcheck checks
the A bits associated with the address. If any of them indicate an
invalid address, an error is emitted. Note that the reads and writes
themselves do not change the A bits, only consult them.</p>
<p>So how do the A bits get set/cleared?
Like this:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>When the program starts, all the global data areas are
marked as accessible.</p></li>
<li class="listitem"><p>When the program does
<code class="function">malloc</code>/<code class="computeroutput">new</code>,
the A bits for exactly the area allocated, and not a byte more,
are marked as accessible. Upon freeing the area the A bits are
changed to indicate inaccessibility.</p></li>
<li class="listitem"><p>When the stack pointer register (<code class="literal">SP</code>) moves
up or down, A bits are set. The rule is that the area from
<code class="literal">SP</code> up to the base of the stack is marked as
accessible, and below <code class="literal">SP</code> is inaccessible. (If
that sounds illogical, bear in mind that the stack grows down, not
up, on almost all Unix systems, including GNU/Linux.) Tracking
<code class="literal">SP</code> like this has the useful side-effect that the
section of stack used by a function for local variables etc is
automatically marked accessible on function entry and inaccessible
on exit.</p></li>
<li class="listitem"><p>When doing system calls, A bits are changed appropriately.
For example, <code class="literal">mmap</code>
magically makes files appear in the process'
address space, so the A bits must be updated if <code class="literal">mmap</code>
succeeds.</p></li>
<li class="listitem"><p>Optionally, your program can tell Memcheck about such changes
explicitly, using the client request mechanism described
above.</p></li>
</ul></div>
</div>
<div class="sect2" title="4.5.3.Putting it all together">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.together"></a>4.5.3. Putting it all together</h3></div></div></div>
<p>Memcheck's checking machinery can be summarised as
follows:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Each byte in memory has 8 associated V (valid-value) bits,
saying whether or not the byte has a defined value, and a single A
(valid-address) bit, saying whether or not the program currently has
the right to read/write that address. As mentioned above, heavy
use of compression means the overhead is typically around 25%.</p></li>
<li class="listitem"><p>When memory is read or written, the relevant A bits are
consulted. If they indicate an invalid address, Memcheck emits an
Invalid read or Invalid write error.</p></li>
<li class="listitem"><p>When memory is read into the CPU's registers, the relevant V
bits are fetched from memory and stored in the simulated CPU.
They
are not consulted.</p></li>
<li class="listitem"><p>When a register is written out to memory, the V bits for that
register are written back to memory too.</p></li>
<li class="listitem"><p>When values in CPU registers are used to generate a memory
address, or to determine the outcome of a conditional branch, the V
bits for those values are checked, and an error emitted if any of
them are undefined.</p></li>
<li class="listitem"><p>When values in CPU registers are used for any other purpose,
Memcheck computes the V bits for the result, but does not check
them.</p></li>
<li class="listitem"><p>Once the V bits for a value in the CPU have been checked, they
are then set to indicate validity. This avoids long chains of
errors.</p></li>
<li class="listitem">
<p>When values are loaded from memory, Memcheck checks the A bits
for that location and issues an illegal-address warning if needed.
In that case, the V bits loaded are forced to indicate Valid,
despite the location being invalid.</p>
<p>This apparently strange choice reduces the amount of confusing
information presented to the user. It avoids the unpleasant
phenomenon in which memory is read from a place which is both
unaddressable and contains invalid values, and, as a result, you get
not only an invalid-address (read/write) error, but also a
potentially large set of uninitialised-value errors, one for every
time the value is used.</p>
<p>There is a hazy boundary case to do with multi-byte loads from
addresses which are partially valid and partially invalid. See
the description of the <code class="option">--partial-loads-ok</code> option for details.
</p>
</li>
</ul></div>
<p>Memcheck intercepts calls to <code class="function">malloc</code>,
<code class="function">calloc</code>, <code class="function">realloc</code>,
<code class="function">valloc</code>, <code class="function">memalign</code>,
<code class="function">free</code>, <code class="computeroutput">new</code>,
<code class="computeroutput">new[]</code>,
<code class="computeroutput">delete</code> and
<code class="computeroutput">delete[]</code>. The behaviour you get
is:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="function">malloc</code>/<code class="function">new</code>/<code class="computeroutput">new[]</code>:
the returned memory is marked as addressable but not having valid
values. This means you have to write to it before you can read
it.</p></li>
<li class="listitem"><p><code class="function">calloc</code>: returned memory is marked both
addressable and valid, since <code class="function">calloc</code> clears
the area to zero.</p></li>
<li class="listitem"><p><code class="function">realloc</code>: if the new size is larger than
the old, the new section is addressable but invalid, as with
<code class="function">malloc</code>. If the new size is smaller, the
dropped-off section is marked as unaddressable. You may only pass to
<code class="function">realloc</code> a pointer previously issued to you by
<code class="function">malloc</code>/<code class="function">calloc</code>/<code class="function">realloc</code>.</p></li>
<li class="listitem"><p><code class="function">free</code>/<code class="computeroutput">delete</code>/<code class="computeroutput">delete[]</code>:
you may only pass to these functions a pointer previously issued
to you by the corresponding allocation function. Otherwise,
Memcheck complains.
If the pointer is indeed valid, Memcheck
marks the entire area it points at as unaddressable, and places
the block in the freed-blocks-queue. The aim is to defer as long
as possible reallocation of this block. Until that happens, all
attempts to access it will elicit an invalid-address error, as you
would hope.</p></li>
</ul></div>
</div>
</div>
<div class="sect1" title="4.6.Client Requests">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.clientreqs"></a>4.6. Client Requests</h2></div></div></div>
<p>The following client requests are defined in
<code class="filename">memcheck.h</code>.
See <code class="filename">memcheck.h</code> for exact details of their
arguments.</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="varname">VALGRIND_MAKE_MEM_NOACCESS</code>,
<code class="varname">VALGRIND_MAKE_MEM_UNDEFINED</code> and
<code class="varname">VALGRIND_MAKE_MEM_DEFINED</code>.
These mark address ranges as completely inaccessible,
accessible but containing undefined data, and accessible and
containing defined data, respectively.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE</code>.
This is just like <code class="varname">VALGRIND_MAKE_MEM_DEFINED</code> but only
affects those bytes that are already addressable.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_CHECK_MEM_IS_ADDRESSABLE</code> and
<code class="varname">VALGRIND_CHECK_MEM_IS_DEFINED</code>: check immediately
whether or not the given address range has the relevant property,
and if not, print an error message. Also, for the convenience of
the client, returns zero if the relevant property holds; otherwise,
the returned value is the address of the first byte for which the
property is not true.
Always returns 0 when not run on
Valgrind.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_CHECK_VALUE_IS_DEFINED</code>: a quick and easy
way to find out whether Valgrind thinks a particular value
(lvalue, to be precise) is addressable and defined. Prints an error
message if not. It has no return value.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_DO_LEAK_CHECK</code>: does a full memory leak
check (like <code class="option">--leak-check=full</code>) right now.
This is useful for incrementally checking for leaks between arbitrary
places in the program's execution. It has no return value.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_DO_QUICK_LEAK_CHECK</code>: like
<code class="varname">VALGRIND_DO_LEAK_CHECK</code>, except it produces only a leak
summary (like <code class="option">--leak-check=summary</code>).
It has no return value.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_COUNT_LEAKS</code>: fills in the four
arguments with the number of bytes of memory found by the previous
leak check to be leaked (i.e. the sum of direct leaks and indirect leaks),
dubious, reachable and suppressed. This is useful in test harness code,
after calling <code class="varname">VALGRIND_DO_LEAK_CHECK</code> or
<code class="varname">VALGRIND_DO_QUICK_LEAK_CHECK</code>.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_COUNT_LEAK_BLOCKS</code>: identical to
<code class="varname">VALGRIND_COUNT_LEAKS</code> except that it returns the
number of blocks rather than the number of bytes in each
category.</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_GET_VBITS</code> and
<code class="varname">VALGRIND_SET_VBITS</code>: allow you to get and set the
V (validity) bits for an address range.
You should probably only
set V bits that you have got with
<code class="varname">VALGRIND_GET_VBITS</code>. Only for those who really
know what they are doing.</p></li>
<li class="listitem">
<p><code class="varname">VALGRIND_CREATE_BLOCK</code> and
<code class="varname">VALGRIND_DISCARD</code>. <code class="varname">VALGRIND_CREATE_BLOCK</code>
takes an address, a number of bytes and a character string. The
specified address range is then associated with that string. When
Memcheck reports an invalid access to an address in the range, it
will describe it in terms of this block rather than in terms of
any other block it knows about. Note that the use of this macro
does not actually change the state of memory in any way -- it
merely gives a name for the range.
</p>
<p>At some point you may want Memcheck to stop reporting errors
in terms of the block named
by <code class="varname">VALGRIND_CREATE_BLOCK</code>. To make this
possible, <code class="varname">VALGRIND_CREATE_BLOCK</code> returns a
"block handle", which is a C <code class="varname">int</code> value. You
can pass this block handle to <code class="varname">VALGRIND_DISCARD</code>.
After doing so, Valgrind will no longer relate addressing errors
in the specified range to the block. Passing invalid handles to
<code class="varname">VALGRIND_DISCARD</code> is harmless.
</p>
</li>
</ul></div>
</div>
<div class="sect1" title="4.7.Memory Pools: describing and working with custom allocators">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.mempools"></a>4.7. Memory Pools: describing and working with custom allocators</h2></div></div></div>
<p>Some programs use custom memory allocators, often for performance
reasons.
Left to itself, Memcheck is unable to understand the
behaviour of custom allocation schemes as well as it understands the
standard allocators, and so may miss errors and leaks in your program.
This section describes a way to give Memcheck enough of a description of
your custom allocator that it can make at least some sense of what is
happening.</p>
<p>There are many different sorts of custom allocator, so Memcheck
attempts to reason about them using a loose, abstract model. We
use the following terminology when describing custom allocation
systems:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p>Custom allocation involves a set of independent "memory pools".
</p></li>
<li class="listitem"><p>Memcheck's notion of a memory pool consists of a single "anchor
address" and a set of non-overlapping "chunks" associated with the
anchor address.</p></li>
<li class="listitem"><p>Typically a pool's anchor address is the address of a
book-keeping "header" structure.</p></li>
<li class="listitem"><p>Typically the pool's chunks are drawn from a contiguous
"superblock" acquired through the system
<code class="function">malloc</code> or
<code class="function">mmap</code>.</p></li>
</ul></div>
<p>Keep in mind that the last two points above say "typically": the
Valgrind mempool client request API is intentionally vague about the
exact structure of a mempool. There is no specific mention made of
headers or superblocks.
Nevertheless, the following picture may help
elucidate the intention of the terms in the API:</p>
<pre class="programlisting">
         "pool"
      (anchor address)
              |
              v
        +--------+---+
        | header | o |
        +--------+-|-+
                   |
                   v                  superblock
                   +------+---+--------------+---+------------------+
                   |      |rzB|  allocation  |rzB|                  |
                   +------+---+--------------+---+------------------+
                              ^              ^
                              |              |
                           "addr"     "addr"+"size"
</pre>
<p>
Note that the header and the superblock may be contiguous or
discontiguous, and there may be multiple superblocks associated with a
single header; such variations are opaque to Memcheck. The API
only requires that your allocation scheme can present sensible values
of "pool", "addr" and "size".</p>
<p>
Typically, before making client requests related to mempools, a client
program will have allocated such a header and superblock for its
mempool, and marked the superblock NOACCESS using the
<code class="varname">VALGRIND_MAKE_MEM_NOACCESS</code> client request.</p>
<p>
When dealing with mempools, the goal is to maintain a particular
invariant condition: that Memcheck believes the unallocated portions
of the pool's superblock (including redzones) are NOACCESS.
To
maintain this invariant, the client program must ensure that the
superblock starts out in that state; Memcheck cannot make it so, since
Memcheck never explicitly learns about the superblock of a pool, only
the allocated chunks within the pool.</p>
<p>
Once the header and superblock for a pool are established and properly
marked, there are a number of client requests programs can use to
inform Memcheck about changes to the state of a mempool:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem">
<p>
<code class="varname">VALGRIND_CREATE_MEMPOOL(pool, rzB, is_zeroed)</code>:
This request registers the address <code class="varname">pool</code> as the anchor
address for a memory pool. It also provides a size
<code class="varname">rzB</code>, specifying how large the redzones placed around
chunks allocated from the pool should be. Finally, it provides an
<code class="varname">is_zeroed</code> argument that specifies whether the pool's
chunks are zeroed (more precisely: defined) when allocated.
</p>
<p>
Upon completion of this request, no chunks are associated with the
pool. The request simply tells Memcheck that the pool exists, so that
subsequent calls can refer to it as a pool.
</p>
</li>
<li class="listitem"><p><code class="varname">VALGRIND_DESTROY_MEMPOOL(pool)</code>:
This request tells Memcheck that a pool is being torn down. Memcheck
then removes all records of chunks associated with the pool, as well
as its record of the pool's existence. While destroying its records of
a mempool, Memcheck resets the redzones of any live chunks in the pool
to NOACCESS.
</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_MEMPOOL_ALLOC(pool, addr, size)</code>:
This request informs Memcheck that a <code class="varname">size</code>-byte chunk
has been allocated at <code class="varname">addr</code>, and associates the chunk with the
specified
<code class="varname">pool</code>. If the pool was created with nonzero
<code class="varname">rzB</code> redzones, Memcheck will mark the
<code class="varname">rzB</code> bytes before and after the chunk as NOACCESS. If
the pool was created with the <code class="varname">is_zeroed</code> argument set,
Memcheck will mark the chunk as DEFINED; otherwise Memcheck will mark
the chunk as UNDEFINED.
</p></li>
<li class="listitem"><p><code class="varname">VALGRIND_MEMPOOL_FREE(pool, addr)</code>:
This request informs Memcheck that the chunk at <code class="varname">addr</code>
should no longer be considered allocated. Memcheck will mark the chunk
associated with <code class="varname">addr</code> as NOACCESS, and delete its
record of the chunk's existence.
</p></li>
<li class="listitem">
<p><code class="varname">VALGRIND_MEMPOOL_TRIM(pool, addr, size)</code>:
This request trims the chunks associated with <code class="varname">pool</code>.
The request only operates on chunks associated with
<code class="varname">pool</code>. Trimming is formally defined as:</p>
<div class="itemizedlist"><ul class="itemizedlist" type="circle">
<li class="listitem"><p>All chunks entirely inside the range
<code class="varname">addr..(addr+size-1)</code> are preserved.</p></li>
<li class="listitem"><p>All chunks entirely outside the range
<code class="varname">addr..(addr+size-1)</code> are discarded, as though
<code class="varname">VALGRIND_MEMPOOL_FREE</code> had been called on them.
</p></li>
<li class="listitem"><p>All other chunks must intersect with the range
<code class="varname">addr..(addr+size-1)</code>; areas outside the
intersection are marked as NOACCESS, as though they had been
independently freed with
<code class="varname">VALGRIND_MEMPOOL_FREE</code>.</p></li>
</ul></div>
<p>This is a somewhat rare request, but can be useful in
implementing the type of mass-free operations common in custom
LIFO allocators.</p>
</li>
<li class="listitem">
<p><code class="varname">VALGRIND_MOVE_MEMPOOL(poolA, poolB)</code>: This
request informs Memcheck that the pool previously anchored at
address <code class="varname">poolA</code> has moved to anchor address
<code class="varname">poolB</code>. This is a rare request, typically only needed
if you <code class="function">realloc</code> the header of a mempool.</p>
<p>No memory-status bits are altered by this request.</p>
</li>
<li class="listitem">
<p>
<code class="varname">VALGRIND_MEMPOOL_CHANGE(pool, addrA, addrB,
size)</code>: This request informs Memcheck that the chunk
previously allocated at address <code class="varname">addrA</code> within
<code class="varname">pool</code> has been moved and/or resized, and should be
changed to cover the region <code class="varname">addrB..(addrB+size-1)</code>. This
is a rare request, typically only needed if you
<code class="function">realloc</code> a superblock or wish to extend a chunk
without changing its memory-status bits.
</p>
<p>No memory-status bits are altered by this request.
</p>
</li>
<li class="listitem"><p><code class="varname">VALGRIND_MEMPOOL_EXISTS(pool)</code>:
This request informs the caller whether or not Memcheck is currently
tracking a mempool at anchor address <code class="varname">pool</code>.
It
evaluates to 1 when there is a mempool associated with that address, 0
otherwise. This is a rare request, only useful in circumstances when
client code might have lost track of the set of active mempools.
</p></li>
</ul></div>
</div>
<div class="sect1" title="4.8.Debugging MPI Parallel Programs with Valgrind">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="mc-manual.mpiwrap"></a>4.8. Debugging MPI Parallel Programs with Valgrind</h2></div></div></div>
<p>Memcheck supports debugging of distributed-memory applications
which use the MPI message passing standard. This support consists of a
library of wrapper functions for the
<code class="computeroutput">PMPI_*</code> interface. When incorporated
into the application's address space, either by direct linking or by
<code class="computeroutput">LD_PRELOAD</code>, the wrappers intercept
calls to <code class="computeroutput">PMPI_Send</code>,
<code class="computeroutput">PMPI_Recv</code>, etc. They then
use client requests to inform Memcheck of memory state changes caused
by the function being wrapped. This reduces the number of false
positives that Memcheck otherwise typically reports for MPI
applications.</p>
<p>The wrappers also take the opportunity to carefully check
size and definedness of buffers passed as arguments to MPI functions, hence
detecting errors such as passing undefined data to
<code class="computeroutput">PMPI_Send</code>, or receiving data into a
buffer which is too small.</p>
<p>Unlike most of the rest of Valgrind, the wrapper library is subject to a
BSD-style license, so you can link it into any code base you like.
See the top of <code class="computeroutput">mpi/libmpiwrap.c</code>
for license details.</p>
<div class="sect2" title="4.8.1.Building and installing the wrappers">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.build"></a>4.8.1. Building and installing the wrappers</h3></div></div></div>
<p>The wrapper library will be built automatically if possible.
Valgrind's configure script will look for a suitable
<code class="computeroutput">mpicc</code> to build it with. This must be
the same <code class="computeroutput">mpicc</code> you use to build the
MPI application you want to debug. By default, Valgrind tries
<code class="computeroutput">mpicc</code>, but you can specify a
different one by using the configure-time option
<code class="option">--with-mpicc</code>. Currently the
wrappers are only buildable with
<code class="computeroutput">mpicc</code>s which are based on GNU
GCC or Intel's C++ Compiler.</p>
<p>Check that the configure script prints a line like this:</p>
<pre class="programlisting">
checking for usable MPI2-compliant mpicc and mpi.h... yes, mpicc
</pre>
<p>If it says <code class="computeroutput">... no</code>, your
<code class="computeroutput">mpicc</code> has failed to compile and link
a test MPI2 program.</p>
<p>If the configure test succeeds, continue in the usual way with
<code class="computeroutput">make</code> and <code class="computeroutput">make
install</code>. The final install tree should then contain
<code class="computeroutput">libmpiwrap-&lt;platform&gt;.so</code>.
</p>
<p>Compile a test MPI program (e.g. an MPI hello-world) and try
this:</p>
<pre class="programlisting">
LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-&lt;platform&gt;.so   \
           mpirun [args] $prefix/bin/valgrind ./hello
</pre>
<p>You should see something similar to the following:</p>
<pre class="programlisting">
valgrind MPI wrappers 31901: Active for pid 31901
valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options
</pre>
<p>repeated for every process in the group. If you do not see
these, there is a build/installation problem of some kind.</p>
<p>The MPI functions to be wrapped are assumed to be in an ELF
shared object with soname matching
<code class="computeroutput">libmpi.so*</code>. This is known to be
correct at least for Open MPI and Quadrics MPI, and can easily be
changed if required.</p>
</div>
<div class="sect2" title="4.8.2.Getting started">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.gettingstarted"></a>4.8.2. Getting started</h3></div></div></div>
<p>Compile your MPI application as usual, taking care to link it
using the same <code class="computeroutput">mpicc</code> that your
Valgrind build was configured with.</p>
<p>
Use the following basic scheme to run your application on Valgrind with
the wrappers engaged:</p>
<pre class="programlisting">
MPIWRAP_DEBUG=[wrapper-args]                                  \
   LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-&lt;platform&gt;.so   \
   mpirun [mpirun-args]                                       \
   $prefix/bin/valgrind [valgrind-args]                       \
   [application] [app-args]
</pre>
<p>As an alternative to
<code class="computeroutput">LD_PRELOAD</code>ing
<code class="computeroutput">libmpiwrap-&lt;platform&gt;.so</code>, you can
simply link it to your application if desired.
This should not disturb
native behaviour of your application in any way.</p>
</div>
<div class="sect2" title="4.8.3.Controlling the wrapper library">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.controlling"></a>4.8.3. Controlling the wrapper library</h3></div></div></div>
<p>Environment variable
<code class="computeroutput">MPIWRAP_DEBUG</code> is consulted at
startup. The default behaviour is to print a starting banner</p>
<pre class="programlisting">
valgrind MPI wrappers 16386: Active for pid 16386
valgrind MPI wrappers 16386: Try MPIWRAP_DEBUG=help for possible options
</pre>
<p>and then be relatively quiet.</p>
<p>You can give a list of comma-separated options in
<code class="computeroutput">MPIWRAP_DEBUG</code>. These are</p>
<div class="itemizedlist"><ul class="itemizedlist" type="disc">
<li class="listitem"><p><code class="computeroutput">verbose</code>:
show entries/exits of all wrappers. Also show extra
debugging info, such as the status of outstanding
<code class="computeroutput">MPI_Request</code>s resulting
from uncompleted <code class="computeroutput">MPI_Irecv</code>s.</p></li>
<li class="listitem"><p><code class="computeroutput">quiet</code>:
opposite of <code class="computeroutput">verbose</code>, only print
anything when the wrappers want
to report a detected programming error, or in case of catastrophic
failure of the wrappers.</p></li>
<li class="listitem"><p><code class="computeroutput">warn</code>:
by default, functions which lack proper wrappers
are not commented on, just silently
ignored.
With <code class="computeroutput">warn</code> set, a warning is printed for each unwrapped
function used, up to a maximum of three warnings per function.</p></li>
<li class="listitem"><p><code class="computeroutput">strict</code>:
print an error message and abort the program if
a function lacking a wrapper is used.</p></li>
</ul></div>
<p>If you want to use Valgrind's XML output facility
(<code class="option">--xml=yes</code>), you should pass
<code class="computeroutput">quiet</code> in
<code class="computeroutput">MPIWRAP_DEBUG</code> so as to get rid of any
extraneous printing from the wrappers.</p>
</div>
<div class="sect2" title="4.8.4.Functions">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.limitations.functions"></a>4.8.4. Functions</h3></div></div></div>
<p>All MPI2 functions except
<code class="computeroutput">MPI_Wtick</code>,
<code class="computeroutput">MPI_Wtime</code> and
<code class="computeroutput">MPI_Pcontrol</code> have wrappers. The
first two are not wrapped because they return a
<code class="computeroutput">double</code>, which Valgrind's
function-wrap mechanism cannot handle (but it could easily be
extended to do so). <code class="computeroutput">MPI_Pcontrol</code> cannot be
wrapped as it has variable arity:
<code class="computeroutput">int MPI_Pcontrol(const int level, ...)</code></p>
<p>Most functions are wrapped with a default wrapper which does
nothing except complain or abort if it is called, depending on
settings in <code class="computeroutput">MPIWRAP_DEBUG</code> listed
above.
The following functions have "real", do-something-useful
wrappers:</p>
<pre class="programlisting">
PMPI_Send PMPI_Bsend PMPI_Ssend PMPI_Rsend

PMPI_Recv PMPI_Get_count

PMPI_Isend PMPI_Ibsend PMPI_Issend PMPI_Irsend

PMPI_Irecv
PMPI_Wait PMPI_Waitall
PMPI_Test PMPI_Testall

PMPI_Iprobe PMPI_Probe

PMPI_Cancel

PMPI_Sendrecv

PMPI_Type_commit PMPI_Type_free

PMPI_Pack PMPI_Unpack

PMPI_Bcast PMPI_Gather PMPI_Scatter PMPI_Alltoall
PMPI_Reduce PMPI_Allreduce PMPI_Op_create

PMPI_Comm_create PMPI_Comm_dup PMPI_Comm_free PMPI_Comm_rank PMPI_Comm_size

PMPI_Error_string
PMPI_Init PMPI_Initialized PMPI_Finalize
</pre>
<p>A few functions such as
<code class="computeroutput">PMPI_Address</code> are listed as
<code class="computeroutput">HAS_NO_WRAPPER</code>. They have no wrapper
at all as there is nothing worth checking, and giving a no-op wrapper
would reduce performance for no reason.</p>
<p>Note that the wrapper library can itself generate large
numbers of calls to the MPI implementation, especially when walking
complex types. The most common functions called are
<code class="computeroutput">PMPI_Extent</code>,
<code class="computeroutput">PMPI_Type_get_envelope</code>,
<code class="computeroutput">PMPI_Type_get_contents</code>, and
<code class="computeroutput">PMPI_Type_free</code>.</p>
</div>
<div class="sect2" title="4.8.5.Types">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.limitations.types"></a>4.8.5. Types</h3></div></div></div>
<p>MPI-1.1 structured types are supported, and walked exactly.
The currently supported combiners are
<code class="computeroutput">MPI_COMBINER_NAMED</code>,
<code class="computeroutput">MPI_COMBINER_CONTIGUOUS</code>,
<code class="computeroutput">MPI_COMBINER_VECTOR</code>,
<code class="computeroutput">MPI_COMBINER_HVECTOR</code>,
<code class="computeroutput">MPI_COMBINER_INDEXED</code>,
<code class="computeroutput">MPI_COMBINER_HINDEXED</code> and
<code class="computeroutput">MPI_COMBINER_STRUCT</code>. This should
cover all MPI-1.1 types. The mechanism (function
<code class="computeroutput">walk_type</code>) should extend easily to
cover MPI2 combiners.</p>
<p>MPI defines some named structured types
(<code class="computeroutput">MPI_FLOAT_INT</code>,
<code class="computeroutput">MPI_DOUBLE_INT</code>,
<code class="computeroutput">MPI_LONG_INT</code>,
<code class="computeroutput">MPI_2INT</code>,
<code class="computeroutput">MPI_SHORT_INT</code>,
<code class="computeroutput">MPI_LONG_DOUBLE_INT</code>) which are pairs
of some basic type and a C <code class="computeroutput">int</code>.
Unfortunately the MPI specification makes it impossible to look inside
these types and see where the fields are. Therefore these wrappers
assume the types are laid out as <code class="computeroutput">struct { float val;
int loc; }</code> (for
<code class="computeroutput">MPI_FLOAT_INT</code>), etc, and act
accordingly. This appears to be correct at least for Open MPI 1.0.2
and for Quadrics MPI.</p>
<p>If <code class="computeroutput">strict</code> is an option specified
in <code class="computeroutput">MPIWRAP_DEBUG</code>, the application
will abort if an unhandled type is encountered. Otherwise, the
application will print a warning message and continue.</p>
<p>Some effort is made to mark/check memory ranges corresponding to
arrays of values in a single pass.
This is important for performance
since asking Valgrind to mark/check any range, no matter how small,
carries quite a large constant cost. This optimisation is applied to
arrays of primitive types (<code class="computeroutput">double</code>,
<code class="computeroutput">float</code>,
<code class="computeroutput">int</code>,
<code class="computeroutput">long</code>, <code class="computeroutput">long
long</code>, <code class="computeroutput">short</code>,
<code class="computeroutput">char</code>, and <code class="computeroutput">long
double</code> on platforms where <code class="computeroutput">sizeof(long
double) == 8</code>). For arrays of all other types, the
wrappers handle each element individually and so there can be a very
large performance cost.</p>
</div>
<div class="sect2" title="4.8.6.Writing new wrappers">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.writingwrappers"></a>4.8.6. Writing new wrappers</h3></div></div></div>
<p>
For the most part the wrappers are straightforward. The only
significant complexity arises with nonblocking receives.</p>
<p>The issue is that <code class="computeroutput">MPI_Irecv</code>
states the recv buffer and returns immediately, giving a handle
(<code class="computeroutput">MPI_Request</code>) for the transaction.
Later the user will have to poll for completion with
<code class="computeroutput">MPI_Wait</code> etc, and when the
transaction completes successfully, the wrappers have to paint the
recv buffer. But the recv buffer details are not presented to
<code class="computeroutput">MPI_Wait</code> -- only the handle is. The
library therefore maintains a shadow table which associates
uncompleted <code class="computeroutput">MPI_Request</code>s with the
corresponding buffer address/count/type.
When an operation completes,
the table is searched for the associated address/count/type info, and
memory is marked accordingly.</p>
<p>Access to the table is guarded by a (POSIX pthreads) lock, so as
to make the library thread-safe.</p>
<p>The table is allocated with
<code class="computeroutput">malloc</code> and never
<code class="computeroutput">free</code>d, so it will show up in leak
checks.</p>
<p>Writing new wrappers should be fairly easy. The source file is
<code class="computeroutput">mpi/libmpiwrap.c</code>. If possible,
find an existing wrapper for a function of similar behaviour to the
one you want to wrap, and use it as a starting point. The wrappers
are organised in sections in the same order as the MPI 1.1 spec, to
aid navigation. When adding a wrapper, remember to comment out the
definition of the default wrapper in the long list of defaults at the
bottom of the file (do not remove it, just comment it out).</p>
</div>
<div class="sect2" title="4.8.7.What to expect when using the wrappers">
<div class="titlepage"><div><div><h3 class="title">
<a name="mc-manual.mpiwrap.whattoexpect"></a>4.8.7. What to expect when using the wrappers</h3></div></div></div>
<p>The wrappers should reduce Memcheck's false-error rate on MPI
applications. Because the wrapping is done at the MPI interface,
there will still potentially be a large number of errors reported in
the MPI implementation below the interface. The best you can do is
try to suppress them.</p>
<p>You may also find that the input-side (buffer
length/definedness) checks find errors in your MPI use, for example
passing too short a buffer to
<code class="computeroutput">MPI_Recv</code>.</p>
<p>Functions which are not wrapped may increase the false
error rate.
A possible approach is to run with
<code class="computeroutput">MPIWRAP_DEBUG</code> containing
<code class="computeroutput">warn</code>. This will show you functions
which lack proper wrappers but which are nevertheless used. You can
then write wrappers for them.
</p>
<p>A known source of potential false errors is the
<code class="computeroutput">PMPI_Reduce</code> family of functions, when
using a custom (user-defined) reduction function. In a reduction
operation, each node notionally sends data to a "central point" which
uses the specified reduction function to merge the data items into a
single item. Hence, in general, data is passed between nodes and fed
to the reduction function, but the wrapper library cannot mark the
transferred data as initialised before it is handed to the reduction
function, because all that happens "inside" the
<code class="computeroutput">PMPI_Reduce</code> call. As a result you
may see false positives reported in your reduction function.</p>
</div>
</div>
</div>
<div>
<br><table class="nav" width="100%" cellspacing="3" cellpadding="2" border="0" summary="Navigation footer">
<tr>
<td rowspan="2" width="40%" align="left">
<a accesskey="p" href="manual-core-adv.html">&lt;&lt; 3. Using and understanding the Valgrind core: Advanced Topics</a></td>
<td width="20%" align="center"><a accesskey="u" href="manual.html">Up</a></td>
<td rowspan="2" width="40%" align="right"><a accesskey="n" href="cg-manual.html">5. Cachegrind: a cache and branch-prediction profiler &gt;&gt;</a>
</td>
</tr>
<tr><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td></tr>
</table>
</div>
</body>
</html>